Programming Languages Going Above and Beyond
Having been a programmer for a long time now, I have experienced my fair share of programming languages. What strikes me the most is that programming languages have not improved much over the years. Java, for example, has certainly improved from when I started using it in the mid-nineties — but only in pretty minor ways. We still get buffer overflows and integer overflows. The compiler still cannot tell when our loops will terminate (yes, this is possible). Aliasing is still a complete unbridled mess. Even Rust, my favourite language du jour, only offers a minor improvement on the status quo. These are not stop-the-world everything has changed kinds of improvements.
Still, big improvements are possible. I now use, on a daily basis, a language (Dafny) which often genuinely amazes me. I’ll admit, its not super easy to use and perhaps not ready yet for mainstream — but, it gives a glimpse of what is possible. Dafny’s ability to statically check critical properties of your program goes well beyond what mainstream languages can do (that includes you, Rust). Here’s a simple example:
function Gcd(a:nat, b:nat) : (g:nat,x:int,y:int)
// Bezout's identity
ensures (a*x)+(b*y) == g
{
if a == 0 then (b,0,1)
else
var (g,x,y) := Gcd(b%a,a);
(g,y-((b/a)*x),x)
}
This is a recursive implementation of Euclid’s extended GCD
algorithm
(i.e. an integral part of modern cryptography, including
RSA and
ECC).
Using the ensures
clause I’ve specified a
postcondition — that
is, a property that should always hold of the return value. This
particular postcondition corresponds to Bézout’s
identity.
That is, a theorem about numbers proved by Étienne Bézout in the 18th
century. The amazing thing is that, in order to check my program is
correct, Dafny has reproved this theorem for me.
Bits and Bytes
The above example was taken from our DafnyEVM codebase. Here’s another example which is, perhaps, more concrete:
function ToBytes(v:nat) : (r:seq<u8>)
ensures |r| > 0 {
// Extract the byte
var byte : u8 := (v % 256) as u8;
// Determine what's left
var w : nat := v/256;
if w == 0 then [byte]
else
ToBytes(w) + [byte]
}
The above is a straightforward (recursive) function for converting an
arbitrary-sized unsigned integer (nat
) into a sequence of one or
more bytes (following a big endian representation). Here, ToBytes()
has an ensures
clause clarifying it always returns a non-empty
sequence. Dafny checks at compile time that this is always true.
Now, here is a function taking us in the opposite direction:
function FromBytes(bytes:seq<u8>) : (r:nat) {
if |bytes| == 0 then 0
else
var last := |bytes| - 1;
var byte := bytes[last] as nat;
var msw := FromBytes(bytes[..last]);
(msw * 256) + byte
}
There isn’t anything too special going on here (yet). But, now the magic begins! What we want to know is that these two functions are “opposites” of each other. We can do this like so:
lemma LemmaFromToBytes(v: nat)
ensures FromBytes(ToBytes(v)) == v {
// Dafny does all the work!
}
This might not seem like much, but it represents something you cannot
easily do in other languages. This lemma
asks Dafny to check that,
for any possible nat
value v
, it always holds that
FromBytes(ToBytes(v)) == v
. Again, Dafny checks this at compile
time — no testing required!
Finally, an interesting puzzle remains. For an arbitrary sequence
bs
of bytes, does ToBytes(FromBytes(bs)) == bs
always hold?
Again, Dafny can answer this for us almost immediately. If
you’re interested to know the answer, take a look at the lemma we
came up
with.
Follow the discussion on Twitter