Formalising the Ethereum Virtual Machine in Dafny
Since starting at ConsenSys, the main project I have been involved with is a formalisation of the Ethereum Virtual Machine in Dafny called the “DafnyEVM”. Our goals share some similarity with Runtime Verification’s KEVM project. We want a formal semantics of the EVM which is executable so that, for example, we can test it against the Ethereum common tests. Using Dafny, however, offers some advantages over the K framework:
(Verification) With Dafny, we can verify security properties over EVM bytecode directly. For example, we can verify simple properties (e.g. no arithmetic overflow/underflow) as well as more complex properties (e.g. that an invariant between fields of the contract always holds).
(Documentation) Our Dafny formalisation gives a concise and readable description of the Ethereum Virtual Machine. In my opinion, it compares well against the Yellow Paper and the official Python spec (with the caveat that the DafnyEVM remains a work-in-progress).
(Compilation) In principle, we could develop a compiler for a high-level language using the DafnyEVM which guarantees that generated bytecode is correct. A good example of this is the CompCert compiler developed for the C language.
Whilst the DafnyEVM remains a work-in-progress, it can already pass a large number of the Ethereum common tests. So, let’s take a look inside!
Machine State
An executing EVM contains various key components of the executing state such as gas, pc, stack, code, memory, and worldstate. Roughly speaking, we implement this in Dafny like so:
datatype ExecutingEvm = EVM(
gas: nat,
pc: nat,
stack: Stack,
code: Code,
mem: Memory,
world: WorldState,
...
)
On top of this, we have a notion of the machine state which maybe an
executing EVM (as above), or a terminated EVM (e.g. having executed a
RETURN
or REVERT
instruction, or failed with some kind of error):
datatype State =
EXECUTING(evm: ExecutingEvm)
| REVERTS(gas:nat, data:seq<u8>)
| RETURNS(gas:nat, data:seq<u8>, ...)
| INVALID(Error)
| ...
Here, the REVERTS
and RETURNS
states include their RETURNDATA
as
a sequence of bytes (i.e. u8
) along with any gas
returned to the
caller. To simplify our code, we can define a new type which captures
the notion of an executing EVM as follows:
type ExecutingState = st:State | st.EXECUTING?
Variables of type ExecutingState
have type State
with the
additional constraint that they are instances of State.EXECUTING
.
Bytecode Semantics
There are over 140 bytecode instructions supported in the Ethereum Virtual Machine. These include simple arithmetic operations, memory reading/writing, storage reading/writing, contract calls, etc. The semantics (i.e. meaning) of every instruction is formalised in the DafnyEVM.
As a first example, here is our formalisation of the ADD
instruction
(opcode 0x01
):
function Add(st: ExecutingState): (st': State)
// Execution either continues or halts with stack underflow
ensures st'.EXECUTING? || st' == INVALID(STACK_UNDERFLOW)
// Execution always continues if at least two stack operands
ensures st'.EXECUTING? <==> st.Operands() >= 2
// Execution reduces stack height by one
ensures st'.EXECUTING? ==> st'.Operands() == st.Operands() - 1
{
if st.Operands() >= 2
then
var lhs := st.Peek(0) as int;
var rhs := st.Peek(1) as int;
var res := (lhs + rhs) % TWO_256;
st.Pop().Pop().Push(res as u256).Next()
else
INVALID(STACK_UNDERFLOW)
}
This function takes an executing EVM state, and produces an updated
state (which may or may not be still executing). The function
requires at least two operands on the stack, otherwise execution halts
with a stack underflow exception. Furthermore, the lhs
(i.e. left-hand side) is at offset 0
from the top of the stack,
whilst the rhs
is at offset 1
. The two operands are then added
together modulo TWO_256
to ensure the result fits into a u256
.
Thus, we can see from this that the EVM does not automatically catch
arithmetic overflow for us.
Our semantics above for the ADD
instruction illustrates another
feature of Dafny. That is, we can add postconditions for our
functions to capture salient properties which we want to check. Often
these properties correspond to parts of the Yellow Paper which can
help make clear the connection between them. Whilst these are not
strictly required (technically speaking, because we’re using
function
s here), they can add a form of useful (and checked)
documentation of what we can expect.
As a second example, consider the semantics given for the MLOAD
instruction (i.e. opcode 0x51
):
function MLoad(st: ExecutingState): (st': State)
// Execution either continues or halts with stack underflow
ensures st'.EXECUTING? || st' == INVALID(STACK_UNDERFLOW)
// Execution always continues if at least one stack operands
ensures st'.EXECUTING? <==> st.Operands() >= 1
// Execution does not affect stack height
ensures st'.EXECUTING? ==> (st'.Operands() == st.Operands())
{
if st.Operands() >= 1
then
var loc := st.Peek(0) as nat;
// Expand memory as necessary
var nst := st.Expand(loc,32);
// Read from expanded state
nst.Pop().Push(nst.Read(loc)).Next()
else
INVALID(STACK_UNDERFLOW)
}
In this case, only one operand on the stack is required (i.e. the
address in memory to load from). The purpose of the Expand(loc,32)
is to ensure sufficient memory to store a u256
word at address
loc
. This also illustrates some benefits from using a language like
Dafny. The Read
function is defined like so:
function method Read(address:nat) : u256
requires (address + 31) < Memory.Size(evm.memory) {
...
}
Thus, if the Expand(loc,32)
call above were omitted or specified
incorrectly (e.g. as Expand(loc,31)
) then Dafny would flag an error
for us at compile time. Of course, Read()
could automatically
expand memory for us — but, making explicit the places where memory
expansion can occur helps ensure expansion is done correctly.
Verification Example
Using the DafnyEVM we can prove properties of interest on bytecode sequences. An example proof is the following:
method AddBytes(x: u8, y: u8) {
// Initialise an EVM.
var st := InitEmpty(gas:=1000);
// Execute three bytecodes
st := Push1(x);
st := Push1(y);
st := Add();
// Check top of stack is sum of x and y
assert st.Peek(0) == (x as u256) + (y as u256);
}
This proof roughly states that: for any two byte values x
and y
on the stack, ADD
returns their sum. Observe that, since x
and
y
are bytes, overflow is not possible and Dafny recognises this. If
x
and y
had type u256
then Dafny would flag the potential
arithmetic overflow for us at compile time.
As a second (and more interesting) example, let’s consider a more complete contract:
// Load counter on stack
0x00: PUSH1 0x0
0x02: SLOAD
// Increment by one
0x03: PUSH 0x1
0x05: ADD
// Check for overflow
0x06: DUP1
0x07: PUSH1 0xf
0x09: JUMPI
// Overflow, so revert
0x0a: PUSH1 0x0
0x0c: PUSH1 0x0
0x0e: REVERT,
// No overflow
0x0f: JUMPDEST
// Write back
0x10: PUSH 0x0
0x12: SSTORE
// Done
0x13: STOP
This contract maintains a counter at storage location 0
which is
incremented by one on every contract call. The contract should revert
if an overflow occurs. Thus, we should be able to prove that whenever
the contract is called: (i) either the counter is incremented; or
(ii), the contract reverts. Using the DafnyEVM, the proof of this
looks (roughly speaking) as follows:
method IncProof(st: ExecutingState) returns (st': State)
// Initial state has PC == 0 and an empty stack
requires st.PC() == 0 && st.Operands() == 0
// Assume there is enough gas
requires st.Gas() >= 40000
// Success guaranteed if can increment counter
ensures st'.RETURNS? <==> (st.Load(0) as nat) < MAX_U256
// If success, counter incremented
ensures st'.RETURNS? ==> st'.Load(0) == (st.Load(0) + 1) {
var nst := st;
// Load counter
nst := Push1(nst,0x0);
nst := SLoad(nst);
// Increment it by one
nst := Push1(nst,0x1);
nst := Add(nst);
// Check for overflow
nst := Dup(nst,1);
nst := Push1(nst,0xf);
nst := JumpI(nst);
// Case analysis on outcome
if nst.Peek(0) == 0 {
// Overflow occurred
assert nst.PC() == 0xa;
nst := Push1(nst,0x0);
nst := Push1(nst,0x0);
nst := Revert(nst);
assert nst.REVERTS?;
} else {
// No overflow!
assert nst.PC() == 0xf;
nst := JumpDest(nst);
// Write back
nst := Push1(nst,0x0);
nst := SStore(nst);
nst := Stop(nst);
assert nst.RETURNS?;
}
return nst;
}
The proof is quite long, but it essentially follows the control-flow
of the original contract. The nice thing is we can reuse Dafny
language constructs (e.g. if
and assert
) to help manage the proof.
More importantly, Dafny checks at compile time that the various
assert
statements and ensures
clauses are always true (i.e. that
they hold for all possible inputs which meet the requires
conditions).
Conclusion
The DafnyEVM represents an interesting formalisation of the Ethereum Virtual Machine. This allows us to show strong correctness properties for arbitrary bytecode sequences. Whilst its not yet finished, it is certainly showing promise. However, a key challenge lies in scaling up the proofs to handle larger real-world contracts (and that is something I will talk more about in the future).