Dr. David J. Pearce

Formalising a Simple Virtual Machine

Since starting my new role at ConsenSys, I have become interested in formalising virtual machines. For example, there are already some formalisations of the Ethereum Virtual Machine (e.g. in K, Lem, Coq). So, I figured it would be interesting to explore VM formalisation in a tool like Whiley (or similarly Dafny).


In this post, I’m going to formalise a Simple Virtual Machine (SVM) which has only a few instructions. The goal is just to illustrate the technique and show that, using the formalisation, we can prove interesting properties.

The machine state includes a stack of u16 words and a memory store of u16 words. In addition, we have a program counter (pc) and a stack pointer (sp), along with a sequence of bytecodes. To illustrate, here are descriptions for a few example bytecodes:

NOPNo-operationPC += 1
PUSH wPush (immediate) word.. ⟹ w ..PC += 2
POPPop word from stack.. w ⟹ ..PC += 1
STORE kStore word in memory.. w ⟹ ..data[k] = wPC += 2
LOAD kLoad word from memory.. ⟹ w ..w = data[k]PC += 2
ADDAdd words.. v w ⟹ u ..PC += 1
where u = (v+w) % 0x10000
SUBSubtract words.. v w ⟹ u ..PC += 1
where u = (v-w) % 0x10000
MULMultiply words.. v w ⟹ u ..PC += 1
where u = (v*w) % 0x10000
DIVDivide words.. v w ⟹ u .. if w != 0PC += 1
where u = v/w
JMP kUnconditional Branchpc += 2+k
JZ kConditional Branchv ⟹ ..pc += 2 if v!=0
pc += 2+k if v==0

This provides a fairly typical description for a bytecode machine (e.g. similar in style to that for the JVM). Each bytecode typically has an effect on the stack, and may also update memory. The effect on the stack is described as a rewrite popping zero or more elements on the stack, and pushing zero or more elements on the stack. For example, the effect .. w ⟹ .. for POP states that, for the bytecode to execute, there must be at least one word w on the stack and, afterwards, that word is removed (and everything else remains the same). Likewise, the memory effect for STORE is that the word w is written to address k in data. Finally, for completeness, note that (like most programming languages) integer division is non-Euclidean (i.e. rounds towards zero).

Whilst the above description is fairly clear and easy to follow, there are still some questions. For example, what happens if an address k is out of bounds for the data store? Likewise, can the stack ever be full? We might also be interested in checking whether an optimisation preserves the program’s semantics (e.g. that PUSH 0; POP is equivalent to an empty code sequence). With these questions in mind, it is useful to try and formalise our description in a machine-readable (and checkable) manner. To do this, I am of course going to use Whiley. However, it is worth pointing out that other systems could easily be used as well, (e.g. Dafny, the K framework, etc).


We begin the formalisation process by defining the notion of a machine state as follows:

type SVM is {
   u16 pc,
   u16 sp,
   u16[] data,
   u16[] stack,
   u8[] code
where sp <= |stack| && |stack| < 0xFFFF 
where |code| <= 0xFF00

This defines the essential components of our virtual machine. For simplicity, the code sequence is an array of bytes separate from the data section (i.e. we’re not following a Von Neumann Architecture). I’ve also imposed some additional constraints: (1) on sp and the maximum stack size; (2) on the maximum code size. In principle, we could limit the data size as well.

Before formalising individual bytecodes, its useful to have a notion of the machine being halted. This is just helpful to signal error states, and we can define it as follows:

property isHalted(SVM st) -> (bool r):
  return st.pc >= |st.code|
property exitCode(SVM st) -> (u8 r)
requires isHalted(st):
  return |st.code| - st.pc

Basically, whenever the pc is passed the end of the code size, then the machine has “halted”. Once it reaches this state execution is finished. Furthermore, we’ll exploit the final pc value to determine an exit code, such that exitCode(vm) == 0 indicates execution completed normally (more on this later).


We now formalise the semantics of each bytecode. That is, specify how each bytecode should execute. A simple example is the following:

property evalPOP(SVM st) -> (SVM nst):
  if st.sp < 1:
    return halt(st, STACK_OVERFLOW)
    return pop(st)

This specifies that evaluating a POP bytecode requires at least one element on the stack, otherwise the machine halts (with exit code STACK_OVERFLOW). If there is one element, then it is popped off. This uses two helpers halt and pop, defined as follows:

property pop(SVM st) -> SVM
requires st.sp > 0:
   return st{sp:=st.sp-1}

property halt(SVM st) -> SVM:
   return st{pc:=|st.code|}

Whilst these could have been written inline, I find it helpful to have descriptive names for the operations. I typically refer to these low-level building blocks as microcodes and, as for a physical machine, we’ll see these are reused a lot in defining the semantics of our bytecodes. Observe the precondition for pop() is that the stack cannot be empty. Also, the expression st{sp:=st.sp-1} returns st with field sp updated to sp-1 and all others unchanged.

Another example to illustrate is the following:

property evalADD(SVM st) -> (SVM nst):
  if st.sp < 2:
    return halt(st, STACK_UNDERFLOW)
    // Read operands
    u16 r = peek(st,1)
    u16 l = peek(st,2)
    u16 v = (l + r) % 0x10000
    // done
    return push(pop(pop(st)),v)

Here, the right-hand side r is taken from the first (i.e. topmost) stack item, whilst the left-hand side l is from the second stack item. Furthermore, the addition itself must be modulo 0x10000 (i.e. 65536) in order that the value assigned to v remains within bounds. Indeed if the modulo operation was left out, then the verifier would highlight a potential integer overflow (and this is where tools like Whiley really help). Again, our definition above uses some more microcodes:

property push(SVM st, u16 k) -> SVM
requires st.sp < |st.stack|:
  SVM nst = st{stack:=st.stack[st.sp:=k]}
  return nst{sp:=st.sp+1}

property peek(SVM st, int n) -> u16
requires 0 < n && n <= st.sp:
  return st.stack[st.sp - n]

This time, the precondition for push() requires that the stack cannot be full. Likewise, for peek(), we must have enough items on the stack to cover the one we’re after.

Finally, whilst I’ve only illustrated the semantics for a few example bytecodes above, you can see them all in the repo.


The next piece of the jigsaw is bringing the semantics of all bytecodes together into one method responsible for executing the next bytecode. For this, we have eval():

// Execute a "single step" of the machine.
property eval(SVM st) -> (SVM res)
requires !isHalted(st):
  u8 opcode = st.code[st.pc]
  // increment pc
  SVM nst = st{pc:=st.pc+1}
  // Decode opcode
  if opcode == NOP:
    return evalNOP(nst)
  else if opcode == POP:
    return evalPOP(nst)
    // Force machine to halt
    return halt(nst)

There are a few things to note about this: Firstly, it is rather ugly having this large sequence of if / else if statements. It would be better if we could at least use a switch statement. Unfortunately, at the moment, Whiley does not support this (though in the future it should). Secondly, the machine is set to halt() whenever an unknown instruction is encountered. Finally, a precondition for eval() is that the machine has not already halted.

Something Useful!

Right, having specified our simple virtual machine the question is: what can we do with it? Well, the first and most obvious thing would be to generate code from it to give us a reference implementation. But, there are some other things we can do as well. For example, we can prove safety properties of bytecode sequences (e.g. that they don’t unexpectedly halt), or that certain compiler optimisations are sound, etc.

Example 1

As a first example to illustrate, consider the following:

method test_add_01():
  SVM m1 = execute(
     [LDC,2,LDC,1,ADD], // code
     [], // data
     1024) // stack size
  // Check expected output.
  assert exitCode(m1) == OK
  assert peek(m1,1) == 3

This method is statically verified by Whiley. However its pretty simplistic since (in this case) it does not depend on any unknown (i.e. symbolic values). So, there isn’t any benefit to static verification over just executing the test.

We can make our example more interesting by introducing an unknown value as follows:

method test_add_02(u16 x):
  SVM m1 = execute(
  // Check expected output.
  assert exitCode(m1) == OK
  assert peek(m1,1) > x

Here, x is an arbitrary value and we initialise our memory with x at address 0. Thus, the above computes x + 1. However, attempting to statically verify this with Whiley produces an error:

main.whiley:5:assertion may not hold
  assert peek(m1,1) > x

The problem is that the ADD may overflow around to 0. That’s right! Our assertion doesn’t hold for the instruction sequence above. We can test this hypothesis by restricting our unknown value as follows:

method test_add_02(u16 x):
requires x < 65535:

This essentially prevents the overflow case and, with that, the assertion will now statically verify.

Example 2

As another example, lets consider the case for division.

method test_div_01(u16 x, u16 y):
  SVM m1 = execute(
  // Check expected output.
  assert y == 0 || exitCode(m1) == OK

This statically verifies because our final assertion is quite weak. Let’s assume we wanted something stronger. That is, a code sequence which would execute division safely for all values of x and y. Clearly, the above is not there yet. We need a check against y to handle when y == 0. Here’s one solution:

method test_div_02(u16 x, u16 y):
  SVM m1 = execute(
  // Check expected output.
  assert exitCode(m1) == OK

This introduces a conditional check using the conditional branch instruction, JZ. To keep the example short, the above just returns x when y == 0 (but in principle it can do whatever we want). The above now statically verifies with Whiley. Just think about that for a second. It means we’ve proved (for all input values x and y) that the above code sequence always executes to completion on our simple virtual machine!


We’ve formalised a very simple virtual machine in Whiley, and then proved some useful properties about (albeit small) bytecode sequences. In fact, there’s a lot more we could do with this tool, such as proving optimisations preserve a program’s meaning, etc. But, we can save those discussions for another day!