Dr. David J. Pearce

Actors on the JVM


The Actor Model is an interesting alternative to the standard threading model used in languages like Java. Its been around for a while, but Erlang has recently brought it into the mainstream. Roughly speaking, an actor corresponds to a Thread, except that actors do not share state. Actors communicate by sending messages, and Synchronisation is implicit because an actor can only process one message at a time. Thus, a message is queued until the actor is free and can process it.

There are several examples of actors being implemented on top of the JVM, including Kilim (see also [1][2][3]) and Erjang (see also [1][2][3]). These systems implement actors on top of JVM threads in a many-to-one fashion, where several actors can map onto a single thread at any given moment. This requires the ability to interrupt a thread at certain points, in order to allocate it to another actor — in otherwords, time-slicing several actors across a thread. Unfortunately, the JVM does not provide any explicit mechanism for restarting threads at different bytecode locations, or for saving their state — that is, it does not provide support for arbitrary continuations.

To work around the limitations of the JVM, systems like Kilim and Erjang intersperse special “pause” points throughout methods executed by actors. A pause point is essentially a point where the executing thread checks whether or not it’s time to switch to a different actor: if so, the thread “backs out” of the current method by unwinding the stack, saving all necessary state (essentially, the contents of local variables) as it goes. Once this is completed, the thread chooses another actor to execute, and “unpauses” it by winding its saved state back onto the stack.

The following recursive method illustrates the main ideas:

int height(Node n) {
 int left = height(n.left());
 int right = height(n.right());

 if(left < right) { return right; }
 else { return left; }
}

A system like Kilm will rewrite this method to insert pause points, producing something (very roughly) as follows:

Object height(Node n, StackFrame stack) {
 Frame frame = null;
 int pc = 0;
 if(!stack.isEmpty()) {
  frame = stack.pop();
  pc = frame.pc();
 }
 switch(pc) {
 case 0:
   Object tmp = height(n.left(),stack);
   if(tmp instanceof StackFrame) {
    stack.put(new StackFrame(0));
    return stack;
   }
   frame.set(0,tmp);
 case 1:
   int left = (Integer) frame.get(0);
   Object tmp = height(n.right(),stack);
   if(tmp instanceof StackFrame) {
    stack.put(new StackFrame(1,left));
    return stack;
   }
   frame.set(1,tmp);
 case 2:
   int right = (Integer) frame.get(1);
   if(left < right) { return right; }
   else { return left; }
 }
}

Overall, it’s pretty messy! In practice, there’s quite a bit more to it, including several optimisation opportunities. But, hopefully, you get the picture — we’re essentially dividing the method up into points that we can quickly restore from.

There are quite a few questions left here. In particular, where do we put the pause points? Kilim, for example, always puts them at calls sites to other methods (in fact, it uses a special annotation to indicate which methods are @pausable). This approach means, for example, that a loop could contain no pause points and, hence, must run to completion without being interrupted (and, as with nonpremptive multitasking, if that loop does not terminate then neither will the actor). This seems less than ideal, but at the same time there doesn’t seem to be a better alternative.

Scheduling

Putting aside the question of where to locate pause points, there’s another issue which has been vexing me for a while:  how do we gain from this form of concurrency? It seems to me there are two possibilities:

My conclusion from thinking about this, is that there’s little or no advantage from blocking actors unless they either: explicitly invoke an I/O operation, or send a message which cannot immediately be processed.  So, it’s not really multi-tasking in the way that I think about multi-threading — it’s quite different, and much more specific.

Comments welcome!