SableVM's interpreter is a threaded interpreter. Pure bytecode interpreters suffer from expensive dispatch costs: on every iteration, the dispatch loop fetches the next bytecode, looks up the associated implementation address in a table (explicitly, or through a switch statement), then transfers control to that address. Direct threading[20] reduces this overhead: in the executable code stream, each bytecode is replaced by the address of its associated implementation. In addition, each bytecode implementation ends with the code required to dispatch the next opcode. This is illustrated in figure 2. This technique eliminates the table lookup and the central dispatch loop (thus eliminating a branch instruction to the head of the loop). As these operations are expensive on modern processors, this technique has been shown to be quite effective[20,27].
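
The contrast between the two dispatch styles can be sketched in C. This is an illustrative sketch, not SableVM's code: the three-instruction stack machine, the `cell` type, and the functions `run_switch`, `run_threaded`, and `translate` are all hypothetical names introduced here. The threaded version relies on the GNU C "labels as values" extension (`&&label` and `goto *expr`), so it assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-instruction stack machine, for illustration only. */
enum { OP_PUSH, OP_ADD, OP_HALT, OP_COUNT };

/* Switch-based dispatch: every iteration fetches an opcode, selects the
   implementation through the switch, then branches back to the loop head. */
static int run_switch(const int *code)
{
    int stack[16];
    int sp = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH: assert(sp < 16); stack[sp++] = *code++; break;
        case OP_ADD:  assert(sp >= 2); sp--; stack[sp - 1] += stack[sp]; break;
        case OP_HALT: assert(sp >= 1); return stack[sp - 1];
        default:      assert(0 && "bad opcode"); return 0;
        }
    }
}

/* One slot of the threaded code stream: an implementation address
   or an inline operand. */
typedef union { void *addr; int value; } cell;

/* Direct-threaded dispatch (GNU C extension). When labels_out is non-NULL,
   the function only exports its label addresses so that a translator
   can build the threaded code stream. */
static int run_threaded(const cell *ip, void **labels_out)
{
    static void *const labels[OP_COUNT] = { &&do_push, &&do_add, &&do_halt };
    int stack[16];
    int sp = 0;

    if (labels_out) {
        for (int i = 0; i < OP_COUNT; i++)
            labels_out[i] = labels[i];
        return 0;
    }

/* Each implementation ends by jumping straight to the next one:
   no table lookup, no central loop. */
#define NEXT() goto *(ip++)->addr
    NEXT();
do_push:
    assert(sp < 16);
    stack[sp++] = (ip++)->value;
    NEXT();
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
#undef NEXT
}

/* Replace each opcode by the address of its implementation;
   operands are copied through unchanged. */
static void translate(const int *code, size_t n, cell *out)
{
    void *labels[OP_COUNT];
    run_threaded(NULL, labels);
    for (size_t i = 0; i < n; i++) {
        int op = code[i];
        out[i].addr = labels[op];
        if (op == OP_PUSH) {
            i++;
            out[i].value = code[i];
        }
    }
}
```

Running both interpreters on the same program (for example `OP_PUSH 2, OP_PUSH 40, OP_ADD, OP_HALT`, after passing it through `translate` for the threaded version) should yield the same result; only the dispatch mechanism differs.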
Method bodies are translated to threaded code on their first invocation. We take advantage of this translation to do some optimizations. For example, we precompute absolute branch destinations, we translate overloaded bytecodes like the GET_FIELD instruction to separate implementation addresses (GET_FIELD_INT, GET_FIELD_FLOAT, ...), and we inline constant pool references to direct operand values.
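
A translation pass of this kind might look like the following C sketch. It is a simplified illustration under stated assumptions, not SableVM's translator: the bytecode set, the `cp_entry` layout, and the `translate` function are hypothetical; every instruction is assumed to occupy one opcode slot plus one operand slot, so bytecode offsets carry over unchanged to the threaded stream; and an enum value stands in for each implementation address to keep the sketch portable. Only the names GET_FIELD_INT and GET_FIELD_FLOAT come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bytecodes; each takes exactly one operand slot. */
enum bytecode { BC_LDC, BC_GET_FIELD, BC_GOTO };

/* Specialized implementations. A real direct-threaded interpreter would
   store code addresses here; enum values are a portable stand-in. */
enum impl {
    IMPL_PUSH_INT,
    IMPL_GET_FIELD_INT,
    IMPL_GET_FIELD_FLOAT,
    IMPL_GOTO_ABS
};

enum field_type { FIELD_INT, FIELD_FLOAT };

/* Simplified constant pool entry (assumed layout). */
struct cp_entry {
    int32_t int_value;     /* used by BC_LDC */
    enum field_type type;  /* used by BC_GET_FIELD */
    int32_t field_offset;  /* used by BC_GET_FIELD */
};

/* Translate `len` bytecode slots into threaded code in a single pass. */
static void translate(const int32_t *code, size_t len,
                      const struct cp_entry *pool, intptr_t *out)
{
    for (size_t pc = 0; pc + 1 < len; pc += 2) {
        int32_t operand = code[pc + 1];
        switch (code[pc]) {
        case BC_LDC:
            /* Inline the constant: no pool lookup at run time. */
            out[pc] = IMPL_PUSH_INT;
            out[pc + 1] = pool[operand].int_value;
            break;
        case BC_GET_FIELD:
            /* Pick the implementation that matches the field's type. */
            out[pc] = pool[operand].type == FIELD_INT
                          ? IMPL_GET_FIELD_INT
                          : IMPL_GET_FIELD_FLOAT;
            out[pc + 1] = pool[operand].field_offset;
            break;
        case BC_GOTO:
            /* Turn the relative offset into an absolute destination. */
            out[pc] = IMPL_GOTO_ABS;
            out[pc + 1] = (intptr_t)pc + operand;
            break;
        default:
            assert(0 && "unknown bytecode");
        }
    }
}
```

Because the type test, the pool lookup, and the offset arithmetic all happen once at translation time, the specialized implementations no longer need to repeat them on every execution.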
This one-pass translation is much simpler than the translation done by even the most naive just-in-time compiler, as each bytecode maps to an address, not a variable-sized implementation. However, unlike a JIT, the threaded interpreter still pays the cost of an instruction dispatch for each bytecode. Piumarta[27] has shown a technique that eliminates this overhead within a basic block, using selective inlining in a portable manner, at the cost of additional memory (see footnote 6). SableVM implements this technique optionally through a compile-time flag, as it might not be appropriate for systems with little memory.