Check out the new USENIX Web site. next up previous
Next: Implementation Walk-Through Up: Prototype Prover Previous: Prototype Prover

General Approach

The prover that was constructed takes native machine code and executes it using a simple simulator. The implementation currently only supports the x86 instruction set, though there is nothing in this approach that would prevent an implementation for another Instruction Set Architecture (ISA) (and most ISAs will actually be far easier to build an implementation for, but the popularity of the x86 instruction set motivated its use in this prototype). Privileged machine instructions and machine instructions that would not be output by a normal linear-mode x86 compiler (e.g., media instructions and instructions using segmentation) are not allowed in extensions. Floating point is also not allowed because hardware floating point instruction support may not be present on the machine and is not automatically emulated for code inside many kernels. Known values are tracked through the simulated execution of the program using a simple linear representation:

Where is either 0 or an abstract unknown base variable whose concrete value will be supplied by the runtime environment (such as the initial value of the stack pointer, or the pointer to the function's input structure). Preconditions, such as the input values, descriptions of the accessible memory objects (alignment, size, and permissions), and the C calling convention's stack layout, are set up with their values. All other memory and register values in the system are set to ``undefined,'' and the simulation is run. At all return points, postconditions such as register and stack restoration are checked.

Memory accesses are required to be to known values. Given a known value as above and memory objects in a similar representation, it is possible to check whether a memory access will always be in a defined region for which the program has permission to read or write. Known values are tracked on write, so that register spills to the stack do not lose information.

The simplifying assumption that makes the simulator function is that it is always safe to declare a value to be unknown if a known value cannot be easily computed. For example, most bitwise logical instructions yield a result of unknown because the simulator cannot create a linear function representation of their output given a linear function representation of their input. The main exception to this is bitwise AND, which is treated as constraining the linear function's range. This works in practice because most memory accesses are done through linear functions (scale-index-base, as seen in the x86 ISA), so the linear values tracked by the simulator contain enough information to reason about most memory addresses, while the other values used by a program aren't usually important for memory access safety.

The end result is a prover that can take short binaries compiled with gcc and verify their safety. The verified code can then be installed into a running kernel and used for small extension functions in the same way that the BPF virtual machine is used. These functions can be called as C functions through a front-end function, and are guaranteed to return and not to write except where explicitly permitted.


next up previous
Next: Implementation Walk-Through Up: Prototype Prover Previous: Prototype Prover
Craig Metz 2000-05-08