In order to make discussion of the implementation more concrete, Figures 1-4 give an example of a short real-world filter program that was used extensively in the development of the prototype prover. First, a short libpcap filter program shown in Figure 1 was used as a starting point. It was chosen to be a short example of how BPF is typically used. The tcpdump program was used as a front-end to libpcap's internal compiler and optimizer, which generated the BPF program shown in Figure 2.
Then the C program shown in Figure 3 was constructed by hand, attempting to be as faithful to the contents of the BPF program as possible. However, the C implementation uses abstractions for protocol headers and address information for the sake of readability and to show that programs that might be written by a real programmer compile down to analyzable code. The C program adds some run-time bounds checking that is not present in the BPF instructions - the BPF virtual machine does these checks implicitly, while a native program must do them explicitly.
Finally, the C program was compiled with the GNU C compiler into a native x86 program whose assembly listing is shown in Figure 4.
|
|
The first thing that the prover does is load the code of the program to be proven into a memory buffer. This requires the program to parse the i386 ELF binary object format, which is currently done in a reasonable but minimal fashion. Only the .text segment is loaded, which is assumed to contain exactly one C function, and no relocation is performed. The program loader is currently linked into the same binary as the prover, but is otherwise decoupled from it. In practice, the loader would be separated from the prover and would be an untrusted component (for example, a part of the user process).
Next, the prover initializes preconditions. The register pre- and post-conditions are listed in Figure 5, and are generic to the x86 C calling convention. The first four general purpose registers are initially set to ``undefined,'' which is a special value in the prover that indicates that these cannot be read from until changed to a defined value (to prevent unknown or unintended values from being available to the program). The last four general purpose registers are initially set to abstract variables representing their initial value. This allows the program to generate values as offsets from their initial values (such as stack pointer relative addresses, which need to be resolved to an address relative to the initial value of the stack pointer) and it also allows the prover to check as a postcondition that the initial value has been faithfully restored when the program returns.
The memory preconditions are listed in Figure 6, and are specific to the problem of a BPF filter function with the function signature seen in Figure 3. Different functions will require different memory preconditions. The current prover implementation allows these to be changed easily. Memory words are defined to contain each input parameter, and the input packet buffer that is passed to the program is also defined. The input packet buffer is currently defined as a fixed-length buffer in order to greatly simplify the prover's operation; it is much easier to reason about falling within a known bound than an unknown bound (though this would be a useful feature to add in the future). At run-time, the caller would need to copy packets into a buffer of 8192 bytes in order for this to be safe. The input parameters are all treated as read-only because there is no legitimate need for them to be modifiable. In order to have some available scratch space, twelve words of stack space is set aside for read/write local variables. This scratch space is easily increased by defining more such words in the prover, but arbitrary amounts of temporary space are not supported. There are no memory postconditions; values that are read-only are never writable and thus need no postcondition checking, and values that are read-write are considered mutable.
Register Precondition Postcondition %eax - %edx (undefined) n/c %esp sp_0 sp_0 %ebp bp_0 bp_0 %esi si_0 si_0 %edi di_0 di_0 |