The pH prototype is implemented as a patch for the Linux 2.2 kernel, and was developed and tested on systems running a pre-release of the Debian GNU/Linux 2.2 distribution [35]. The modified kernel can monitor every executed system call and records a profile for each executable. An overview of the system is shown in Figure 1.
Program profiles for each executable are stored on disk. Each profile contains both a training and a testing array, and so is actually two ``profiles'' by the terminology in Section 2. When a new executable is loaded via the execve system call, the kernel attempts to load the appropriate profile from disk; if it is not present, a new profile is created. If another process runs the same executable, the profile is shared between both processes. To prevent consistency problems due to interleaving, each executing process maintains its own record of recent system calls (its current sequence). When all processes using a given profile have terminated, the updated profile is written back to disk. A loaded profile consumes approximately 80K of kernel (non-swappable) memory.
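The profile lifecycle described above (load or create on execve, share between processes running the same executable, save when the last user exits) can be illustrated with a small reference-counting sketch. This is a user-space illustration rather than the pH kernel code: the names ph_profile_entry, ph_profile_get, ph_profile_put, the fixed-size table, and the stubbed disk routines are all hypothetical, and the locking a real kernel would need is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PH_MAX_PROFILES 16   /* hypothetical table size */
#define PH_PATH_MAX     256  /* hypothetical path limit */

/* One loaded profile; refcount == 0 marks a free slot. */
struct ph_profile_entry {
    char exe_path[PH_PATH_MAX];
    int  refcount;   /* number of running processes sharing this profile */
    int  is_new;     /* 1 if no profile was found on disk */
};

static struct ph_profile_entry ph_table[PH_MAX_PROFILES];
static int ph_save_count;    /* counts simulated writes to disk */

/* Stand-ins for disk I/O: loading always fails, saving only counts. */
static int  ph_disk_load(struct ph_profile_entry *p) { (void)p; return -1; }
static void ph_disk_save(struct ph_profile_entry *p) { (void)p; ph_save_count++; }

/* Called at execve time: share a loaded profile, or load/create one. */
struct ph_profile_entry *ph_profile_get(const char *exe_path)
{
    struct ph_profile_entry *free_slot = NULL;

    for (int i = 0; i < PH_MAX_PROFILES; i++) {
        struct ph_profile_entry *p = &ph_table[i];
        if (p->refcount > 0 && strcmp(p->exe_path, exe_path) == 0) {
            p->refcount++;               /* another process, same executable */
            return p;
        }
        if (p->refcount == 0 && free_slot == NULL)
            free_slot = p;
    }
    if (free_slot == NULL)
        return NULL;                     /* table full */

    memset(free_slot, 0, sizeof(*free_slot));
    strncpy(free_slot->exe_path, exe_path, PH_PATH_MAX - 1);
    if (ph_disk_load(free_slot) != 0)
        free_slot->is_new = 1;           /* not on disk: start a new profile */
    free_slot->refcount = 1;
    return free_slot;
}

/* Called at process exit: write the profile back once nobody uses it. */
void ph_profile_put(struct ph_profile_entry *p)
{
    assert(p != NULL && p->refcount > 0);
    if (--p->refcount == 0)
        ph_disk_save(p);
}
```

In this sketch, two calls to ph_profile_get with the same path return the same entry, and the simulated save happens only on the second ph_profile_put. Each process's own current sequence is kept outside the shared entry, in line with the per-process record described in the text.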
We modified the system call dispatcher so that it calls a pH function (pH_process_syscall) prior to dispatching the system call. pH_process_syscall implements the monitoring, response, and training logic. pH is controlled through its own system call, sys_pH, which allows the superuser (root) to take the following actions:
More specifically, we extended the Linux task structure (the kernel data structure used to represent processes and kernel-level threads) with a new structure which contains the following fields: the current window of system calls for the task, a locality frame, and a pointer to the current profile. A profile is a structure containing two byte-arrays for storing pairs (the training and testing arrays) and some additional training statistics described in Section 3.
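The per-task state and the profile layout described above might look roughly like the following user-space sketch. Several details are assumptions made for illustration and are not taken from the text: the number of system calls (200), the history length (5 previous calls), the locality frame length (128), the use of one bit per look-back distance within each byte, and the single train_count statistic. The function name ph_process_syscall mirrors pH_process_syscall, but only the monitoring and training steps are sketched; the response logic is omitted.

```c
#include <assert.h>
#include <string.h>

#define PH_NUM_SYSCALLS 200  /* assumed number of system calls */
#define PH_HISTORY      5    /* assumed number of previous calls kept (max 8) */
#define PH_LOCALITY     128  /* assumed locality frame length */

/* Profile: two byte arrays indexed by (current call, previous call).
 * Bit k of an entry means "this pair was seen at distance k + 1". */
struct ph_profile {
    unsigned char train[PH_NUM_SYSCALLS][PH_NUM_SYSCALLS];
    unsigned char test[PH_NUM_SYSCALLS][PH_NUM_SYSCALLS];
    unsigned long train_count;           /* example training statistic */
};

/* State added to each task: window, locality frame, profile pointer. */
struct ph_task_state {
    int window[PH_HISTORY];              /* window[0] is the most recent call */
    int window_len;                      /* valid entries in window */
    unsigned char locality[PH_LOCALITY]; /* 1 = that call was anomalous */
    int locality_pos;                    /* next slot to overwrite */
    int locality_sum;                    /* anomalies currently in the frame */
    struct ph_profile *profile;          /* possibly shared with other tasks */
};

/* Hook run before a system call is dispatched.
 * Returns the number of pairs missing from the testing array. */
int ph_process_syscall(struct ph_task_state *t, int nr)
{
    int mismatches = 0;

    assert(t != NULL && t->profile != NULL);
    assert(nr >= 0 && nr < PH_NUM_SYSCALLS);

    for (int k = 0; k < t->window_len; k++) {
        int prev = t->window[k];
        unsigned char bit = (unsigned char)(1u << k);
        if (!(t->profile->test[nr][prev] & bit))
            mismatches++;                    /* monitoring: unseen pair */
        t->profile->train[nr][prev] |= bit;  /* training: record pair */
    }
    t->profile->train_count++;

    /* Slide the window so that nr becomes the most recent call. */
    memmove(&t->window[1], &t->window[0], (PH_HISTORY - 1) * sizeof(int));
    t->window[0] = nr;
    if (t->window_len < PH_HISTORY)
        t->window_len++;

    /* Update the circular locality frame and its running anomaly count. */
    int anomalous = mismatches > 0;
    t->locality_sum += anomalous - t->locality[t->locality_pos];
    t->locality[t->locality_pos] = (unsigned char)anomalous;
    t->locality_pos = (t->locality_pos + 1) % PH_LOCALITY;

    return mismatches;
}
```

Under these assumed sizes the two arrays occupy 2 x 200 x 200 = 80,000 bytes, which is in line with the approximately 80K quoted for a loaded profile, although the actual array dimensions are not stated here. Keeping the window and locality frame in the task state rather than in the profile is what lets several processes share one profile without interleaving their sequences.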