With the introduction of autonomic computing, grid computing, and on-demand computing, there is an increasing need to be able to securely identify the software stack that is running on remote systems. For autonomic computing, you want to determine that the correct patches have been installed on a given system. For grid computing, you are concerned that the services advertised really exist and that the system is not compromised. For on-demand computing, you may be concerned that your outsourcing partner is providing the software facilities and performance that have been stipulated in the service level agreement. Yet another scenario is where you are interacting with your home banking or bookselling web services application and you want to make sure it has not been tampered with.
The problem with the scenarios above is, who do you trust to give you that answer? It cannot be the program itself because it could be modified to give you wrong answers. For the same reason we cannot trust the kernel or the BIOS on which these programs are running, since they may have been tampered with too. Instead we need to go back to an immutable root to provide that answer. This is essentially the secure boot problem [1], although for our scenarios we are interested in an integrity statement of the software stack rather than ensuring compliance with respect to a digital signature.
The Trusted Computing Group (TCG) has defined a set of standards [2] that describe how to take integrity measurements of a system and store the result in a separate trusted coprocessor (Trusted Platform Module) whose state cannot be compromised by a potentially malicious host system. This mechanism is called trusted boot. Unlike secure boot, this system only takes measurements and leaves it up to the remote party to determine the system's trustworthiness. The way this works is that when the system is powered on it transfers control to an immutable base. This base will measure the next part of BIOS by computing a SHA1 secure hash over its contents and protect the result by using the TPM. This procedure is then applied recursively to the next portion of code until the OS has been bootstrapped.
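The measure-then-extend chain described above can be sketched as follows. This is a minimal illustration, not the TCG reference procedure: the function names (`measure`, `extend`, `trusted_boot`) and the sample stage contents are hypothetical, and the TPM register is modeled as a plain 20-byte value. The sketch assumes the usual extend rule, in which the new register value is the SHA1 hash of the old value concatenated with the new measurement.

```python
import hashlib
from typing import Sequence

PCR_SIZE = 20  # length of a SHA1 digest in bytes


def measure(code: bytes) -> bytes:
    """Compute the SHA1 measurement of one boot stage."""
    return hashlib.sha1(code).digest()


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Fold a measurement into the register: new = SHA1(old || measurement)."""
    return hashlib.sha1(pcr + measurement).digest()


def trusted_boot(stages: Sequence[bytes]) -> bytes:
    """Measure each stage before it would run and return the final register value."""
    pcr = bytes(PCR_SIZE)  # assumed to start as all zeros at power-on
    for code in stages:
        pcr = extend(pcr, measure(code))
        # In a real system, control would be transferred to `code` here.
    return pcr


# Hypothetical stage contents, for illustration only.
final = trusted_boot([b"bios", b"loader", b"kernel"])
print(final.hex())
```

Because each new value depends on the previous one, the final register value commits to both the content and the order of every stage, and a stage cannot erase its own measurement after it has been recorded.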
The TCG trusted boot process is composed of a set of ordered sequential steps and is only defined up to the bootstrap loader. Conceptually, we would like to maintain the chain of trust measurements up to the application layer, but unlike the bootstrap process, an operating system handles a large variety of executable content (kernel, kernel modules, binaries, shared libraries, scripts, plugins, etc.) and the order in which the content is loaded is seemingly random. Furthermore, an operating system almost continuously loads executable content, and measuring the content at each load time incurs a considerable performance overhead.
The system that we describe in this paper addresses these concerns. We have modified the Linux kernel and the runtime system to take integrity measurements as soon as executable content is loaded into the system, but before it is executed. We keep an ordered list of measurements inside the kernel. We change the role of the TPM slightly and use it to protect the integrity of the in-kernel list rather than holding measurements directly. To prove to a remote party what software stack is loaded, the system needs to present the TPM state using the TCG attestation mechanisms and this ordered list. The remote party can then determine whether the ordered list has been tampered with and, once the list is validated, what kind of trust it associates with the measurements. To minimize the performance overhead, we cache the measurement results and eliminate future measurement computations as long as the executable content has not been altered. The modifications we made to the Linux system were minimal, about 4000 lines of code.
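The interplay between the ordered list, the TPM-protected aggregate, and the measurement cache can be sketched as follows. This is a simplified model under stated assumptions, not our kernel implementation: the class and function names are hypothetical, the TPM register is a plain 20-byte value, a caller-supplied `version` number stands in for whatever mechanism detects that content has been altered, and the signed TPM quote used in TCG attestation is omitted.

```python
import hashlib


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Assumed extend rule: new = SHA1(old || measurement)."""
    return hashlib.sha1(pcr + measurement).digest()


class MeasurementList:
    """Hypothetical model of the in-kernel list protected by one TPM register."""

    def __init__(self) -> None:
        self.entries = []     # ordered (name, digest) pairs
        self.pcr = bytes(20)  # stand-in for the TPM register
        self.cache = {}       # name -> (version, digest)
        self.hash_count = 0   # number of SHA1 computations actually performed

    def on_load(self, name: str, content: bytes, version: int) -> bytes:
        """Measure content at load time, before it would be executed."""
        cached = self.cache.get(name)
        if cached is not None and cached[0] == version:
            return cached[1]  # unaltered since last measurement: skip hashing
        digest = hashlib.sha1(content).digest()
        self.hash_count += 1
        self.cache[name] = (version, digest)
        self.entries.append((name, digest))
        self.pcr = extend(self.pcr, digest)
        return digest


def validate(entries, reported_pcr: bytes) -> bool:
    """Remote party: replay the list and compare with the reported register."""
    pcr = bytes(20)
    for _name, digest in entries:
        pcr = extend(pcr, digest)
    return pcr == reported_pcr


ml = MeasurementList()
ml.on_load("/bin/sh", b"shell-v1", version=1)
ml.on_load("/bin/sh", b"shell-v1", version=1)  # cache hit, no new hash
ml.on_load("/bin/sh", b"shell-v2", version=2)  # altered, measured again
print(len(ml.entries), ml.hash_count, validate(ml.entries, ml.pcr))
```

In this model, the remote party first checks that replaying the list reproduces the register value, and only then decides how much trust to place in each individual digest. A list with entries removed, reordered, or rewritten fails the replay check.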
Our enhancement keeps track of all the software components that are executed by a system. The number of unique components is surprisingly small and the system quickly settles into a steady state. For example, the workstation used by this author which runs RedHat 9 and whose workload consists of writing this paper, compiling programs, and browsing the web does not accumulate more than 500 measurement entries. On a typical web server the accumulated measurements are about 250. Thus, the notion of completely fingerprinting the running software stack is surprisingly tractable.
Contributions: This paper makes the following contributions:
Outline: Next, we introduce the structure of a typical run-time system, for which we will establish an integrity-measurement architecture throughout this paper. In Section 3, we present related work in the area of integrity-protecting systems and attestation. In Sections 4 and 5, we describe the design of our approach and its implementation in a standard Linux operating environment. Section 6 describes experiments that highlight how integrity breaches are made visible by our solution when validating measurement lists. It also summarizes run-time overhead. Finally, Section 7 sketches enhancements to our architecture that are being implemented or planned. Our results show that our architecture is efficient, scales with regard to the number of elements, successfully recognizes integrity breaches, and offers a valuable platform for extensions and future experiments.