Building an "impossible" verifier on a Java Card

Damien Deville, deville@lifl.fr, https://www.lifl.fr/~deville
Gilles Grimaud, grimaud@lifl.fr, https://www.lifl.fr/~grimaud
Abstract

Java is a popular development platform for mobile code systems. It ensures application portability and mobility for a variety of systems, while providing strong security features. The intermediate code (byte code) allows the virtual machine to verify statically (during the loading phase) that the program is well-behaved. This is done by a software security module called the byte code verifier. Smart cards that provide a Java Virtual Machine, called Java Cards, are not supplied with such a verifier because of its complexity. Alternatives are being studied to provide the same functionality outside the card. In the present paper, we propose to integrate the whole verifier inside the smart card. This makes the smart card entirely autonomous, which allows full realization of the potential of smart cards as pervasive computing devices. Our verifier uses a specialized encoding and a software cache with a variety of cache policies to adapt to the hardware constraints of smart cards. Our experimental results confirm the feasibility of implementing such a security system in a smart card.

Keywords: smart card, Java Card, static verification, type inference, secure embedded system.
Introduction

This paper presents the motivations and techniques used to develop a stand-alone verifier inside a smart card. The verification process is one of the most important parts of the security of a Java Virtual Machine. In a JVM, this process is performed while a new application is loaded into the virtual machine. Due to smart card constraints, a verifier as defined by Sun is said to be impossible to embed in a Java Card Virtual Machine (JCVM). ``Bytecode verification as it is done for Web Applet is a complex and expensive process, requiring large amount of working memory, and therefore believed to be impossible to implement on a smart card'' [12]. ``Clearly, bytecode verification is not realizable on a small system in its current form. It is commonly assumed that downloading bytecode to a JAVACARD requires that verification is performed off-card'' [18]. Nevertheless, we present here some solutions to obtain a stand-alone verifier in a JCVM. We first outline the standard verification process, in order to introduce the reader to standard verification principles. Next we present the degraded variants of this process that have been designed to cope with supposed smart card constraints. We then show that these constraints are not actually a problem by detailing the properties of the smart card hardware used by our stand-alone verifier. We then present our approach, which consists of hardware-specific adaptations that efficiently match the standard algorithm to these hardware characteristics. We do not change the standard verification algorithm, but instead propose some innovative techniques that allow an improved execution on top of the limited smart card hardware. In the last part we give some experimental results extracted from our prototype. Our prototype subsequently became a proof of concept of an embedded smart card verifier for our industrial partner.
Verification principles

First we present the standard verification process performed by the Java Virtual Machine in a conventional implementation. This verification process consists of performing an abstract interpretation of the byte code that composes each method of an application. The aim is to statically check that the control flow and the data flow do not generate errors (underflow or overflow of the stack, variable used with an invalid type, ...). This algorithm is called type inference and its concepts were first introduced in [10]. The paper [11] gives an overview of the whole Java verification process. The Java Card Virtual Machine (JCVM [3]) is a stack-based machine that uses particular registers named local variables. The Java specification [14] imposes some constraints on the byte code generated by compilers so that it can be verified. Because of these constraints, the type of each variable used by the program can be inferred. We use stack map to mean a structure giving the type of each variable at a particular point of the program. Java programs are not linear; one can jump to a particular instruction from various sources. Thus we can have different hypotheses for the types of the variables; we need to merge all these hypotheses into one, i.e. to find the type that is compatible with all of them. This operation is called unification and is noted ⊔ (least upper bound). For classes, for example, it corresponds to the inheritance relation. We give, in the next paragraph, the standard version of the verification algorithm [14]. We also use it later on for the description of our implementation techniques for smart cards. The standard algorithm assumes that each instruction has an associated stack map, a TOS (top of stack) pointer, and one changed/unchanged bit; the initialization and the main fixed-point loop are sketched below.
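As a reading aid, the fixed-point iteration can be sketched as follows in Java-like code. This is a minimal illustration of the standard worklist algorithm described above, not our implementation; Frame, Instruction and their methods are hypothetical simplifications of the real structures.

// Sketch of the standard verification loop.
// Frame = one stack map (types of locals + operand stack) for an instruction.
// merge() unifies two frames (LUB) and reports whether any type changed.
void verifyMethod(Instruction[] code, Frame entryFrame) {
    Frame[] frames = new Frame[code.length];
    boolean[] changed = new boolean[code.length];

    // Initialization: only the first instruction is marked "changed"; its
    // frame is built from the method signature (arguments in the locals,
    // empty operand stack).
    frames[0] = entryFrame;
    changed[0] = true;

    // Main loop: pick a changed instruction, abstractly interpret it, and
    // propagate the resulting frame to all its successors.
    boolean progress = true;
    while (progress) {
        progress = false;
        for (int pc = 0; pc < code.length; pc++) {
            if (!changed[pc]) continue;
            changed[pc] = false;
            progress = true;

            // May throw a VerifyError if the instruction cannot be applied
            // to the types recorded in frames[pc].
            Frame out = code[pc].abstractInterpret(frames[pc]);

            for (int succ : code[pc].successors()) {
                if (frames[succ] == null) {
                    frames[succ] = out.copy();
                    changed[succ] = true;
                } else if (frames[succ].merge(out)) {  // unification changed a type
                    changed[succ] = true;
                }
            }
        }
    }
    // Fixed point reached: every frame is stable and every instruction was
    // checked against its frame, so the method is well-typed.
}

The loop terminates because unification can only move types upward in the lattice, so each frame can change only a bounded number of times.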
Java Cards and verifiers

The standard verification algorithm is usually presented as being impossible to embed in a smart card, due to hardware limitations. Instead, the verification process is simplified and adapted in order to fit on small devices. In the next parts we present existing solutions that guarantee a good security level in a standard Java Card. In a regular Java architecture, the produced application is verified while being loaded into the virtual machine. The byte code verifier is defined by the JVM specification [14] as the basic component of safety and security in a standard JVM. Currently, Sun's Java Card is built upon a split VM scheme: one part is called the ``off-card VM''; the other one is called the ``on-card VM''. As the on-card VM cannot load the CLASS file format because of its complexity, a converter is used to produce CAP (Converted APplet) files, a format better suited to smart card constraints (no runtime linking, easy to link on load, ready-to-use class descriptors [20], ready-to-run byte code). While converting the class file, verification is performed in order to ensure that the produced CAP file is coherent with the CAP file structure definition. Smart cards are considered secure devices, and because of the lack of on-card verification it is currently impossible to download a new application to a card after it has been issued to an end-user. Some solutions have been proposed in order to achieve a good security level on the Java Card, within smart card constraints, without using an embedded verifier.
In the next part we give some information about the hardware of smart cards, in order to explain the difficulties of having a stand-alone verifier on board.
Smart card constraints

Smart cards have some hardware limitations due to ISO constraints [9] that are mainly defined to enforce smart card tamper resistance. We now focus on each of the hardware features of the smart card, starting with the microprocessor, then the memory, and ending with the I/O.
A fully embedded Java Card byte code verifier?

The byte code verifier is said to be impossible to implement inside a smart card; the impossibility is attributed to its high algorithmic complexity and to its large memory consumption. In this part, we focus on these constraints, and give solutions that make on-board verification possible despite the hardware limits described previously.
Time and space complexity

The elementary operation we use is the interpretation of one byte code working on a single variable. Complexity is bounded by O(D × N × B × V), where D is the depth of the type lattice, N is the number of instructions of the verified program, B is the maximum number of jumps (branching instructions), and V is the maximum number of variables used. In the worst case, all instructions are jumps (B = N). By analyzing all the different byte codes of Java Card we can find the one that manipulates the highest number of variables. Let k represent this number; thus V ≤ k × N in the worst case. We can also state that we need one instruction for creating a new type in our lattice, thus D ≤ N. Finally, time complexity is O(k × N^4). Hence, type inference is polynomial in the number of analyzed instructions. Let us now evaluate the memory that is needed for performing type inference. We need the type information for each label (destination of a branch) of our program. The size of each of them is the number of local variables plus the maximal stack size. We also need one more frame for the current one. Thus the memory we need is (B + 1) × (S + L) × t, where B is the number of branching instructions and/or exception handlers, t the size needed to encode one type, S the stack size, and L the number of local variables. Practically, we can easily require more than 3 KB of RAM. Thus we need to store the proof in persistent memory. Doing this we need to be aware of the stressing problem and also of the slowness of writes to persistent memory.
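To give an order of magnitude for this space bound, the following Java sketch computes it for one method; the figures are invented but typical of a mid-sized method, not measurements from our test pool.

// Worked example of the space bound (B + 1) * (S + L) * t for one method.
// The figures below are illustrative only.
public class StackMapMemory {
    public static void main(String[] args) {
        int branches  = 20;  // B: branch targets and exception handlers
        int stackSize = 8;   // S: maximal operand stack depth (slots)
        int locals    = 16;  // L: number of local variables
        int typeSize  = 2;   // t: bytes used to encode one type

        int bytes = (branches + 1) * (stackSize + locals) * typeSize;
        // (20 + 1) * (8 + 16) * 2 = 1008 bytes for this single method;
        // larger methods easily exceed the few hundred bytes of free RAM.
        System.out.println("stack maps need about " + bytes + " bytes");
    }
}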
Our solutions

We propose to use persistent memory (i.e. EEPROM or Flash RAM) for storing the stack maps of the current method. Nevertheless, the amount of memory needed is smaller than that of a PCC verifier [2], which needs to store the proof for every method in persistent memory. Our strategy consists of using RAM to hold the entire stack map whenever possible, and, when that is not possible, of using it as a cache for persistent memory. The first challenge is the bad property of persistent memory related to cell stressing. We propose the usage of a non-stressing type encoding. The second challenge is to find the best cache policy for our problem.
A non stressing type encoding

In order to solve the problem of persistent memory stress, we propose to find a good encoding of the type lattice used during type inference. The goal is to use non-stressing writes when storing a type in persistent memory. The paper [8] presents examples of type lattice encodings that are used in compilation or in databases. It proposes techniques to perform a bijection between lattice operations and Boolean operations. The described encodings allow finding the Least Upper Bound (LUB) of two elements using a composition of logical functions. Type unification consists of finding the LUB of two types. Typing information sometimes needs to be stored in persistent memory because of its size, and persistent memory has a stressing problem. We observe that when unifying two types, the result type is higher in the type lattice, so we would like to be able to move upward in the lattice while only changing bits from 1 to 0. Such an encoding causes no stress on persistent memory, and unification is reduced to a logical AND, which has the property of only clearing bits (1 to 0 transitions). As non-stressing writes take less time than stressing ones, unification also goes faster.
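A minimal, self-contained illustration of such an encoding for a single-inheritance hierarchy is sketched below; the classes and bit masks are invented for the example. Each class is encoded by the bits of all its ancestors plus one bit of its own, so the LUB of two classes is the bitwise AND of their encodings, and any move upward in the lattice only turns bits from 1 to 0.

// Illustrative encoding of a small single-inheritance hierarchy.
// Hypothetical hierarchy: Object <- Applet <- {Purse, Loyalty}.
public class LatticeEncoding {
    static final int OBJECT  = 0b0001;
    static final int APPLET  = 0b0011;  // Object's bits | own bit
    static final int PURSE   = 0b0111;  // Applet's bits | own bit
    static final int LOYALTY = 0b1011;  // Applet's bits | own bit

    // Unification = least upper bound = bitwise AND of the encodings;
    // the result has a subset of the input bits, so only 1 -> 0 transitions
    // are ever written to persistent memory.
    static int unify(int a, int b) {
        return a & b;
    }

    public static void main(String[] args) {
        System.out.println(unify(PURSE, LOYALTY) == APPLET);  // true
        System.out.println(unify(PURSE, APPLET)  == APPLET);  // true
        System.out.println(unify(PURSE, OBJECT)  == OBJECT);  // true
    }
}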
We could find a more compact encoding (i.e. using less memory) by using a more complex Boolean function, but this would take less advantage of the persistent memory properties. In our prototype, dynamically downloaded classes do not benefit from the encoding, as this would require recomputing the whole type encoding. However, our experiments show that complex unification (i.e. unification of two classes producing a class that is not java.lang.Object) happens very rarely. In fact, smart card software engineering implies specific coding rules that reduce the number of different classes used at the same time. Accordingly, we do not try to optimize these cases; we simply accept the minor performance degradation. This choice has no effect on security. Experimentally, we found that less than 1% of unifications are performed between two classes. All these techniques reduce computation time and ensure a non-stressing usage of persistent memory, but they do not deal with the latency of write operations.
Efficient fixed points using a software cache policy

As described earlier, type inference is an iterative process that stops when a fixed point is reached. Our type encoding decreases the persistent memory stress, but we can obtain even better results by maximizing RAM usage. The aim is to maximize the size of the data we can store and work on in RAM. For example, some specific methods extracted from our application test pool need more than 4 KB of memory for the verification process. As RAM is faster than persistent memory, but much scarcer on a smart card, we use a software cache for working on typing information. Such a technique can speed up the verification of huge (13 KB) applets as it greatly decreases the number of writes to persistent memory.
In order to perform type inference, we need to find the targets of branching instructions. Knowing the branching instructions and their targets, we can build (at low cost) the control flow graph of the verified program. Then, when verifying a particular code section, the control flow graph tells us which code sections it can jump to. If we look at the example given in Figure 2, when verifying the last code section (the one beginning with Label 3), we can see that none of the other code sections are accessible. The control flow graph is used for driving the eviction of data from the cache. Moreover, we can sometimes extract from the graph the property that a label is not reachable from any later point of the program. In this case, eviction performs no write to persistent memory, as the type information will not be used anymore. In practice, this happens frequently in conventional Java Card code. This eviction strategy significantly increases the speed of stand-alone byte code verification.
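The following sketch illustrates this control-flow-graph-guided eviction; all names are invented for the example, the reachability test stands for a query on the control flow graph, and the persistent write is a placeholder, not our actual cache code.

import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a stack-map cache driven by the control flow graph: a frame
// whose label can no longer be reached from the code still to be verified
// will never be read again, so evicting it needs no slow EEPROM write-back.
class StackMapCache {
    interface Reachability {  // hypothetical view of the control flow graph
        boolean stillReachable(int label, int currentPc);
    }

    private final LinkedHashMap<Integer, byte[]> ram =
            new LinkedHashMap<>(16, 0.75f, true);  // access order = LRU fallback
    private final int capacity;
    private final Reachability cfg;

    StackMapCache(int capacity, Reachability cfg) {
        this.capacity = capacity;
        this.cfg = cfg;
    }

    void store(int label, byte[] frame, int currentPc) {
        if (!ram.containsKey(label) && ram.size() >= capacity) {
            evictOne(currentPc);
        }
        ram.put(label, frame);
    }

    private void evictOne(int currentPc) {
        // Prefer a victim that the verifier can never reach again:
        // dropping it is free, no write to persistent memory is needed.
        for (Iterator<Map.Entry<Integer, byte[]>> it = ram.entrySet().iterator(); it.hasNext(); ) {
            Map.Entry<Integer, byte[]> e = it.next();
            if (!cfg.stillReachable(e.getKey(), currentPc)) {
                it.remove();
                return;
            }
        }
        // Otherwise fall back to least-recently-used and pay the slow write.
        Map.Entry<Integer, byte[]> lru = ram.entrySet().iterator().next();
        persistentWrite(lru.getKey(), lru.getValue());
        ram.remove(lru.getKey());
    }

    // Placeholder for the EEPROM/Flash write routine of a real verifier.
    private void persistentWrite(int label, byte[] frame) { /* ... */ }
}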
Experiments

We have implemented a prototype that includes each of the techniques we have mentioned. This prototype shows that the stand-alone byte code verifier is feasible in a smart card. Moreover, we have used common benchmarks for evaluating its performance against a system using a PCC verifier.
Our prototype

We have implemented the concepts and techniques described in the earlier parts of this article on an AVR 3232SC smart card microprocessor from ATMEL. This chip includes 32 KB of persistent memory, 32 KB of ROM, and a maximum of 768 bytes of RAM. The amount of available RAM highly depends on the state of the verifier and of the virtual machine. Practically, in most cases, using less than 512 bytes of RAM does not hurt the efficiency of stand-alone byte code verification. The type inference algorithm has an important advantage: we can compute in advance the amount of memory needed to verify a particular method of an application. We propose different algorithms and heuristics: we have implemented different cache policies that each suit a particular amount of RAM (an ``all in RAM'' policy, an LRU policy, a more evolved LRU, and finally the one described earlier that uses the control flow graph of the verified program). We also use some heuristics for selecting the current label to verify. These heuristics have different properties in terms of performance. We choose a heuristic dynamically in order to fit the amount of working memory. We give in Table 5 the size of the important parts of our verifier.
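As an illustration of how the precomputed memory requirement can drive the choice of policy, here is a hedged sketch; the thresholds and policy names are invented for the example and are not those of the prototype.

// Illustrative policy selection: compute, before verification starts, how
// much memory a method's stack maps will need, then pick the cheapest cache
// policy that fits the RAM currently available.
enum CachePolicy { ALL_IN_RAM, LRU, LRU_PLUS, CFG_GUIDED }

final class PolicyChooser {
    static CachePolicy choose(int branches, int stackSize, int locals,
                              int typeSize, int availableRam) {
        // Space bound from the complexity analysis: (B + 1) * (S + L) * t.
        int needed = (branches + 1) * (stackSize + locals) * typeSize;

        if (needed <= availableRam)     return CachePolicy.ALL_IN_RAM; // no cache needed
        if (needed <= 2 * availableRam) return CachePolicy.LRU;        // few evictions expected
        if (needed <= 4 * availableRam) return CachePolicy.LRU_PLUS;
        return CachePolicy.CFG_GUIDED;  // large methods: use the control flow graph
    }
}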
We have a total of 30 KB of code for our stand-alone type verifier. The remaining part, which is not described in Table 5, consists of routines that deal with the CAP file format, I/O, and the RAM and EEPROM allocators. We did not tune this implementation for our proof of concept, and we would expect a production version to be both smaller and faster. For comparison, a standard JCVM represents 20 KB of code; its API represents 60 to 100 KB; and a huge user application is about 13 KB (these measures are extracted from products of our industrial partner).
Practical metrics

We give in Table 6 the duration in milliseconds of the loading phase and of the verification, for three applets whose sizes range from 3 KB to 4.5 KB. The PCC verifier is the one described in [2]. The loading phase for the PCC verifier is longer than the one for the stand-alone verifier (by 20% to 30%), because it needs to load and write the entire proof into persistent memory. The overhead of the proof in a PCC verifier is reported to be between 20 and 30% in related work. The time taken to write the proof into persistent memory before verification could be avoided by performing the verification on the fly. Then the only difference between a PCC and a stand-alone verifier would be the extra loading time of the proof. Of course, the techniques described earlier could also be used in the PCC verifier to reduce the time needed to use the proof (on-the-fly verification, non-stressing encoding to speed up operations on types, ...). We should also recall that the off-card generation of the proof for a PCC verifier has a cost in terms of deployment tools and also in terms of time. Thus PCC as a model is more expensive than a stand-alone verifier. If we look at the global time taken for verification, we can point out that stand-alone verification is not slower than PCC verification, which has a linear complexity. In some particular cases, stand-alone verification can even be faster. Smart card applications are often simple in terms of generated byte code, so stand-alone verification is nearly linear. This table shows that the extra time needed for stand-alone verification (with our technical approach) is often less than that needed for downloading and storing the PCC proof. Some experimental results of a verifier using code transformation were presented in [13], but with only few details. Nevertheless, we are in the same order of magnitude for verification execution time (1 second for 1 KB of code). We did not have access to smart cards using other verification strategies, so we could not compare them with our prototype. But a cryptographic signature implies extra time to check the basic CAP file signature after download (for example, a smart card usually performs an RSA operation in something like 10 ms per 64 bytes). The solution using a digital signature is also the worst solution in terms of infrastructure, as it assumes a trusted third party to sign applications.

Conclusion

We have shown that careful attention to the smart card hardware allows us to integrate a stand-alone verifier on a smart card. Stand-alone verification is no longer a mismatch with smart card constraints. The usage of hardware-specific techniques allows the stand-alone verifier to be as efficient as a PCC one.
Bibliography
This paper was originally published in the Proceedings of the 2nd Workshop on Industrial Experiences with Systems Software (WIESS '02), December 8, 2002, Boston, MA, USA.