Equivalent heap states can be identified by transforming a heap state to a canonical representation shared by all equivalent heaps. Informally, the objects in the heap along with their pointers form a heap graph. Equivalent heaps have the same underlying graph structure. A canonicalization algorithm works by relocating each object to its canonical location and modifying the contents of all pointers to the object to reflect its new location.
The previously known algorithm [22] does not scale to large
heaps. This algorithm requires processing large portions of the heap,
which can reduce the speed of state space
exploration (§4.2). Specifically, it performs a depth
first traversal of the heap graph, and the canonical location of each
object depends on its depth first ordering. Even small changes to the
heap, such as an object allocation or a deletion, can change the depth
first ordering for a large number of objects. This forces CMC to traverse
and compute the hash for large portions of the heap, as shown in
Table 3. During model checking, the heap, on average,
consists of objects that total
KB. Note that the heap also
includes non-TCP related data structures allocated by the Linux kernel. A
TCP transition, on an average modifies
of these objects. However, this
change requires CMC to recompute the hash value of almost half the objects
in the heap.
Our contribution is an improved, incremental heap canonicalization
algorithm. We briefly describe this algorithm; interested readers can
refer [28] for more details. The incremental
algorithm generates the canonical location of a heap object from the
shortest path of the object to some global variable in the heap
graph. When a transition makes small changes to the heap structure, the
shortest path of most objects is likely to remain the
same [19,31]. After a transition, CMC recomputes the hash
only for those objects whose shortest paths have changed in a
transition. This algorithm works well in practice, as shown in
Table 3; in most cases CMC traverses only the objects that
are modified in a transition. This improves the performance of CMC by
as indicated in Table 2.