Page diffing resembles page-grain logging, but it reduces the size of the log by recording only the differences between the old and new contents of each page. When a database file is opened, the storage manager creates two memory objects. The storage object is mapped onto the application's address space and caches the up-to-date contents of the file. The shadow object holds old buffer contents. It is not mapped onto any address space; rather, it is used only to group the shadow pages together. Figure 5 shows how the two memory objects are used.
Before each transaction, all MMU mappings for the database region are invalid. In response to a page fault caused by an application store, the storage manager brings the page into the storage object, if necessary. It then allocates a page on the shadow object and copies the contents of the storage object page onto the new page. Finally, the storage object page is mapped onto the application's address space.
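The fault-handling steps above can be sketched as a small simulation. This is only an illustrative model, not Rhino's code: the class `StorageManager`, the method `on_write_fault`, and the use of dictionaries in place of memory objects and MMU mappings are all assumptions made for the example.

```python
class StorageManager:
    """Toy model of the storage and shadow objects (hypothetical names)."""

    def __init__(self, db_file):
        self.db_file = db_file  # page number -> bytes; stands in for the database file
        self.storage = {}       # storage object: up-to-date page contents
        self.shadow = {}        # shadow object: copies taken before modification
        self.mapped = set()     # pages currently mapped into the application

    def on_write_fault(self, page_no):
        # 1. Bring the page into the storage object if it is not cached yet.
        if page_no not in self.storage:
            self.storage[page_no] = bytearray(self.db_file[page_no])
        # 2. Allocate a shadow page and copy the current contents into it.
        self.shadow[page_no] = bytes(self.storage[page_no])
        # 3. Map the storage page so the faulting store can proceed.
        self.mapped.add(page_no)
        return self.storage[page_no]

    def invalidate_all(self):
        # Models "all mappings for the database region are invalid".
        self.mapped.clear()
```

In this model the shadow copy is taken before the page becomes writable, so the shadow always holds the contents the page had when the transaction first touched it.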
Upon commit, the storage manager compares each modified page with its shadow word by word and computes the differences between them. These ``diffs'' are logged as redo records. After commit, all mappings for the database region are invalidated.
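A minimal sketch of the word-by-word comparison follows. The 4-byte word size, the record layout (offset plus new bytes), the merging of adjacent changed words into one run, and the names `diff_pages` and `apply_diff` are assumptions for illustration; the text does not specify how diffs are encoded.

```python
WORD = 4  # assumed word size in bytes


def diff_pages(old: bytes, new: bytes, word: int = WORD):
    """Return (offset, data) runs covering every word where new differs from old."""
    assert len(old) == len(new) and len(old) % word == 0
    records = []
    run_start = None
    for off in range(0, len(new), word):
        changed = old[off:off + word] != new[off:off + word]
        if changed and run_start is None:
            run_start = off                      # a run of changed words begins
        elif not changed and run_start is not None:
            records.append((run_start, new[run_start:off]))
            run_start = None                     # the run ended at this word
    if run_start is not None:                    # run extends to the end of the page
        records.append((run_start, new[run_start:]))
    return records


def apply_diff(page: bytes, records) -> bytes:
    """Replay diff records on a page image, as redo would during recovery."""
    buf = bytearray(page)
    for off, data in records:
        buf[off:off + len(data)] = data
    return bytes(buf)


shadow = bytes(16)
current = bytearray(shadow)
current[4:8] = b"\x01\x02\x03\x04"
current[8] = 9
redo = diff_pages(shadow, bytes(current))
```

Here `redo` contains a single record covering the two adjacent changed words, and `apply_diff(shadow, redo)` reproduces the committed page contents.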
When either a storage page or its shadow is chosen as a pageout victim, and the storage page is modified by some transaction, Rhino computes a ``reverse diff'' of the shadow and the storage page. The reverse diff is logged and flushed as the undo record. Next, the contents of the storage page are written out to the database file. Finally, both the storage page and the shadow page are removed from the memory objects. When the storage page is dirty but is not being modified by a transaction, Rhino writes the page to the database file without generating an undo record.
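The pageout path described above might look like the following sketch. Dictionaries and a list stand in for the memory objects, the database file, and the undo log; the names `diff_words` and `page_out` and the set `in_txn` (pages being modified by an active transaction) are hypothetical, and the log flush is represented only by a comment.

```python
def diff_words(src: bytes, dst: bytes, word: int = 4):
    """Records (offset, bytes) that turn src into dst, one per differing word."""
    return [(off, dst[off:off + word])
            for off in range(0, len(dst), word)
            if src[off:off + word] != dst[off:off + word]]


def page_out(page_no, storage, shadow, in_txn, undo_log, db_file):
    current = bytes(storage[page_no])
    if page_no in in_txn:
        # Reverse diff: the records that restore the old (shadow) contents.
        undo_log.append((page_no, diff_words(current, shadow[page_no])))
        # A real system would force the undo record to stable storage here,
        # before the page itself is written.
    db_file[page_no] = current        # write the storage page to the database file
    storage.pop(page_no)              # drop the storage page
    shadow.pop(page_no, None)         # drop the shadow page, if one exists
```

Applying the logged reverse diff to the page image in the database file yields the page's pre-transaction contents, which is what an undo needs if the transaction later aborts.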
With this design, no undo records are generated as long as the pages accessed by a transaction fit in main memory. Undo records are generated only when buffer pages are evicted.
Page diffing was first proposed in QuickStore [white]. Unlike Rhino, QuickStore does not allow dirty pages to be flushed before commit (a no-steal policy), so it never needs to generate reverse diffs.