Actually, our system chooses a target uncompressed cache size; the corresponding overall effective cache size is then computed from the number of page frames left for the compressed cache, multiplied by an estimate of the compression ratio for recently compressed pages. This means that the statistics kept by our adaptivity mechanism are not exact: our past information may contain hits that fall in a different recency region than the one indicated by the current compressibility estimate. Nevertheless, this does not seem to matter much in our simulations; approximate statistics about which pages have been touched how recently are quite sufficient. This suggests that our system is not sensitive to the details of the replacement policy used for the uncompressed cache; any normal LRU approximation should work well (e.g., a clock algorithm using reference bits, a FIFO-LRU segmented queue using kernel page traps, or a RANDOM-LRU segmented queue using TLB miss handlers).
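The effective-size computation described above can be sketched as follows. This is an illustrative reconstruction, not code from the system itself; the function name and parameters are assumptions, and the compression-ratio estimate would in practice be derived from recently compressed pages.

```python
def effective_cache_size(total_frames, target_uncompressed_frames,
                         compression_ratio_estimate):
    """Estimate the overall effective cache size, in page frames.

    Frames not devoted to the uncompressed cache hold compressed pages;
    each such frame holds roughly `compression_ratio_estimate` pages'
    worth of data (e.g., a ratio of 2.0 means recently compressed pages
    have shrunk to about half their original size on average).
    """
    compressed_frames = total_frames - target_uncompressed_frames
    return target_uncompressed_frames + compressed_frames * compression_ratio_estimate

# With 100 frames, 60 kept uncompressed, and a 2:1 compression ratio,
# the 40 compressed frames hold about 80 pages, for an effective size
# of roughly 140 pages.
print(effective_cache_size(100, 60, 2.0))
```

Note that because the ratio is only an estimate, the recency region that a given effective size corresponds to shifts as compressibility changes, which is exactly why the statistics are approximate.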
The overheads of updating the statistics, performing the cost/benefit analyses, and adaptively choosing a target split are low: just a few hundred instructions per uncompressed cache miss, provided the LRU list is implemented as a tree with an auxiliary table (a hash table or a sparse, page-table-like structure).
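One way such a tree-plus-table structure could work is sketched below. This is our own illustrative reconstruction, not the paper's implementation: a Fenwick (binary indexed) tree over timestamp slots counts live pages, so a page's LRU depth (how many pages were touched more recently) is a prefix-sum query, and an auxiliary hash table maps each page to its current slot. Both operations cost O(log n) memory references, consistent with the few-hundred-instruction overhead cited above.

```python
class RecencyTracker:
    """Sketch of an LRU list kept as a tree with an auxiliary hash table.

    `max_touches` bounds the number of touch() operations; a real
    implementation would periodically compact or rebuild the tree
    rather than sizing it for the whole run.  All names here are
    illustrative assumptions.
    """

    def __init__(self, max_touches):
        self.size = max_touches
        self.tree = [0] * (max_touches + 1)  # Fenwick tree over slots
        self.slot_of = {}                    # hash table: page -> slot
        self.next_slot = 1

    def _update(self, i, delta):
        # Add `delta` to slot i and propagate up the implicit tree.
        while i <= self.size:
            self.tree[i] += delta
            i += i & (-i)

    def _prefix(self, i):
        # Sum of slots 1..i.
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & (-i)
        return s

    def touch(self, page):
        """Move `page` to the most-recently-used position."""
        old = self.slot_of.get(page)
        if old is not None:
            self._update(old, -1)            # vacate the old position
        slot = self.next_slot
        self.next_slot += 1
        self.slot_of[page] = slot
        self._update(slot, +1)

    def depth(self, page):
        """LRU depth of `page`: 0 for the most recently touched page."""
        slot = self.slot_of[page]
        # Count live pages whose slots are newer than this page's slot.
        return self._prefix(self.size) - self._prefix(slot)
```

On a miss, `depth()` tells the adaptivity mechanism which recency region the missed page falls in, which is precisely the (approximate) statistic the cost/benefit analysis needs.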