Accurate prediction of which blocks will be invalidated soon is the key to the success of our strategy. We examined both the temporal and the spatial locality of file access patterns. File system accesses show strong temporal locality: many files are overwritten again and again within a short period of time. For example, Hartman and Ousterhout [7] pointed out that 36%-63% of data would be overwritten within 30 seconds and 60%-95% within 1000 seconds in the system they measured. In 2000, Roselli et al. [14] pointed out that file accesses obey a bimodal distribution pattern: some files are written repeatedly without being read; other files are almost exclusively read. Data that have been actively written should therefore be put into active segments, and all other data into inactive segments.
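As an illustration of the temporal heuristic, the following minimal sketch labels a block as active when it is rewritten shortly after its previous write. The class name `TemporalClassifier`, the 30-second window, and the choice to treat a first write as inactive are assumptions made for this example only; they are not taken from the system described here.

```python
from dataclasses import dataclass

# Hypothetical threshold: a rewrite within this many seconds counts as "active".
ACTIVE_WINDOW_SECONDS = 30.0


@dataclass
class WriteHistory:
    last_write: float      # time of the most recent write, in seconds
    write_count: int = 1   # number of writes seen so far


class TemporalClassifier:
    """Illustrative classifier based only on how recently a block was last written."""

    def __init__(self, window=ACTIVE_WINDOW_SECONDS):
        self.window = window
        self.history = {}  # block_id -> WriteHistory

    def record_write(self, block_id, now):
        """Record a write at time `now` and return 'active' or 'inactive'."""
        prev = self.history.get(block_id)
        if prev is None:
            # Assumption: with no history, the block is treated as inactive.
            self.history[block_id] = WriteHistory(last_write=now)
            return "inactive"
        rewritten_recently = (now - prev.last_write) <= self.window
        prev.last_write = now
        prev.write_count += 1
        return "active" if rewritten_recently else "inactive"
```

A block written at time 0 and again at time 10 would be labeled active on the second write; if it is next written at time 100, it falls back to inactive because the gap exceeds the window.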
File system accesses also show strong spatial locality, as many data blocks are accessed together. For example, the data blocks of one file are likely to be changed together. Similarly, when a file block is modified, the inode of the file, together with the data blocks and the inode of the directory containing the file, is also likely to be updated. These blocks should therefore be grouped together according to their semantic relationship, so that when one block is invalidated, all or most of the other blocks in the same segment are invalidated as well.
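The spatial grouping can be sketched in a similar way. The example below packs blocks that share an owner (for instance, the data blocks and inode of one file) into the same segment where possible. The `Block` type, its `owner` label, and the `pack_segments` function are hypothetical names introduced for illustration; a real implementation would derive the grouping from file system metadata.

```python
from collections import defaultdict
from typing import NamedTuple


class Block(NamedTuple):
    block_id: int
    owner: str  # hypothetical label: the file whose update dirtied this block


def pack_segments(blocks, segment_size):
    """Group blocks by owner, then fill fixed-size segments group by group."""
    groups = defaultdict(list)
    for block in blocks:
        groups[block.owner].append(block.block_id)

    segments = []
    current = []
    # Dicts preserve insertion order, so owners are visited in first-seen order.
    for owner in groups:
        for block_id in groups[owner]:
            current.append(block_id)
            if len(current) == segment_size:
                segments.append(current)
                current = []
    if current:
        segments.append(current)  # final, possibly partial segment
    return segments
```

With a segment size of 3, five blocks that alternate between owners `"a"` and `"b"` end up with the three `"a"` blocks in one segment and the two `"b"` blocks in the next, so invalidating one file's blocks tends to empty a whole segment.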