Check out the new USENIX Web site. next up previous
Next: File Access Locality Up: Introduction Previous: Motivation

Our New Scheme

Based on this observation, we propose a new method called WOLF (reordering Write buffer Of Log-structured File system) that can dramatically reduce the garbage collection overhead. Instead of using one segment buffer, we use two or more segment buffers(here is two), as shown in Figure 2. When write data arrives, the system sorts them into different buffers according to their expected longevity. Active data are grouped into one buffer, while less-active data are grouped into the other buffer. When the buffers are full, two buffers are written into two disk segments using two large disk writes (one write for each buffer).

Figure 2: Our new scheme-WOLF
\includegraphics[width=3.2in, height=2.1in]{New_Scheme.eps}

Because data are sorted into active and inactive segments before reaching the disk, garbage collection overhead is drastically reduced. Since active data are grouped together, most of an active segment will be quickly invalidated (sometimes the entire segment will be invalidated, and the segment can be reused right away without garbage collection). On the other hand, very few data blocks in an inactive segment will be invalidated, resulting in few holes. The outcome is that data on the disk have a bimodal distribution, namely segments are either mostly full or mostly empty. Similar to Rosenblum and Ousterhout's analysis [15], this is an ideal situation. In a bimodal distribution, segments tend to be nearly empty or nearly full, but few segments are in between. The cleaner can select many nearly empty segments to clean and compact their data into a small number of segments. The old segments are then freed, resulting in a large number of available empty segments for future use. Furthermore, there is no need to waste time to clean the nearly-full segments.

Basically, while previous researchers agreed that the cleaner plays one of the most important roles in LFS, their work focused only on making the cleaner more efficient after data are written onto the disk. We believe that there exists another opportunity to improve the LFS performance. By re-organizing data in RAM before they reach the disk, we could also make the system do less garbage collection work. Traditional LFS did try to separate active data from inactive data and force a bimodal distribution, but only during the garbage collection period, long after files are written to the disk. Our simulation shows that significant performance gain can be obtained by applying our new method.


next up previous
Next: File Access Locality Up: Introduction Previous: Motivation
Jun Wang 2001-10-31