Our SCA support was integrated into FiST [29,25]. The FiST system includes portable stackable file system templates for several operating systems as well as a high-level language that can describe new stackable file systems [26,28]. Most of the work went into the stackable templates, where we added 2119 non-comment lines of C code to support SCAs, a 60% increase in the size of the templates. Because this additional code is substantial and carries an overhead that non-size-changing file systems do not need (Section 7), we made it optional: we added one declaration to the FiST language that lets developers decide whether to include SCA support.
To use FiST to produce a size-changing file system, developers need to include a single FiST declaration in their input file and then write only two routines: encode_data and decode_data. The main advantage of using FiST for this work is ease of use for developers who want to write size-changing file systems. All the complexity is placed in the templates and is mostly hidden from developers' view. Developers need only concentrate on the core implementation issues of the particular algorithm they wish to use in their new file system.
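As an illustration of how small such a pair of routines can be, the following sketch implements a toy size-changing algorithm that expands every input byte into two hexadecimal characters. The function signatures, the return convention, and the helper hex_value are assumptions made for this example; the actual FiST templates define their own interfaces for encode_data and decode_data.

```c
#include <stddef.h>

/*
 * Hypothetical signatures, chosen for this sketch only.
 * Both routines return the number of output bytes, or -1 on error.
 */

/* Encode: each input byte becomes two hex characters (output is 2x input). */
long encode_data(const unsigned char *in, size_t in_len,
                 unsigned char *out, size_t out_cap)
{
    static const char hex[] = "0123456789abcdef";
    size_t i;

    if (out_cap / 2 < in_len)
        return -1;                      /* output buffer too small */
    for (i = 0; i < in_len; i++) {
        out[2 * i]     = (unsigned char)hex[in[i] >> 4];
        out[2 * i + 1] = (unsigned char)hex[in[i] & 0x0f];
    }
    return (long)(2 * in_len);
}

/* Map one lowercase hex character to its value, or -1 if invalid. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Decode: every pair of hex characters becomes one byte (output is 1/2 input). */
long decode_data(const unsigned char *in, size_t in_len,
                 unsigned char *out, size_t out_cap)
{
    size_t i;

    if (in_len % 2 != 0 || out_cap < in_len / 2)
        return -1;                      /* malformed input or small buffer */
    for (i = 0; i < in_len / 2; i++) {
        int hi = hex_value(in[2 * i]);
        int lo = hex_value(in[2 * i + 1]);

        if (hi < 0 || lo < 0)
            return -1;                  /* not a hex digit */
        out[i] = (unsigned char)((hi << 4) | lo);
    }
    return (long)(in_len / 2);
}
```

The point of the sketch is that the two routines only transform buffers and report the resulting length; everything related to offsets, the index file, and caching would remain the templates' responsibility.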
The FiST system has been ported to Linux, Solaris, and FreeBSD. Current SCA support is available for Linux 2.3 only. Our primary goal in this work was to prove that size-changing stackable file systems can be designed to perform well. When we feel that the design is stable and addresses all of the algorithmic issues related to the index file, we will port it to the other templates. We would then be able to describe an SCA file system once in the FiST language; from this single portable description, we could then produce a number of working file systems.
There are two implementation-specific issues of interest: one concerning Linux and the other regarding writes in the middle of files. As mentioned in Section 3, we write any modified index information out when the main file is closed and its data flushed to stable media. In Linux, neither data nor meta-data is automatically flushed to disk. Instead, a kernel thread (kflushd) runs every 5 seconds and asks the page cache to flush any file system data that has not been used recently, but only if the system needs more memory. In addition, file data is forced to disk when the file system is unmounted or when a process explicitly calls sync(2) or fsync(2); note that fflush(3) alone only moves buffered stdio data into the kernel and does not force it to disk. We take advantage of this delayed write to improve performance, since we write the index table only when the rest of the file's data is written.
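The deferred index write described above can be modeled with a dirty flag: updates touch only the in-memory table, and the table is written once when the file's data is flushed. The sketch below is a user-level model under stated assumptions, not the kernel code; the structure index_table, its fields, and the functions index_update and index_flush are hypothetical names, and the write to the index file is simulated by a counter.

```c
#include <stddef.h>

#define MAX_PAGES 16    /* arbitrary limit for this sketch */

/* Hypothetical in-memory index: end offset of each encoded page. */
struct index_table {
    unsigned long end_off[MAX_PAGES];
    size_t npages;
    int dirty;          /* set when the in-memory copy differs from disk */
    int writes_issued;  /* counts simulated writes of the index file */
};

/* Record the new end offset of one encoded page; no I/O happens here. */
void index_update(struct index_table *t, size_t page, unsigned long end)
{
    if (page >= MAX_PAGES)
        return;
    t->end_off[page] = end;
    if (page >= t->npages)
        t->npages = page + 1;
    t->dirty = 1;
}

/*
 * Called when the main file is closed and its data is flushed.
 * Returns 1 if the index had to be written, 0 if it was already clean.
 */
int index_flush(struct index_table *t)
{
    if (!t->dirty)
        return 0;
    /* A real implementation would write the table to the index file here. */
    t->writes_issued++;
    t->dirty = 0;
    return 1;
}
```

With this structure, any number of page writes between two flushes costs a single index write, which is the saving the delayed write is meant to provide.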
To support writes in the middle of files correctly, we have to make an extra copy of data pages into a temporary location. The problem is that when we write a data page given to us by the VFS, we do not know in advance what this page will encode into or how much space the new encoding will require. If it requires more space, then we have to shift data outward in the encoded data file before writing the new data. For this first implementation, we chose the simplified approach of always making the temporary copy, which affects performance as seen in Section 7. While our code shows good performance, it has not been optimized much yet; we discuss avenues of future work in Section 9.
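The shifting step can be sketched on an in-memory buffer that stands in for the encoded data file. The new encoding is assumed to sit in a separate temporary buffer, so its length is known before any existing data is moved and the shift cannot overwrite it. The function name replace_encoded_chunk and its parameters are hypothetical and chosen for this example only.

```c
#include <stddef.h>
#include <string.h>

/*
 * Replace the encoded chunk at [off, off + old_len) in buf with the
 * new_len bytes held in tmp (a separate temporary copy, not inside buf).
 * buf holds *len valid bytes and has room for cap bytes in total.
 * Returns 0 on success, -1 on a bad range or insufficient capacity.
 */
int replace_encoded_chunk(unsigned char *buf, size_t *len, size_t cap,
                          size_t off, size_t old_len,
                          const unsigned char *tmp, size_t new_len)
{
    size_t tail_start, tail_len;

    if (off > *len || old_len > *len - off)
        return -1;                      /* chunk lies outside the file */
    if (new_len > old_len && new_len - old_len > cap - *len)
        return -1;                      /* no room to grow */

    tail_start = off + old_len;
    tail_len = *len - tail_start;

    /* Shift everything after the chunk outward (or inward if it shrank). */
    memmove(buf + off + new_len, buf + tail_start, tail_len);

    /* Only now copy the new encoding into the gap. */
    memcpy(buf + off, tmp, new_len);

    *len = *len - old_len + new_len;
    return 0;
}
```

For example, replacing the three-byte chunk "BBB" in "AAABBBCCC" with the five-byte encoding "XXXXX" first moves "CCC" two bytes outward and then yields "AAAXXXXXCCC". In a real file system the offsets of all later chunks would also have to be updated in the index.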