As noted in Section 2, most file systems have focused on repairing the file system after a system crash or other cause of unclean unmount. A few file systems, such as ZFS[2], have gone as far as checksumming and duplicating all file system metadata, which reduces how often fsck must be run but not how long it takes once it is needed. At least two file systems, ext3 and XFS, have plans underway to parallelize fsck further within the constraints of the existing on-disk format.
A method of reducing fsck time proposed in [12] involves tracking active block groups and using this information, plus a few other kinds of metadata, to speed up fsck. Some of the authors independently "rediscovered" and evaluated a similar solution, called per-block group dirty bits (see Section 7 in [7]). We quickly discovered that block groups are not a meaningful boundary for file system metadata, since any inode can refer to any block in the file system, and any directory entry to any inode. For example, operations that change link counts are not helped by per-block group information. The workaround used in [12], link tags, led to a significant performance hit without an NVRAM write cache. In Section 6 in [7], we outlined a similar approach, linked writes, which creates a list of dirty inodes and orders writes to them such that the file system can be recovered by scanning data pointed to by the dirty inodes. This is not a particularly elegant approach, and we did not pursue it.
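The link count problem can be illustrated with a minimal sketch. This is a toy model written for this discussion, not the implementation from [7] or [12]; the names BlockGroup, observed_links, and check_link_counts are hypothetical. It assumes each block group carries one dirty bit, a table of stored link counts for its inodes, and a list of directory entries that may point at inodes in any group.

```python
from dataclasses import dataclass, field


@dataclass
class BlockGroup:
    """Toy block group (hypothetical structure, for illustration only)."""
    dirty: bool = False
    # inode number -> link count stored on disk, for inodes in this group
    link_counts: dict = field(default_factory=dict)
    # directory entries stored in this group: (name, target inode number)
    dir_entries: list = field(default_factory=list)


def observed_links(scanned_groups, inode_no):
    """Count directory entries in scanned_groups that point at inode_no."""
    return sum(
        1
        for group in scanned_groups
        for _name, target in group.dir_entries
        if target == inode_no
    )


def check_link_counts(groups, scanned_groups):
    """Return inodes in dirty groups whose stored link count differs from
    the number of references found in scanned_groups."""
    mismatches = []
    for group in groups:
        if not group.dirty:
            continue
        for inode_no, stored in group.link_counts.items():
            if observed_links(scanned_groups, inode_no) != stored:
                mismatches.append(inode_no)
    return mismatches


# A consistent file system: inode 7 lives in the dirty group and has two
# links, one of which is a directory entry stored in the clean group.
clean = BlockGroup(dirty=False, dir_entries=[("a", 7)])
dirty = BlockGroup(dirty=True, link_counts={7: 2}, dir_entries=[("b", 7)])
groups = [clean, dirty]

# Scanning only dirty groups misses the entry in the clean group, so the
# correct link count is wrongly reported as a mismatch.
dirty_only = check_link_counts(groups, [g for g in groups if g.dirty])

# Getting the right answer requires scanning every group, which is the
# work the dirty bits were meant to avoid.
full_scan = check_link_counts(groups, groups)
```

In this model the dirty bit narrows down which inodes to check, but not which directory entries must be read to check them, so verifying link counts still requires a scan of the whole file system.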
Many distributed file systems have mechanisms for improving fault isolation between individual servers[9,15], mostly through replication of metadata and/or data, but we know of none that explicitly aims to improve file system repair time.
Valerie Henson 2006-10-18