rat: A Secure Archiving Program With Fast Retrieval
Willem A. (Vlakkies) Schreüder and Maria Murillo, University of Colorado at Boulder
Abstract
We have developed a new archive format called rat, designed to
allow very fast retrieval of individual files. Retrieval is fast
because a table of contents is used to locate each file in the archive.
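As an illustration of how a table of contents enables direct retrieval, the following Python sketch appends a table of member names, offsets, and lengths to the end of an archive, then uses it to seek straight to one member. The layout, the trailer, and the names write_archive, read_toc, and extract are assumptions made for this example; they are not the actual rat format or the librat interface. Members are stored uncompressed here to keep the sketch short.

```python
import io
import struct

# Hypothetical on-disk layout, assumed for illustration only:
#   [member data ...][TOC entries ...][trailer: TOC offset, entry count]
ENTRY = struct.Struct("<HQQ")   # name length, data offset, data length
TRAILER = struct.Struct("<QI")  # TOC offset, number of entries


def write_archive(files):
    """Write member data, then a table of contents, then a trailer."""
    out = io.BytesIO()
    toc = []
    for name, data in files.items():
        toc.append((name, out.tell(), len(data)))
        out.write(data)
    toc_offset = out.tell()
    for name, offset, length in toc:
        encoded = name.encode("utf-8")
        out.write(ENTRY.pack(len(encoded), offset, length))
        out.write(encoded)
    out.write(TRAILER.pack(toc_offset, len(toc)))
    return out.getvalue()


def read_toc(stream):
    """Read the trailer, then the table of contents it points to."""
    stream.seek(-TRAILER.size, io.SEEK_END)
    toc_offset, count = TRAILER.unpack(stream.read(TRAILER.size))
    stream.seek(toc_offset)
    toc = {}
    for _ in range(count):
        name_len, offset, length = ENTRY.unpack(stream.read(ENTRY.size))
        toc[stream.read(name_len).decode("utf-8")] = (offset, length)
    return toc


def extract(stream, name):
    """Seek directly to one member without reading the others."""
    offset, length = read_toc(stream)[name]
    stream.seek(offset)
    return stream.read(length)
```

In this sketch, retrieving one file costs one read of the table plus one seek, regardless of how many members precede it in the archive.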
Each file in the archive is individually compressed with a
compression method chosen for that file. A user-created configuration
file specifies what type of compression to use on each file
based on parameters such as the file extension and file size. Multiple
sets of rules can be defined and activated from the command line to
achieve different aims, such as speed or size, or to deal with different
types of file sets. Parameters passed to the compression algorithms
may also be specified.
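A minimal sketch of such rule-based selection follows, assuming a simple first-match policy. The rule-set names, the rule tuple layout, and the functions choose and compress are hypothetical and do not reflect the actual rat configuration syntax. Only zlib and bzip2 are shown because both ship with the Python standard library; LZO is omitted.

```python
import bz2
import zlib

# Hypothetical rule sets. Each rule is
#   (extension or None for any, minimum size in bytes, method, parameters)
# and the first rule that matches a file wins.
RULE_SETS = {
    "speed": [
        (".gz", 0, "none", {}),              # already compressed: store as is
        (None, 0, "zlib", {"level": 1}),     # everything else: fast zlib
    ],
    "size": [
        (".gz", 0, "none", {}),
        (None, 4096, "bzip2", {"compresslevel": 9}),  # larger files
        (None, 0, "zlib", {"level": 9}),              # small files
    ],
}


def choose(rule_set, filename, size):
    """Return (method, parameters) from the first matching rule."""
    for ext, min_size, method, params in RULE_SETS[rule_set]:
        if ext is not None and not filename.endswith(ext):
            continue
        if size < min_size:
            continue
        return method, params
    return "none", {}


def compress(rule_set, filename, data):
    """Compress one file with the method its rule selects."""
    method, params = choose(rule_set, filename, len(data))
    if method == "zlib":
        return method, zlib.compress(data, params.get("level", 6))
    if method == "bzip2":
        return method, bz2.compress(data, params.get("compresslevel", 9))
    return method, data
```

The chosen method would have to be recorded with each file so that the matching decompressor can be selected on restore; here it is simply returned alongside the compressed bytes.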
The format also provides for signatures to
be stored with the files. The program generates and saves a
signature when the archive is created and verifies each file against it
when the file is restored. The format allows for encryption, but
encryption is not currently implemented.
The format is quite robust. If the archive is truncated or the
table of contents is lost, the files in the portion that survived can
still be recovered. Every file and table is preceded by a magic number,
so recovery even from bit rot may be possible.
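The recovery idea can be sketched as a scan for magic numbers in whatever bytes survive. The magic value, the record header, and the names pack_record and recover below are assumptions made for this example and are not the real rat record layout.

```python
import struct

MAGIC = b"\x89RAT"             # hypothetical magic number
HEADER = struct.Struct("<HI")  # name length, payload length


def pack_record(name, payload):
    """Build one record: magic, header, name, payload."""
    encoded = name.encode("utf-8")
    return MAGIC + HEADER.pack(len(encoded), len(payload)) + encoded + payload


def recover(blob):
    """Scan for magic numbers and return every complete record found."""
    found = {}
    pos = blob.find(MAGIC)
    while pos != -1:
        header_start = pos + len(MAGIC)
        header_end = header_start + HEADER.size
        if header_end <= len(blob):
            name_len, payload_len = HEADER.unpack_from(blob, header_start)
            end = header_end + name_len + payload_len
            if end <= len(blob):
                name_bytes = blob[header_end:header_end + name_len]
                name = name_bytes.decode("utf-8", "replace")
                found[name] = blob[header_end + name_len:end]
                pos = blob.find(MAGIC, end)  # resume after this record
                continue
        # Incomplete or implausible record: keep scanning past this magic.
        pos = blob.find(MAGIC, pos + 1)
    return found
```

For example, if an archive of three records loses its last few bytes, recover returns the first two records and skips the incomplete third. A practical implementation would also need a checksum or similar check, since the magic bytes can occur by chance inside file data and a damaged header can report wrong lengths.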
The current implementation incorporates gzip (as zlib), bzip2, and
LZO compression. Library versions of these compression algorithms are
linked in for performance reasons. Only PGP signatures are currently
implemented. Because of export restrictions on encryption software,
signature creation and verification are performed by a separate
binary that is executed in a child process.
A library called librat implements all the functionality
required to create the archive and restore files from the archive.
Alternative user interfaces or embedded applications can therefore be
created readily. Three front ends to librat have been
implemented. The first front end is a simple command line interface
similar to tar. The second front end is a character-based
interface that allows the user to browse the archive and selectively
restore files, similar to the restore program used with
dump. The third front end is a GUI implemented using Qt.