To the best of our knowledge, this is the first work that attempts to propagate file updates by operation. However, some ideas and techniques used in this work have been studied in previous research.
Uses in databases. The idea of operation-based update propagation is not new to the database community [15], but we apply it in a new context: distributed file systems. First, we need to log and ship operations at a level higher than the file system itself, because low-level file-system operations are not appropriate for operation shipping. Therefore, we need cooperation between the applications (or meta-applications) and the file system. Second, the new context requires several new concepts: re-execution by a surrogate, adjustment of meta-data, validation of re-execution, and handling of non-repeating side effects. Finally, our file system can attempt operation shipping more boldly, because it can fall back to value shipping when operation shipping fails.
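The interplay between validation of re-execution and the value-shipping fall-back can be illustrated with a minimal sketch. It is not the implementation described in this work: the `Surrogate` class, the `OPERATIONS` table, the `propagate` function, and the use of a SHA-256 digest as the validation check are all assumptions made for illustration only.

```python
import hashlib
from typing import Callable, Dict


def fingerprint(data: bytes) -> str:
    """Digest used to compare results (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical high-level operations that both sides know how to run.
OPERATIONS: Dict[str, Callable[[bytes], bytes]] = {
    "upper": lambda data: data.upper(),
    "reverse": lambda data: data[::-1],
}


class Surrogate:
    """Hypothetical server-side surrogate that re-executes shipped operations."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = dict(files)

    def re_execute(self, name: str, op: str, expected_fp: str) -> bool:
        func = OPERATIONS.get(op)
        if func is None or name not in self.files:
            return False
        result = func(self.files[name])
        # Validation: accept the result only if it matches the client's result.
        if fingerprint(result) != expected_fp:
            return False  # discard the re-execution result
        self.files[name] = result
        return True

    def store_value(self, name: str, data: bytes) -> None:
        # Value shipping: the full new contents arrive from the client.
        self.files[name] = data


def propagate(surrogate: Surrogate, name: str, op: str, new_data: bytes) -> str:
    """Try operation shipping first; fall back to value shipping on failure."""
    if surrogate.re_execute(name, op, fingerprint(new_data)):
        return "operation"
    surrogate.store_value(name, new_data)
    return "value"
```

In this sketch the client sends only the operation name and a digest of its own result. The full file contents cross the network only when the surrogate's re-execution fails validation, for example because the surrogate started from different input.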
Directory operations. Logging and shipping of directory operations have been implemented in Coda prior to this work [20,19]. When a directory is updated on a Coda client (e.g., a new entry is inserted), instead of shipping the whole new directory to the server, the client ships only the update operation (e.g., the insertion operation). Directory operations are more like database operations, since they can be mapped directly to insertion, deletion, and modification of directory entries. In contrast, this work focuses on operation shipping of general user operations.
Repeatable re-execution. Several previous research projects have investigated the conditions for repeatable re-executions. Repeatable re-executions were desired for fault tolerance [1] or load balancing [2]. In the former case, a process P can be backed up by another process Pb. If P crashes, then Pb will repeat the execution of P since a recent checkpoint, and will thereafter assume the role of P. In the latter case, a process can migrate to another host to reduce the load imposed on the original host. In our work, repeatable re-executions are used to reproduce file modifications that are identical to those produced by the original executions.
Re-execution for transactional guarantee. A previous Coda project implemented a mechanism for re-execution of operations [9,10]. It addresses the update conflicts that may arise among optimistically controlled replicas. It proposes that a user can declare a portion of execution as an Isolation-Only Transaction (IOT). If an update conflict happens, Coda will re-execute the transaction. Our work is different in that we focus on performance improvement. Also, in our work, re-execution takes place on a different host, whereas re-execution of IOTs takes place on the same host. This implies that we must handle the case where re-execution does not produce the same results as the original execution.