Muralidharan Rangarajan and Florin Sultan
Rutgers University, Department of Computer Science
High-Performance Fault-Tolerant Distributed Shared-Memory for Linux-based PC clusters
The DSM runtime package for Linux-based clusters of PCs is available for download. Its key feature is that it turns a Linux-based cluster, in effect, into a shared-memory parallel programming environment that is, from the programmer's point of view, identical to that of a multiprocessor. A paper describing the implementation and the performance of the protocol will be presented at the 4th Annual Linux Showcase & Conference, Atlanta.
We have also developed a highly scalable fault-tolerance scheme based exclusively on independent checkpointing. A description of this work can be found at http://discolab.rutgers.edu/projects/dsm/