
Introduction

The basic mechanism for implementing software distributed shared memory systems (DSMs) was described for the first time in a seminal paper by Li and Hudak [15] and implemented in the Ivy system [14]. The method relies heavily on the operating system's virtual memory page protection mechanisms, enforcing a sharing granularity which is equal to the size of the virtual memory page (page-based DSMs). Page sizes, typically a few kilobytes, are usually much larger than the actual sharing granularity of the applications. Therefore, the main problem that researchers have faced in developing page-based DSMs has been false sharing, where two or more hosts use different variables that happen to reside in the same page. False sharing can cause a severe performance degradation of programs running on software DSMs and may even lead to slowdowns.
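
To make the cost of false sharing concrete, the following is a minimal sketch of a toy single-writer ownership protocol at page granularity. It is not the Ivy protocol: the 4096-byte page size, the variable addresses, the host names "A" and "B", and the `count_transfers` helper are all assumptions chosen for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def count_transfers(accesses, unit_size):
    """Count ownership transfers under a toy single-writer protocol.

    accesses:  list of (host, address) writes, in program order.
    unit_size: sharing granularity in bytes (the page size in a page-based DSM).
    """
    owner = {}      # sharing-unit index -> host that currently owns it
    transfers = 0
    for host, address in accesses:
        unit = address // unit_size
        if unit in owner and owner[unit] != host:
            transfers += 1          # the unit must migrate to the writing host
        owner[unit] = host
    return transfers


# Two hosts repeatedly update two *different* variables.
x_addr, y_addr = 0, 64              # both fall in page 0: false sharing
same_page = [("A", x_addr), ("B", y_addr)] * 100

y_far = PAGE_SIZE                   # y moved to page 1: no false sharing
separate_pages = [("A", x_addr), ("B", y_far)] * 100

print(count_transfers(same_page, PAGE_SIZE))       # 199
print(count_transfers(separate_pages, PAGE_SIZE))  # 0
```

Although the two hosts never touch the same variable, the page bounces between them on almost every write in the first case. In a real DSM each such transfer involves a page fault and a network round trip.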

There have been many attempts to overcome the false sharing problem. The problem has been studied extensively, and numerous works on relaxing the memory consistency have been written, including [1,3,5,7,9,12,22,26], to mention only a few. Relaxing the consistency enhances parallelism and may significantly reduce the communication required for memory synchronization. Memory synchronization is generally controlled by calls to a synchronization primitive or to a method associated with one. Even when not much work is involved, relaxed consistency models do require that the programmer modify the code and be aware of the semantics of the memory behavior. As a result, DSMs using relaxed consistency models trade the abstraction of the underlying memory system for added efficiency.

A different approach was proposed in the Blizzard and the Shasta systems [18,19,20]. In order to circumvent the false-sharing problem and provide fine-grain access, Shasta avoids using the virtual memory protection mechanism. Rather, it instruments binary code, wrapping loads and stores with instructions to check for the availability of the data and to maintain its consistency. The result is a fine-grained DSM, capable of sharing memory blocks of arbitrary size. However, high overhead is introduced by the wrapping instructions, which necessitates aggressive optimization techniques.
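
The idea of wrapping every load and store with an access check can be sketched as follows. This is a toy model, not Shasta's binary instrumentation: the `CheckedMemory` class, its three block states, and the `fetch` callback are hypothetical names introduced only to show where the per-access cost comes from.

```python
class CheckedMemory:
    """Toy software-checked shared memory with a per-block state table."""

    INVALID, READ_ONLY, READ_WRITE = range(3)

    def __init__(self, size, block_size, fetch):
        assert size % block_size == 0, "sketch assumes whole blocks only"
        self.data = bytearray(size)
        self.block_size = block_size
        self.state = [self.INVALID] * (size // block_size)
        self.fetch = fetch   # hypothetical: fetch(block, want_write) -> bytes
        self.checks = 0      # inline checks executed
        self.misses = 0      # checks that had to obtain the block

    def _check(self, addr, needed):
        self.checks += 1     # this cost is paid on every access, hit or miss
        block = addr // self.block_size
        if self.state[block] < needed:
            self.misses += 1
            start = block * self.block_size
            payload = self.fetch(block, needed == self.READ_WRITE)
            self.data[start:start + self.block_size] = payload
            self.state[block] = needed

    def load(self, addr):
        self._check(addr, self.READ_ONLY)
        return self.data[addr]

    def store(self, addr, value):
        self._check(addr, self.READ_WRITE)
        self.data[addr] = value


def fetch_zeros(block, want_write):
    # Stand-in for a remote host supplying one 16-byte block.
    return bytes(16)


mem = CheckedMemory(size=64, block_size=16, fetch=fetch_zeros)
mem.store(3, 7)
for _ in range(10):
    mem.load(3)
print(mem.checks, mem.misses)   # 11 1
```

Only one of the eleven accesses needs any remote action, yet all eleven execute the check. This is the overhead that the optimization techniques mentioned above aim to reduce.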

In this paper we propose a new method, called MULTIVIEW, which allows the efficient implementation of fine-grained DSM. Although MULTIVIEW does use the virtual memory protection mechanism, it is capable of manipulating the memory in variable-size blocks, called minipages, which are smaller than the virtual memory page size. MULTIVIEW involves little overhead; it usually requires no modifications by the programmer, nor does it require post-compilation or code instrumentation of any kind.

MULTIVIEW provides two notable advantages. First, false sharing can be avoided simply by associating variables with individual minipages. The system then manages sharing of program variables rather than full pages. Second, MULTIVIEW enables a DSM implementation that completely avoids buffer copying in the DSM layer. For this reason MULTIVIEW is well suited for integration with high-performance messaging layers like Active Messages [24], FastMessages [16], and the VIA interface.
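
The first advantage can be illustrated with a small sketch in which each variable is given its own minipage and ownership is tracked per minipage instead of per page. This models only the bookkeeping, not the protection mechanism itself. The allocator, the variable sizes, the host names, and the toy single-writer ownership count are assumptions made for illustration, and the sketch ignores minipages that would cross a page boundary.

```python
from bisect import bisect_right

PAGE_SIZE = 4096  # assumed page size in bytes


def allocate_minipages(sizes):
    """Pack variables back to back, one minipage per variable.

    Returns a list of (start_address, length) pairs.
    """
    minipages, addr = [], 0
    for size in sizes:
        minipages.append((addr, size))
        addr += size
    return minipages


def minipage_of(minipages, address):
    """Return the index of the minipage that contains address."""
    starts = [start for start, _ in minipages]
    i = bisect_right(starts, address) - 1
    start, length = minipages[i]
    assert start <= address < start + length
    return i


def transfers_by_unit(accesses, unit_of):
    """Count ownership transfers when sharing units are given by unit_of."""
    owner, transfers = {}, 0
    for host, address in accesses:
        unit = unit_of(address)
        if unit in owner and owner[unit] != host:
            transfers += 1
        owner[unit] = host
    return transfers


minipages = allocate_minipages([64, 64, 256])   # three variables, one page
x_addr, y_addr = minipages[0][0], minipages[1][0]
accesses = [("A", x_addr), ("B", y_addr)] * 100

by_page = transfers_by_unit(accesses, lambda a: a // PAGE_SIZE)
by_minipage = transfers_by_unit(accesses, lambda a: minipage_of(minipages, a))
print(by_page, by_minipage)   # 199 0
```

With the page as the sharing unit, the two hosts contend on every write. With one minipage per variable, the same access pattern causes no transfers at all.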

We have implemented a system named MILLIPAGE, a high-performance fine-granularity page-based DSM. MILLIPAGE uses MULTIVIEW both for achieving fine granularity and for enhancing performance. Despite the fact that it uses the virtual memory page protection mechanism (and thus can be viewed as "page-based"), MILLIPAGE supports sharing of memory at any granularity. Furthermore, sharing in small granularity imposes only a negligible overhead in MILLIPAGE.

It has recently been noted, e.g., in [23], that the latest advances in communication speed make the complexity of the underlying DSM protocols a non-negligible factor in the overall system performance. A notable aspect of MILLIPAGE is its efficient support of Sequential Consistency with a very simple and "clean" protocol, which leads us to the notion of a thin-layer DSM. The key element in thin-layer DSMs is the simplicity of handling a request for shared data. There is no need for page twinning, which consumes memory; for diff operations, which occupy the CPU; for code instrumentation, which inflates the instruction count; or for sophisticated protocols, which complicate the system. As a result, thin-layer DSMs are simple to develop and debug, easy to use, and impose little overhead on the local operating system and the communication network beyond that which is required by the applications.

It was recently shown that reducing the granularity in systems which implement strict consistency may achieve performance comparable to that of systems implementing relaxed consistency memory models [19,27]. In accordance with these findings, Sequential Consistency was employed in MILLIPAGE: initial performance evaluation shows results comparable or superior to those obtained in systems which employ relaxed consistency models.

MILLIPAGE is fully operational in the Distributed Systems Laboratory at the Technion - Israel Institute of Technology. MILLIPAGE uses the Illinois FastMessages [16] on a cluster of 8 Compaq 300 MHz Pentium II machines, interconnected by a Myrinet switch, and running Microsoft Windows NT.

The rest of this paper is organized as follows. The following section describes the MULTIVIEW technique and how to generate and control minipages. In Section 3 we describe the design of MILLIPAGE; we discuss important issues that affected the DSM architecture, issues which arise from applying the MULTIVIEW technique and the integration of fast messaging libraries. Initial performance evaluation of MILLIPAGE is provided in Section 4. Finally, we discuss future work and open research topics.


Ayal Itzkovitz and Assaf Schuster, The Technion