The goal of our transformation mechanism is to reduce the time and space overheads when applied to single-writer, multiple-reader algorithms. We now evaluate how much improvement we can achieve with the proposed transformation. Specifically, we will compare a total of 9 different IPC mechanisms, including Chen's, Improved Chen's, Double Buffer, and Improved Double Buffer algorithms.
We also consider another wait-free, single-writer, multiple-reader IPC mechanism, Peterson's algorithm, as well as our transformed version of it. In Peterson's algorithm [24], the reader determines if its read is corrupted, and may have to perform the read up to 3 times. The writer may also have to write a message up to (P + 2) times, where P is the number of readers. The mechanism has been revised [34] such that readers read a message at most 2 times, and the writer writes a message at most (P + 1) times to avoid corruption. We only consider the revised version here. We derive the Improved Peterson's algorithm by applying our general transformation as described in Section 3.5.
For the purpose of comparison, we also evaluate the NBW protocol and the EMERALDS variant of this. As discussed earlier, NBW is the most efficient algorithm in terms execution time, but may induce high space overheads. The EMERALDS IPC mechanism tries to limit memory use at some cost to performance. Finally, we also include a very efficient implementation of synchronization-based IPC, using a lock algorithm that relies on the atomic Test-And-Set instruction, to show the tradeoffs between synchronization-based and synchronization-free mechanisms.
To make fair and comprehensive comparisons between these algorithms, we have considered various parameters trying to answer the following questions.
We evaluate the algorithms for memory usage and execution time overheads, in both average and worst cases, and for both uniprocessor and symmetric multiprocessor (SMP) environments. The only exception is for the EMERALDS IPC mechanism, which is evaluated only for uniprocessors. Because it assumes that operations are atomic if interrupts are disabled, it will not work correctly with SMP architectures where this assumption does not hold.