The past two years have seen much discussion of RCU, along with the design and coding of a number of implementations and uses of RCU, one of which is now part of the Linux 2.5 kernel [Sarma02].
Comparisons with other concurrent update mechanisms [McK01b,Linder02a] have shown that RCU can greatly simplify code that accesses read-mostly linked-list data structures and can improve its performance. This paper adds a performance evaluation of RCU applied to the Linux System V IPC primitives. RCU can also improve the performance of code that modifies linked-list structures when there is a high system-wide aggregate update rate across all such structures [McK98a].
Comparison of multiple RCU implementations [McK02a] showed, as noted in the abstract, that there is no overall best algorithm. The rcu-poll algorithm had the shortest latency, while the rcu-ltimer algorithm had the lowest overhead. This paper presents a parallelized variant of rcu-poll in an attempt to gain the best of both worlds.
Section 2 provides background on RCU, Section 3 reviews attempts to produce a single best RCU implementation, Section 4 describes an RCU-based implementation of Linux's System V IPC primitives, and Section 5 describes future plans.