The semaphore data structures are shown in Figure 8. The global ipc_ids structure tracks the state of all semaphores currently in use. Among other things, it contains a global lock ary and a pointer entries that points to an array of pointers of ipc_id structures. Each such entry is either NULL or points to a sem_array structure, which represents a set of semaphores that has been created by a single semget() system call. The array of ipc_id structures is dynamically expanded as required; see the discussion of the grow_ary() function in Section 4.5. The sem_array structure is allocated by a semget() system call and deleted by a semctl(IPC_RMID) system call. The individual semaphores in a set are each represented by a sem structure.
Each semop() system call presents the semid for the semaphore, which must be looked up in this data structure to locate the corresponding sem_array. Thus, each and every semaphore operation requires that this data structure be traversed.
The ipc_ids field ary is a spinlock_t that protects the entire data structure. This simple locking design prevents System-V semaphore operations from proceeding in parallel. In addition, the cacheline containing the ary spinlock is thrashed among all CPUs.
Use of RCU permits fully parallel operation of different semaphores and fully parallel translation of a semaphore ID into the corresponding sem_array pointer. However, a few changes to the data structure are required, as shown in Figure 9. To begin with, the ipc_id array and the sem_array are each prefixed with an ipc_rcu_kmalloc structure which contains the rcu_head structure that RCU's call_rcu() function needs to track these structures during a grace period. In addition, since there is no longer a global ary lock, each individual sem_array must have its own individual lock to protect operations on the corresponding set of semaphores.
The final change is motivated by fact that the translation from semaphore ID to kern_ipc_perm cannot tolerate the stale data that could result when an ID translation races with an semctl(IPC_RMID) removing that same ID. The possibility of stale data is avoided through use of a deleted flag in the kern_ipc_perm structure, guarded by that structure's lock field. This deleted flag is set just after removing the corresponding sem_array but before starting the grace period. The entire removal operation is performed holding the lock field in the kern_ipc_perm structure. Any attempt to lock a semaphore structure that has the deleted flag set then behaves as if the structure is nonexistent, as will be shown in the following sections.