Next: 7 Conclusions Up: Speeding Up Kernel Scheduler Previous: 5.2.2 Chat


6 Related work

Several schemes for speeding up the Linux scheduler have been proposed.

Kravetz et al. proposed the multi-queue scheduler to enhance the scalability of the Linux 2.4.x kernel on large-scale SMP machines [8]. The multi-queue scheduler splits the global run queue into a number of per-CPU queues, so that each CPU maintains its own run queue. This scheduler aims to reduce run queue lock contention among processors, whereas our coloring scheme aims to reduce cache conflicts during run queue traversal. The two methods serve different purposes and are orthogonal to each other; our coloring scheme can also be merged with the multi-queue scheduler to accelerate its performance further.

Molloy et al. proposed the ELSC scheduler [11]. This scheduler pre-calculates base priorities and keeps the run queue sorted for efficient task selection. The base priority is calculated from variables, such as counter and priority, that do not change while a runnable task is not running on a processor. The scheduler maintains a table sorted by base priority, and each entry of the table holds a doubly-linked list of processes with the same priority. The Priority Level Scheduler (PLS) [8] has been proposed as a similar scheme. Both ELSC and PLS make scheduling decisions faster, but neither eliminates run queue lock contention; therefore, they do not scale further as the number of CPUs increases.

In [12], Sears pointed out the severe cache misses that occur in the current scheduler and provided a solution based on a prefetch technique. This improvement has already been merged into the latest kernel (after version 2.4.10). However, for the prefetch technique to perform efficiently, the memory access latency and the execution of goodness() must overlap almost perfectly. The potential problem with this method is that this balance depends on the memory system configuration.

Development of the Linux kernel proceeds very rapidly on the Linux kernel mailing list, and we have already contributed our scheme as a patch. Concurrently, Spraul has contributed a patch with another implementation of task structure coloring. In his implementation, the kernel allocates task structures through the Slab Allocator [4,13]. His implementation neither fixes the number of colors nor depends on the cache configuration; in this respect, it is much more general than ours. On the other hand, our approach needs only a small modification to the get_current() function, so the coloring effect could be verified quickly. Moreover, our implementation lets us control the number of colors, so we could analyze the effects of coloring (the reduction of bus traffic, L2 cache misses, and lock contention) as a function of the number of colors.


Shuji YAMAMURA 2002-04-16