Next: Reducing the Cost of
Up: Reducing the Frequency of
Previous: Reducing the OS TLB
The core of Linux memory management is based on the x86 two-level page
tables. We could change the organization of the PTEs in these tables to match
the requirements of the PPC architecture (a hash table instead of a
two-level page table), but we were committed
to using these page tables as the initial source of PTE's due to the design
of Linux. Note that any PPC OS must
have a data structure to serve this function, because the PTEs that do not fit in either
the primary or overflow bucket must be stored somewhere.
It is possible, but impractical, to resize the hash table when both buckets
overflow.
Various techniques for handling overflow are
discussed in [8] and [12]. A reasonable PPC OS must
minimize the number of overflows in the hash table so the cost of handling
overflows was not a serious concern for us. Instead, we focused on
reducing the contention in the hash table to increase the efficiency of the
hash table which reduces the number of overflows.
Our original strategy for both the 603 and 604 processors was to use
the hash table as a second level TLB cache and, thus, it became important
to reduce hash table ``hot spots'' and overflow.
The obvious strategy is to derive VSIDs from the process
identifier so that each process has a distinct virtual address space. Keep
in mind that the hardware will treat each set of VSIDs as a separate
address space.
Multiplying the process id by a small non-power-of-two constant
proved to be necessary in order to scatter PTEs within the hash table.
Note that the logical address spaces of processes tend to be similar
so the hash
functions rely on the VSIDs to provide variation. We tuned the VSID
generation algorithm by making Linux keep a hash table miss histogram and
adjusting the constant until hot-spots disappeared. We began with
37% use of the hash table and were able to bring that up to an average of
57% with the hash table containing both user and kernel PTE's. After
removing the kernel PTE's from the hash table we were eventually able to
achieve 75% use of the hash table with find tuning of the constant.
Next: Reducing the Cost of
Up: Reducing the Frequency of
Previous: Reducing the OS TLB
Cort Dougan
1999-01-04