Re: Variable-sized pages and TLB
- From: "Paul A. Clayton" <paaronclayton@xxxxxxxxxxxxx>
- Date: Mon, 27 Aug 2007 08:24:32 -0700
On Aug 24, 8:57 pm, "Piotr Wyderski" <wyder...@xxxxxxxxxxxxxxxxxxxxx
ii.uni.wroc.pl> wrote:
> How do the TLB buffers work on the architectures that
> support multiple page sizes? On IA-32 it is pretty simple,
> as there are only two (3 in fact, but it doesn't matter, because
> it is not possible to use 4- and 2-MiB pages simultaneously)
> supported page sizes: 4KiB and 4MiB, and they are handled
> by two separate TLB arrays. But there are architectures with
> many more supported page sizes. In the case of fully associative
> TLBs it is easy (I guess...), because each entry can hold
> appropriate upper and lower virtual address boundaries and
> compare that range with a given address. However, since
> true full associativity is rather expensive (especially when you
> have 256+ entries), some tricks like the direct mapping technique
> are being widely used. But direct mapping requires fixed-size
> pages. So where is the trick? :-)
You might enjoy reading Andre Seznec's "Concurrent Support of
Multiple Page Sizes On a Skewed Associative TLB"
(www.irisa.fr/caps/people/seznec/SKEWEDTLBperso.pdf) for one
possible way (AFAIK never implemented). I think most
implementations supporting a wide range of page sizes have
used full associativity.
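
To make the fully associative case concrete, here is a minimal sketch in Python. It assumes each entry stores its own page shift and compares only the address bits above that shift, which is one common way to get the range compare described in the question; the class and method names are hypothetical, and the loop stands in for what hardware would do in parallel.

```python
class FullyAssociativeTLB:
    """Toy fully associative TLB; every entry carries its own page size."""

    def __init__(self):
        # Each entry: (virtual_page_number, page_shift, frame_base)
        self.entries = []

    def insert(self, vaddr, page_shift, frame_base):
        # The frame base must be aligned to the page size.
        assert frame_base & ((1 << page_shift) - 1) == 0
        self.entries.append((vaddr >> page_shift, page_shift, frame_base))

    def translate(self, vaddr):
        # Hardware compares all entries at once; this loop models that.
        for vpn, shift, frame_base in self.entries:
            if vaddr >> shift == vpn:
                return frame_base | (vaddr & ((1 << shift) - 1))
        return None  # TLB miss


tlb = FullyAssociativeTLB()
tlb.insert(0x00400000, 22, 0x10000000)  # one 4 MiB page
tlb.insert(0x00001000, 12, 0x00080000)  # one 4 KiB page
```

Because the page size is part of the entry rather than part of the index, any mix of sizes can coexist, at the cost of one comparator per entry.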
Supporting a different page size for different threads is, of
course, relatively easy at a small latency cost (the shifter
and address bits ORing logic only need to be programmed at
context switches; presumably one would want the lower bits for
a larger page to be zero to allow for a simple ORing).
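
A rough sketch of the per-thread case, assuming a direct-mapped TLB with no address-space tags (so it is simply flushed at a context switch); all names are hypothetical. The page shift is the only thing reprogrammed, and the physical address is formed by ORing the offset into a frame base whose low bits are zero.

```python
class PerContextTLB:
    """Toy direct-mapped TLB whose page size is fixed per context."""

    def __init__(self, num_sets=64):
        self.num_sets = num_sets        # must be a power of two
        self.sets = [None] * num_sets   # each slot: (tag, frame_base) or None
        self.page_shift = 12            # reprogrammed at context switch

    def context_switch(self, page_shift):
        # Assumption: no ASIDs, so old entries are flushed with the old size.
        self.page_shift = page_shift
        self.sets = [None] * self.num_sets

    def _split(self, vaddr):
        # The shifter selects which address bits form the index and the tag.
        vpn = vaddr >> self.page_shift
        index_bits = self.num_sets.bit_length() - 1
        return vpn & (self.num_sets - 1), vpn >> index_bits

    def insert(self, vaddr, frame_base):
        # Low bits of the frame base are zero, so a plain OR adds the offset.
        assert frame_base & ((1 << self.page_shift) - 1) == 0
        index, tag = self._split(vaddr)
        self.sets[index] = (tag, frame_base)

    def translate(self, vaddr):
        index, tag = self._split(vaddr)
        slot = self.sets[index]
        if slot is None or slot[0] != tag:
            return None  # TLB miss
        return slot[1] | (vaddr & ((1 << self.page_shift) - 1))
```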
For a small number of page sizes in an L2 TLB, sequential
hashing would be an option (or L1 TLB in something like 64-bit
PowerPC where an Effective to Real Address Translation table
is usually implemented using a single base page size [TLB
lookup requires a previous Segment Lookaside Buffer lookup in
64-bit PowerPC]), perhaps using a predictor to order the
lookups.
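
Sequential hashing with a predictor might look roughly like the following sketch. The three page sizes, the direct-mapped organization, and the "size of the last hit" predictor are all assumptions chosen for brevity, not a description of any real design.

```python
PAGE_SHIFTS = (12, 16, 24)  # assumed sizes: 4 KiB, 64 KiB, 16 MiB
NUM_SETS = 128              # power of two


class SequentialHashTLB:
    """Toy direct-mapped TLB probed once per candidate page size."""

    def __init__(self):
        self.sets = [None] * NUM_SETS    # each slot: (vpn, shift, frame_base)
        self.predicted = PAGE_SHIFTS[0]  # trivial predictor: size of last hit

    def insert(self, vaddr, shift, frame_base):
        vpn = vaddr >> shift
        self.sets[vpn % NUM_SETS] = (vpn, shift, frame_base)

    def translate(self, vaddr):
        """Return (physical address or None, number of probes used)."""
        # Probe the predicted size first, then the remaining sizes in order.
        order = [self.predicted] + [s for s in PAGE_SHIFTS if s != self.predicted]
        for probes, shift in enumerate(order, start=1):
            vpn = vaddr >> shift
            entry = self.sets[vpn % NUM_SETS]
            if entry is not None and entry[0] == vpn and entry[1] == shift:
                self.predicted = shift
                return entry[2] | (vaddr & ((1 << shift) - 1)), probes
        return None, len(order)  # miss after trying every size
```

A correct prediction costs a single probe; a misprediction adds one probe of latency per wrong guess, which is why this seems more tolerable in an L2 TLB than in an L1.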
One could also combine multiple TLBs with sequential hashing
(or multiple non-conflicting bank [skewed] hashes). One
could presumably also provide multiple row decoders and
multiple read ports, though that might be somewhat expensive
(I suspect full associativity would be more attractive at a
somewhat small number of page sizes), and that could be
combined with the other mechanisms for multiple hashings.
(x86's case is even simpler because the entries for the
2MiB pages can alternately be used to hold Page Directory
Entries [AFAIK no x86 implementation does this, but it COULD
be done], so one need not fear wasting large page entries if
only 4KiB pages are used [a cached PDE allows a single memory
lookup to find the translation for a 4KiB page]. If MIPS
architected linear page tables, a similar optimization would
be possible [a TLB that held huge page translations could also
hold addresses in the lowest part of the address space {where
the page table is mapped}].)
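
The PDE idea can be sketched as below. This is a hypothetical model, not any real x86 implementation: it assumes PAE-style paging (2 MiB regions, 512 eight-byte PTEs per page table), ignores PTE flag bits, and models physical memory as a dictionary.

```python
LARGE_SHIFT, SMALL_SHIFT = 21, 12  # 2 MiB and 4 KiB pages (PAE-style)
PTE_SIZE = 8                       # bytes per page table entry


class LargeEntryTLB:
    """Toy TLB whose 2 MiB entries hold either a large page or a cached PDE."""

    def __init__(self, memory):
        self.memory = memory   # hypothetical physical memory: addr -> PTE value
        self.entries = {}      # (vaddr >> 21) -> (kind, payload)
        self.memory_reads = 0

    def insert_large(self, vaddr, frame_base):
        self.entries[vaddr >> LARGE_SHIFT] = ("large", frame_base)

    def insert_pde(self, vaddr, page_table_base):
        self.entries[vaddr >> LARGE_SHIFT] = ("pde", page_table_base)

    def translate(self, vaddr):
        entry = self.entries.get(vaddr >> LARGE_SHIFT)
        if entry is None:
            return None  # full page walk needed (not modeled here)
        kind, payload = entry
        if kind == "large":
            return payload | (vaddr & ((1 << LARGE_SHIFT) - 1))
        # Cached PDE: a single memory read fetches the PTE for the 4 KiB page.
        pte_index = (vaddr >> SMALL_SHIFT) & 0x1FF
        self.memory_reads += 1
        frame_base = self.memory[payload + pte_index * PTE_SIZE]
        return frame_base | (vaddr & ((1 << SMALL_SHIFT) - 1))
```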
Paul A. Clayton
just a technophile