On Tue, Sep 06, 2005 at 08:31:26AM +0800, Dong, Eddie wrote:
> Do u have any measurement data for the locality in LVHPT Linux
> code? You are the right person know this :-)
I don't have any good measurement data, however in Linux:
(1) RIDs of processes which communicate tend to be close together, as:
- the communicating processes may have been started together, or
- one process was created from the other process with fork(), which
will assign two new (probably sequential) RIDs, or
- one process was created from a server process using fork() in
response to a client request. Now all three will likely have
close together RIDs.
(2) VPNs tend to be close together, and clustered at the bottom of
regions. Text, data, and libraries are all allocated sequentially
from the bottom of regions 2, 3 and 1 respectively. Communicating
processes may often have similar address space layout (either
because they are the same binary, or use similar libraries).
Thus, given that most of the entropy is in the bottom bits for both RID
and VPN, RID xor VPN is a very bad hash function. We need to give the
bottom RID bits higher significance.
I now realise that the current Xen mangling function, which moves the
bottom bits into bits 16..23, actually doesn't achieve this unless the
VHPT is very large. Bit 16 of the mangled RID produces bit 21 of the
thash address, i.e. sequential RIDs are spaced 2MB apart in the VHPT. RIDs
spaced 8 apart, as consecutive Linux processes are, are spaced 16MB
apart in the VHPT. I think the size of the VHPT in Xen is 16MB, so
actually Linux processes with consecutive RIDs collide, and it is almost
as pathological as not having mangling.
I think the ideal mangling function would be to reverse the bottom n
bits of the RID, where n is the number of bits used in the hash (and
depends on the VHPT size). Thus consecutive RIDs would result in
accessing diametrically opposite portions of the VHPT, while consecutive
VPNs achieve cache locality within the halves. Further apart processes
would be progressively more likely to collide, but are also less likely
to be communicating.
Unfortunately, Itanium doesn't provide bit-reversal instructions, which
is why in the Linux long VHPT work I decided to just do byte-reversal on
the bottom n/8 bytes, to approximate this bit-reversal. For typical
VHPT sizes n/8 is around 2.
Obviously one could write functions that better approximate bit-
reversal, in the extreme case using a lookup table for each byte or
nibble, though I'm not sure whether that's worthwhile.
I think it would be worth changing the Xen mangling so that it
swaps bytes 1 and 2 instead of bytes 1 and 3, and seeing whether
that yields an improvement.
Matt
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel