[Xen-devel] RE: Xen/ia64 - global or per VP VHPT

To:	"Magenheimer, Dan \(HP Labs Fort Collins\)" <dan.magenheimer@xxxxxx>, "Yang, Fred" <fred.yang@xxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>
Subject:	[Xen-devel] RE: Xen/ia64 - global or per VP VHPT
From:	"Munoz, Alberto J" <alberto.j.munoz@xxxxxxxxx>
Date:	Fri, 29 Apr 2005 09:09:55 -0700
Cc:	Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, ipf-xen <ipf-xen@xxxxxxxxx>, xen-ia64-devel@xxxxxxxxxxxxxxxxxxx

Hi Dan,

Magenheimer, Dan (HP Labs Fort Collins) <mailto:dan.magenheimer@xxxxxx>
wrote on Friday, April 29, 2005 8:29 AM:

> (Sorry for the cross-posting... I wanted to ensure that
> the xen-ia64-devel list members are able to chime in.
> With the high traffic on xen-devel, I'm sure many -- like
> myself -- fall behind on reading xen-devel!)
> 
>> With a single global VHPT and the same forced page size
>> limitation, all the Domains must be paravirtualized
>> to a hard-defined page size; this definitely limits the
>> capability of Xen/ia64.  Will this also imply that only
>> certain versions of the Domains can run on the same platform?
>> What will be the scalability issue with a single VHPT table?
>> Imagine multi-VP/multi-LP, with all the LPs walking the same
>> table: you would need a global purge or an IPI to all
>> processors to purge a single entry.  Costly!
> 
> No, multiple page sizes are supported, though there does have
> to be a system-wide minimum page size (e.g. if this were defined
> as 16KB, a 4KB-page mapping request from a guestOS would be rejected).
> Larger page sizes are represented by multiple entries in the
> global VHPT.

In my opinion this is a moot point, because in order to provide the
appropriate semantics for physical mode emulation (PSR.dt, or PSR.it, or
PSR.rt == 0) it is necessary to support a 4K page size as the minimum
(unless you special-case translations for physical mode emulation). Also, in
terms of machine memory utilization, it is better to have smaller pages (I
know this functionality is not yet available in Xen, but I believe it will
become important once people are done working on the basics).
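
To put rough numbers on the minimum page size question, here is a minimal
sketch (Python, purely illustrative; the helper name and the sizes are my
own assumptions, not anything taken from the Xen/ia64 code) of how many
minimum-size entries one larger mapping would occupy if larger pages are
represented by multiple entries in the table:

```python
KB = 1024
MB = 1024 * KB

def entries_for_mapping(page_size, min_page_size):
    """Number of minimum-size VHPT entries needed to cover one guest mapping.

    Hypothetical helper: assumes both sizes are powers of two and that a
    mapping smaller than the system-wide minimum is simply rejected.
    """
    for size in (page_size, min_page_size):
        if size <= 0 or size & (size - 1):
            raise ValueError("page sizes must be powers of two")
    if page_size < min_page_size:
        raise ValueError("mapping smaller than the system-wide minimum")
    return page_size // min_page_size

print(entries_for_mapping(64 * KB, 16 * KB))   # 4
print(entries_for_mapping(16 * MB, 4 * KB))    # 4096
```

So a 4K minimum keeps physical mode emulation simple, but under this scheme
every large mapping costs proportionally more entries.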

> 
> Purging is definitely expensive but there may be ways to
> minimize that.  That's where the research comes in.

It is not just purging. Having a global VHPT is, in general, really bad for
scalability. Every time the hypervisor wants to modify anything in the VHPT,
it must guarantee that no other processors are accessing that VHPT (this is
a fairly complex thing to do in TLB miss handlers). If you make this
synchronization mechanism dependent on the number of domains (and
processors/cores/threads) in the system, rather than on the degree of SMP of
a domain, as it would be with a per-domain VHPT, you will severely limit
scalability.
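
A toy model of that dependence, as a sketch only (the function and the
configuration are hypothetical, and it counts nothing more than how many
processors may be walking the same table when it is updated):

```python
def sync_participants(vcpus_per_domain, logical_processors, per_domain_vhpt):
    """Upper bound, per domain, on processors that may be walking the same VHPT.

    Toy model (my assumption): with a global VHPT any logical processor may
    be walking the table; with a per-domain VHPT only processors running
    that domain's virtual processors can be.
    """
    if per_domain_vhpt:
        return [min(vcpus, logical_processors) for vcpus in vcpus_per_domain]
    return [logical_processors for _ in vcpus_per_domain]

domains = [2] * 16          # 16 two-way guests (hypothetical configuration)
print(sync_participants(domains, 32, per_domain_vhpt=False)[0])  # 32
print(sync_participants(domains, 32, per_domain_vhpt=True)[0])   # 2
```

With the global table the bound grows with the machine; with per-domain
tables it stays at the SMP degree of each guest.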

Also consider the caching effects (assuming you have hash chains, which I
think you would need in order to avoid forward progress issues). Every time
a processor walks a hash chain, it must bring all those PTEs into its cache.
Every time you set an access or dirty bit, you must get the line private.

If you are only considering 2-way (maybe even 4-way) machines this is not a
big deal, but if we are talking about larger machines (IPF's bread and
butter), these problems are really serious.

Another important thing is hashing into the VHPT. If you have a single VHPT
for multiple guests (and those guests are the same, e.g., the same version of
Linux), then you are depending 100% on having a good RID allocator (per
domain); otherwise the translations for different domains will start
colliding in your hash chains, reducing the efficiency of your VHPT.
The point here is that guest OSs (that care about this type of stuff) are
designed to spread RIDs such that they minimize their own hash chain
collisions, but they are not designed to avoid colliding with other guests'
translations. Also, the fact that the hash algorithm is implementation
specific makes this problem even worse.
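
Here is a small sketch of the effect. Everything in it is an assumption made
for illustration: the hash is a stand-in (the real one is implementation
specific), the two RID allocators are hypothetical, and the "good" allocator
only works because it was tuned to this particular toy hash, which is
exactly the dependency I am worried about.

```python
from collections import Counter

NUM_BUCKETS = 1 << 10   # toy VHPT with 1024 hash buckets
PAGE_SHIFT = 14         # 16KB pages (assumption)

def toy_hash(rid, vaddr):
    """Stand-in hash: only the low 4 RID bits influence the bucket."""
    vpn = vaddr >> PAGE_SHIFT
    return (vpn ^ (rid << 6)) % NUM_BUCKETS

def high_bits_allocator(domain_id, guest_rid):
    """Poor choice for this hash: the domain id lands in bits the hash ignores."""
    return (domain_id << 16) | guest_rid

def interleaving_allocator(domain_id, guest_rid):
    """Better for this hash (two domains only): the domain id lands in a low bit."""
    return (guest_rid << 1) | domain_id

def translations(domain_id, allocator, guest_rids=range(8), pages=64):
    """(machine RID, vaddr) pairs for a guest mapping the same pages in each region."""
    return [(allocator(domain_id, rid), page << PAGE_SHIFT)
            for rid in guest_rids for page in range(pages)]

def colliding_buckets(entries):
    """Number of buckets holding more than one translation."""
    counts = Counter(toy_hash(rid, vaddr) for rid, vaddr in entries)
    return sum(1 for n in counts.values() if n > 1)

# Two identical guests (same guest RIDs, same virtual pages) in one VHPT.
poor = translations(0, high_bits_allocator) + translations(1, high_bits_allocator)
better = translations(0, interleaving_allocator) + translations(1, interleaving_allocator)

print(colliding_buckets(poor))     # 512
print(colliding_buckets(better))   # 0
```

Each guest on its own hashes without collisions here; it is only the
combination in one table that makes the allocator matter.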


> I expect the answer to be that global VHPT will have advantages
> for some workloads and the per-domain VHPT will have advantages
> for other workloads.  (The classic answer to any performance
> question... "it depends..." :-)

As you point out, this is ALWAYS the case, but what matters is what your
target workloads and target systems are: how many domains per system you
expect to support, how many processors/cores/threads you expect per
system, etc.

> 
>>> I agree that purging is a problem, however Linux does not
>>> currently change page sizes frequently.
>> Again, I hope we are not limiting only one OS version to run
>> on this Xen/ia64.  I believe Linux also needs to purge
>> entries other than page size changes!
>> Through per-VP VHPT and the VT-i feature of ia64, we can extend
>> Xen/ia64's capability to enable multiple unmodified OSes to run on
>> Xen/ia64 without knowing what page size the Domain is using.
> 
> Per-domain VHPT will have its disadvantages too, namely a large
> chunk of memory per domain that is not owned by the domain.
> Perhaps this is not as much of a problem on VT which will be
> limited to 16 domains, but I hope to support more non-VT domains
> (at least 64, maybe more).

Memory footprint is really not that big a deal for these large machines, but
in any case, the size of the VHPT is typically proportional to the size of
physical memory (some people suggest 4 PTEs per physical page frame and some
people suggest 2, but in any case, there is a linear relationship between
the two). If you follow this guideline, then each of the individual VHPTs for
5 equally sized guests should be 1/5 of the size of the combined VHPT for all
5 guests.
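
As a worked example of that rule of thumb (a sketch under stated
assumptions: 16KB pages, 32-byte entries as in the long VHPT format, 4 PTEs
per frame, five hypothetical 4GB guests, and no rounding of the table size
up to a power of two):

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def vhpt_bytes(memory_bytes, ptes_per_frame=4, page_size=16 * KB, entry_bytes=32):
    """VHPT size under the 'N PTEs per physical page frame' rule of thumb.

    All defaults are assumptions for illustration; power-of-two rounding
    of the table size is ignored.
    """
    frames = memory_bytes // page_size
    return frames * ptes_per_frame * entry_bytes

guests = [4 * GB] * 5                      # five equally sized guests
per_domain = [vhpt_bytes(mem) for mem in guests]
combined = vhpt_bytes(sum(guests))         # one global table for all of them

print(per_domain[0] // MB)   # 32  (MB per guest)
print(combined // MB)        # 160 (MB for the global table)
```

The total is the same either way; the per-domain tables just split it up.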

> 
> Is the per-domain VHPT the same size as whatever the domain allocates
> for its own VHPT (essentially a shadow)?  Aren't there purge
> performance problems with this too?

Purges are always an issue for SMP machines, but you do not want the problem
to scale with the number of domains and the number of
processors/cores/threads in the system.

> 
> Dan

Bert
