Re: [Xen-devel] SMP Guest Proposal RFC

To: Ian Pratt <m+Ian.Pratt@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] SMP Guest Proposal RFC
From: Ryan Harper <ryanh@xxxxxxxxxx>
Date: Fri, 1 Apr 2005 19:46:05 -0600
Cc: Bryan Rosenburg <rosnbrg@xxxxxxxxxx>, Ryan Harper <ryanh@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, Orran Krieger <okrieg@xxxxxxxxxx>
In-reply-to: <A95E2296287EAD4EB592B5DEEFCE0E9D1E39B0@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
References: <A95E2296287EAD4EB592B5DEEFCE0E9D1E39B0@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.6+20040907i
* Ian Pratt <m+Ian.Pratt@xxxxxxxxxxxx> [2005-04-01 18:55]:
>  
> > Attached is a proposal authored by Bryan Rosenburg, Orran 
> > Krieger and Ryan Harper.  Comments, questions, and criticism 
> > requested.
> 
> Ryan,
> 
> Much of what you're proposing closely matches our own plans: it's always
> better for a domain to have only the minimum number of VCPUs active that
> it needs to meet its CPU load, and gang scheduling is clearly preferred
> where possible.

That sounds good.

> However, I'm convinced that pre-emption notifications are not the way to
> go: kernels typically have no way to back out of holding a lock early,
> so giving them an active call-back is not very useful.

With an interrupt-based notification method, the kernel informs the
hypervisor when it is safe to preempt.  That is, the interrupt is
serviced only when no locks are held, which is ideal for avoiding
preemption of a lock-holder.  If the kernel does not yield in time, we
are no worse off than unnotified preemption with respect to preempting
lock-holders.  The notification also lets the kernel prepare for
preemption, for example by migrating applications to other CPUs that
are not being preempted.
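
To make that concrete, the guest side could look something like the
sketch below.  The handler signature matches the 2.6-era Linux irq API;
migrate_tasks_off_this_vcpu() is a hypothetical helper, and the yield
is Xen's existing SCHEDOP_yield hypercall:

    /*
     * Sketch only: handler for a preemption-notification interrupt.
     * Because it is delivered as an ordinary interrupt, it is
     * serviced only while interrupts are enabled, i.e. outside
     * spin_lock_irq()-style critical sections, so no such lock is
     * held when we yield here.
     */
    static irqreturn_t preempt_notify_handler(int irq, void *dev_id,
                                              struct pt_regs *regs)
    {
        /* Hypothetical: push runnable work to VCPUs not being preempted. */
        migrate_tasks_off_this_vcpu();

        /* Tell the hypervisor it is now safe to take this VCPU away. */
        HYPERVISOR_sched_op(SCHEDOP_yield);

        return IRQ_HANDLED;
    }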

> I think it's better to have a counter that the VCPU increments whenever
> it grabs a lock and decrements when it releases a lock. When the
> pre-emption timer goes off, the hypervisor can check the counter. If it's
> non-zero, the hypervisor can choose to hold off the preemption for e.g.
> 50us. It can also set a bit in another word indicating that a
> pre-emption is pending. Whenever the '#locks held' counter is
> decremented to zero, the pre-emption pending bit can be checked, and the
> VCPU should immediately yield if it is set.

One of our concerns was the accounting overhead incurred on each
spinlock acquisition and release.  Linux acquires and releases spinlocks
at an incredible rate.  Rather than affect the fast path of the spinlock
code, our proposal only pays this cost when a preemption is actually
needed.
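
To illustrate where that cost lands, the counter scheme adds roughly
the following to every acquire and release (a sketch only: the
this_vcpu() accessor and the per-VCPU locks_held and preempt_pending
fields are illustrative, not an existing interface):

    static inline void guest_spin_lock(spinlock_t *lock)
    {
        this_vcpu()->locks_held++;      /* extra write on every acquire */
        _raw_spin_lock(lock);
    }

    static inline void guest_spin_unlock(spinlock_t *lock)
    {
        _raw_spin_unlock(lock);
        /* preempt_pending would be set by the hypervisor when it
         * held off a preemption because locks_held was non-zero. */
        if (--this_vcpu()->locks_held == 0 &&
            this_vcpu()->preempt_pending)
                HYPERVISOR_sched_op(SCHEDOP_yield);
    }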

> An alternative/complementary scheme would be to have each lock able to
> store the number of the VCPU that's holding it. If a VCPU finds that a
> lock is already taken, it can look in the shared info page to see if the
> VCPU that's holding the lock is actually running. If it's not, it can
> issue a hypervisor_yield_to_VCPU X hypercall and avoid further spinning,
> passing its time slice to the VCPU holding the lock.

The directed yield is complementary to any of the schemes discussed
here, as it helps when lock-holder preemption actually occurs.  This is
the method currently employed by the IBM production hypervisor; you can
see the Linux/Power implementation in arch/ppc64/lib/locks.h
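
In outline, that directed yield behaves like the sketch below, loosely
modelled on the Linux/Power code referenced above (the lock-word
encoding, vcpu_is_running(), and the hypervisor_yield_to_vcpu()
hypercall are illustrative names, not the exact interface):

    static void spin_yield(spinlock_t *lock)
    {
        unsigned int holder;

        if (lock->lock == 0)
            return;                         /* released while we spun */
        holder = lock->lock & 0xffff;       /* lock word names the holder VCPU */
        if (vcpu_is_running(holder))        /* e.g. read from the shared info page */
            return;                         /* holder has a real CPU; keep spinning */
        hypervisor_yield_to_vcpu(holder);   /* confer our time slice to the holder */
    }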

Thanks for the comments.  I look forward to further discussion.

Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@xxxxxxxxxx
