
Re: [Xen-devel] [Patch 2 of 2]: PV-domain SMP performance Linux-part



Keir Fraser wrote:
On 16/01/2009 09:36, "Juergen Gross" <juergen.gross@xxxxxxxxxxxxxxxxxxx>
wrote:

The approach taken in Linux is not merely 'yield on spinlock', by the way; it
is 'block on event channel on spinlock', essentially turning a contended
spinlock into a sleeping mutex. I think that is quite different behaviour
from merely yielding and expecting the scheduler to do something sensible
with your yield request.
Could you explain this in a little more detail, please?

Jeremy Fitzhardinge did the implementation for Linux, so I'm cc'ing him in
case he remembers more details than me.

Basically each CPU allocates itself an IPI event channel at start of day.
When a CPU attempts to acquire a spinlock it spins a short while (perhaps a
few microseconds?) and then adds itself to a bitmap stored in the lock
structure (I think, or it might be a linked list of sleepers?). It then
calls SCHEDOP_poll listing its IPI evtchn as its wakeup requirement. When
the lock holder releases the lock it checks for sleepers and if it sees one
then it pings one of them (or is it all of them?) on its event channel, thus
waking it to take the lock.

Yes, that's more or less right. Each lock has a count of how many cpus are waiting for it; if that count is non-zero at unlock time, the unlocker kicks all the waiting cpus via IPI. There's a per-cpu variable recording "the lock I am waiting for"; the kicker looks at each cpu's entry and kicks it if it's waiting for the lock being unlocked.
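
To make that concrete, here is a rough sketch in C of the data structures and the unlock-side kick. All the identifiers (pv_spinlock, spinners, waiting_on, send_lock_kick_ipi, NR_CPUS) are illustrative stand-ins rather than the actual kernel symbols, and the real code uses the kernel's per-cpu and atomic primitives rather than C11 atomics:

/*
 * Sketch of the structures described above; identifiers are made up.
 */
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS 64                      /* placeholder */

struct pv_spinlock {
    atomic_uchar lock;                  /* 0 = free, 1 = held (byte lock)      */
    atomic_uint  spinners;              /* cpus currently blocked on this lock */
};

/* Per-cpu "which lock am I waiting for" marker. */
static struct pv_spinlock *waiting_on[NR_CPUS];

/* Stub: ping the given cpu's lock-kick event channel. */
extern void send_lock_kick_ipi(int cpu);

static void pv_spin_unlock(struct pv_spinlock *lock)
{
    /* Drop the byte lock first... */
    atomic_store_explicit(&lock->lock, 0, memory_order_release);

    /* ...then, if anyone registered as a waiter, kick every cpu whose
     * per-cpu marker says it is blocked on this particular lock. */
    if (atomic_load_explicit(&lock->spinners, memory_order_acquire) != 0) {
        for (int cpu = 0; cpu < NR_CPUS; cpu++)
            if (waiting_on[cpu] == lock)
                send_lock_kick_ipi(cpu);
    }
}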

The locking side does the expected "spin for a while, then block once the spin times out". The timeout is settable if you have the appropriate debugfs option enabled (which also produces quite a lot of detailed stats about locking behaviour). The IPI is never actually delivered as an event, BTW; the locker uses the event poll hypercall to block until the event is pending (this hypercall had some performance problems until relatively recent versions of Xen; I'm not sure which release versions have the fix).
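
Continuing the same sketch, the acquire side looks roughly like this. SPIN_THRESHOLD stands in for the tunable spin timeout (shown here as an iteration count), and block_on_lock_evtchn() stands in for the SCHEDOP_poll hypercall naming the cpu's own event channel:

/*
 * Continuation of the sketch above: "spin for a while, then block".
 * SPIN_THRESHOLD and block_on_lock_evtchn() are placeholders.
 */
#define SPIN_THRESHOLD 1024             /* illustrative; the real value is tunable */

/* Stub: block via SCHEDOP_poll until this cpu's lock-kick event
 * channel becomes pending (no event is ever actually delivered). */
extern void block_on_lock_evtchn(int cpu);

static int try_take(struct pv_spinlock *lock)
{
    /* Byte test-and-set: returns 1 if we took the lock. */
    return atomic_exchange_explicit(&lock->lock, 1,
                                    memory_order_acquire) == 0;
}

static void pv_spin_lock(struct pv_spinlock *lock, int cpu)
{
    for (;;) {
        /* Fast path: spin on the byte for a bounded number of tries
         * (the real code would cpu_relax() between attempts). */
        for (int i = 0; i < SPIN_THRESHOLD; i++)
            if (try_take(lock))
                return;

        /* Spun too long: register as a waiter on this lock. */
        waiting_on[cpu] = lock;
        atomic_fetch_add_explicit(&lock->spinners, 1, memory_order_acq_rel);

        /* Re-check once so an unlock that raced with the registration
         * isn't missed; otherwise block until the unlocker kicks us. */
        int acquired = try_take(lock);
        if (!acquired)
            block_on_lock_evtchn(cpu);

        atomic_fetch_sub_explicit(&lock->spinners, 1, memory_order_acq_rel);
        waiting_on[cpu] = NULL;

        if (acquired)
            return;
        /* We were kicked (or the poll returned); go contend again. */
    }
}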

The lock itself is a simple byte spinlock with no fairness guarantees; I'm assuming (hoping) that the pathological cases that ticket locks were introduced to solve will be mitigated by the timeout/blocking path (and/or are less likely to arise in a virtual environment anyway).
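
For comparison, a ticket lock is what's being skipped here: it hands out tickets and serves waiters in FIFO order, which is precisely the fairness the plain byte lock does not provide. A minimal sketch, for contrast only (not part of this patch):

/* For contrast only: a minimal ticket lock, which serves waiters in
 * FIFO order.  The pv byte lock above deliberately does without this,
 * relying on the timeout/blocking path to avoid starvation in practice. */
struct ticket_lock {
    atomic_ushort next;     /* next ticket to hand out       */
    atomic_ushort owner;    /* ticket currently being served */
};

static void ticket_lock_acquire(struct ticket_lock *lk)
{
    unsigned short me = atomic_fetch_add_explicit(&lk->next, 1,
                                                  memory_order_relaxed);
    while (atomic_load_explicit(&lk->owner, memory_order_acquire) != me)
        ;                   /* spin until our ticket comes up */
}

static void ticket_lock_release(struct ticket_lock *lk)
{
    atomic_fetch_add_explicit(&lk->owner, 1, memory_order_release);
}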

I measured a small performance improvement within the domain with this patch (kernbench-type workload), but an overall 10% reduction in system-wide CPU use with multiple competing domains.

   J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

