Re: [Xen-devel] [PATCH] scrub pages on guest termination
On 23/5/08 18:01, "Ben Guthro" <bguthro@xxxxxxxxxxxxxxx> wrote:
Yes, sorry - should have removed our terminology from the description.
Node=physical machine
VS=HVM guest w/ pv-on-hvm drivers
Looking back at the original bug report, it seems to indicate it was migrating from a system with 2 processors to one with 8.
It’s very surprising that lock contention would cause such a severe lack of progress on an 8-CPU system. If the lock is that hotly contended, then even its use in free_domheap_pages() has to be questionable.
I’m inclined to say that if we want to address this, we should do it in one or more of the following ways (a rough sketch combining the first three follows the list):
1. Count CPUs into the scrub function with an atomic_t; beyond a limit, all other CPUs bail straight out after re-setting their timer.
2. Increase the scrub batch size to reduce the proportion of time that each loop iteration holds the lock.
3. Turn the spin_lock() into a spin_trylock() so that the timeout check can be guaranteed to execute frequently.
4. Eliminate the global lock by building a lock-free linked list, or by maintaining per-CPU hashed work queues with work stealing, or... etc.
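For concreteness, here is a rough, untested sketch of how options 1-3 might combine in common/page_alloc.c. scrub_inflight, MAX_SCRUBBING_CPUS and SCRUB_BATCH are made-up names for illustration only; page_scrub_lock, page_scrub_list, scrub_one_page() and a per-CPU page_scrub_timer are assumed to be the existing ones:

static atomic_t scrub_inflight = ATOMIC_INIT(0);
#define MAX_SCRUBBING_CPUS 2  /* illustrative cap (option 1) */
#define SCRUB_BATCH        64 /* illustrative batch size (option 2) */

static void page_scrub_softirq(void)
{
    struct list_head batch;
    struct page_info *pg;
    unsigned int i;

    INIT_LIST_HEAD(&batch);

    /* Option 1: soft cap on how many CPUs scrub concurrently. */
    atomic_inc(&scrub_inflight);
    if ( atomic_read(&scrub_inflight) > MAX_SCRUBBING_CPUS )
        goto out;

    /* Option 3: never spin on the list lock; retry via the timer. */
    if ( !spin_trylock(&page_scrub_lock) )
        goto out;

    /* Option 2: dequeue a whole batch per lock acquisition. */
    for ( i = 0; (i < SCRUB_BATCH) && !list_empty(&page_scrub_list); i++ )
        list_move_tail(page_scrub_list.next, &batch);
    spin_unlock(&page_scrub_lock);

    /* Scrub with the lock dropped. */
    while ( !list_empty(&batch) )
    {
        pg = list_entry(batch.next, struct page_info, list);
        list_del(&pg->list);
        scrub_one_page(pg);
        /* ...then return pg to the free heap, as the existing code does. */
    }

 out:
    atomic_dec(&scrub_inflight);
    set_timer(&this_cpu(page_scrub_timer), NOW() + MILLISECS(10));
}

Note the soft cap is racy by design: an exact count of scrubbers is not needed, only backpressure.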
The patch as-is at least suffers from the issue that the ‘primary scrubber’ should be regularly checking for softirq work, and is not. But I’m not sure such a sizeable change to the scheduling policy for scrubbing (such as it is!) is necessary or desirable.
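On the softirq point, the missing check is small; a hedged fragment of what the primary scrubber’s loop should contain:

while ( !list_empty(&page_scrub_list) )
{
    /* Yield if other softirq work is pending; the scrub timer will
     * re-enter us shortly, so no work is lost. */
    if ( softirq_pending(smp_processor_id()) )
        break;

    /* ...dequeue and scrub a batch, as in the sketch above... */
}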
Option 4 is on the morally highest ground but is of course the most work. :-)
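To give option 4 some flavour, the push side of a lock-free list is a Treiber-style stack built on cmpxchg(). Everything below is hypothetical, including the scrub_next link field; the pop side is the hard part (ABA hazards, which a single-consumer design would sidestep):

/* Hypothetical global head of a lock-free scrub stack. */
static struct page_info *scrub_stack;

static void scrub_stack_push(struct page_info *pg)
{
    struct page_info *old;

    do {
        old = scrub_stack;
        pg->scrub_next = old;  /* hypothetical link field in page_info */
    } while ( cmpxchg(&scrub_stack, old, pg) != old );
}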
-- Keir