
Re: [Xen-devel] Xen 4.3 development update

On 04/25/2013 04:46 PM, Tim Deegan wrote:
At 16:20 +0100 on 25 Apr (1366906804), George Dunlap wrote:
On Thu, Apr 4, 2013 at 4:23 PM, Tim Deegan <tim@xxxxxxx> wrote:
At 11:34 -0400 on 03 Apr (1364988853), Andres Lagar-Cavilla wrote:
On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@xxxxxxxxxxxxx> wrote:

On 03/04/13 08:27, Jan Beulich wrote:
On 02.04.13 at 18:34, Tim Deegan <tim@xxxxxxx> wrote:
This is a separate problem.  IIRC the AMD XP perf issue is caused by the
emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
patches.  XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
_lot_ of vmexits for IRQL reads and writes.
Ah, okay, sorry for mixing this up. But how is this a regression

My sense, when I looked at this a while back, was that there was much more
to this.  The XP IRQL updating is a problem, but it's made terribly worse by
the changeset in question.  It seemed to me like the kind of thing that would
be caused by TLB or caches suddenly becoming much less effective.

The commit in question does not add p2m mutations, so it doesn't nuke the 
NPT/EPT TLBs. It introduces a spin lock in the hot path and that is the 
problem. Later in the 4.2 cycle we changed the common case to use an rwlock. 
Does the same perf degradation occur with tip of 4.2?

Yes, 4.2 is definitely slower.  A compile test on a 4-vcpu VM that takes
about 12 minutes before this locking change takes more than 20 minutes
on the current tip of xen-unstable (I gave up at 22 minutes and rebooted
to test something else).


Can you go into a bit more detail about what you compiled, and on what kind of OS?

I was compiling on Win XP sp3, 32-bit, 1vcpu, 4G ram.  The compile was
the Windows DDK sample code.

As I think I mentioned later, all my measurements are extremely suspect
as I was relying on guest wallclock time, and the 'before' case was
before the XP wallclock time was fixed. :(

The VM was a Debian Wheezy VM, stock kernel (3.2), PVHVM mode, 1G of
RAM, 4 vcpus, LVM-backed 8G disk.

I suspect the TPR access patterns of XP are not seen on Linux; it's been
known for long enough now that it's super-slow on emulated platforms and
AFAIK it was only ever Windows that used the TPR so aggressively anyway.

Right. IIRC w2k3 sp2 has the "lazy tpr" feature, so if I can get consistent results with that one then we can say... well, we can at least say it's not easy to reproduce. :-)

