
Re: [Xen-devel] Xen 4.3 development update / winxp AMD performance regression

On 28/04/13 11:18, Peter Maloney wrote:
On 04/25/2013 04:24 PM, Andres Lagar-Cavilla wrote:
On Apr 25, 2013, at 10:00 AM, George Dunlap <george.dunlap@xxxxxxxxxxxxx> wrote:

On 04/25/2013 02:51 PM, Pasi Kärkkäinen wrote:
On Wed, Apr 03, 2013 at 11:34:13AM -0400, Andres Lagar-Cavilla wrote:
On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@xxxxxxxxxxxxx> wrote:

On 03/04/13 08:27, Jan Beulich wrote:
On 02.04.13 at 18:34, Tim Deegan <tim@xxxxxxx> wrote:
At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
On 02.04.13 at 16:07, George Dunlap <George.Dunlap@xxxxxxxxxxxxx> wrote:
* AMD NPT performance regression after c/s 24770:7f79475d3de7
   owner: ?
   Reference: http://marc.info/?l=xen-devel&m=135075376805215
This is supposedly fixed with the RTC changes Tim committed the
other day. Suravee, is that correct?
This is a separate problem.  IIRC the AMD XP perf issue is caused by the
emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
patches.  XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
_lot_ of vmexits for IRQL reads and writes.
Ah, okay, sorry for mixing this up. But how is this a regression
My sense, when I looked at this a while back, was that there was much more to this.
The XP IRQL updating is a problem, but it's made terribly worse by the
changeset in question.  It seemed to me like the kind of thing that would be
caused by the TLB or caches suddenly becoming much less effective.
The commit in question does not add p2m mutations, so it doesn't nuke the 
NPT/EPT TLBs. It introduces a spin lock in the hot path and that is the 
problem. Later in the 4.2 cycle we changed the common case to use an rwlock. 
Does the same perf degradation occur with tip of 4.2?
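
For readers unfamiliar with the distinction drawn above, here is a minimal Python sketch of why a reader-writer lock helps a lookup-heavy path where an exclusive lock does not. It is an illustration only: the `RWLock` class and its method names are hypothetical, it is not Xen's p2m locking code, and it says nothing about whether the rwlock change actually removes the regression discussed in this thread.

```python
import threading


class RWLock:
    """Toy reader-writer lock: many readers OR one writer (illustrative only)."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the two fields below
        self._readers = 0
        self._writer = False

    def try_read(self):
        # Readers are refused only while a writer holds the lock.
        with self._guard:
            if self._writer:
                return False
            self._readers += 1
            return True

    def read_unlock(self):
        with self._guard:
            self._readers -= 1

    def try_write(self):
        # A writer needs the lock to be completely idle.
        with self._guard:
            if self._writer or self._readers:
                return False
            self._writer = True
            return True

    def write_unlock(self):
        with self._guard:
            self._writer = False


# An exclusive lock admits one holder at a time, even for pure lookups.
exclusive = threading.Lock()
first_lookup = exclusive.acquire(blocking=False)   # succeeds
second_lookup = exclusive.acquire(blocking=False)  # refused: already held
exclusive.release()

# A reader-writer lock lets lookups overlap and only excludes mutations.
rw = RWLock()
lookups_overlap = rw.try_read() and rw.try_read()  # both readers get in
mutation_blocked = not rw.try_write()              # readers still active
rw.read_unlock()
rw.read_unlock()
mutation_allowed = rw.try_write()                  # lock is idle again
rw.write_unlock()
```

With the exclusive lock the second lookup is refused while the first is in progress, which is the serialisation a spin lock imposes on every vcpu in the hot path. With the reader-writer lock both lookups proceed and only a mutation has to wait.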

Adding Peter to CC who reported the original winxp performance 
problem/regression on AMD.

Peter: Can you try Xen 4.2.2 please and report if it has the performance 
problem or not?
Do you want to compare 4.2.2 to 4.2.1, or 4.3?

The changeset in question was included in the initial release of 4.2, so unless 
you think it's been fixed since, I would expect 4.2 to have this regression.
I believe you will see this 4.2 onwards. 4.2 includes the rwlock optimization. 
Nothing has been added to the tree in that regard recently.

Bad news... It is very slow still. With 7 vcpus, it took very long to
get to the login screen, then I hit the login button at 10:30:30 and at
10:32:10 I can watch my icons starting to appear one by one very slowly.
When the icons are all there, I see a blue bar instead of the taskbar.
10:32:47 the taskbar looks normal finally, but systray is still empty. I
clicked the start menu at 10:33:40 (still empty systray). At 10:33:54,
the start menu opened. At 10:34:20, the first systray icon appeared. At
10:36 I managed to get Task manager loaded, and it shows 88-95% CPU
usage in 7 cpus, but doesn't show any processes using much. (xming using
16, System using 11, taskmgr.exe using 9, CCC.exe using 5, explorer and
services using 4%, etc.) xm top shows the domain at 646.9% CPU.

What guest OS is this again? Windows XP? Do you see the same behavior with other Windows OSes? (e.g., Win7, Win8, w2k3sp2, w2k8?)

If you're really keen, you could do a quick xentrace for me after the VM has mostly booted:
1. Run "xentrace -D -e all -S 32 -T 30 /tmp/[name].trace" on your Xen host
2. Clone and build the following hg repo: http://xenbits.xen.org/ext/xenalyze
3. Run "xenalyze --svm-mode -s [name].trace > [name].summary" and send me the results
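
The steps above can be collected into a small Python helper that only assembles the command lines, so they can be reviewed before anything is run on the host. This is a sketch under assumptions: the function name `trace_commands` and the example name `winxp` are made up here, the flags are copied unchanged from the steps above, the build step is left out because the message does not say how to build xenalyze, and the full `/tmp` path is passed to xenalyze on the assumption that it is run from a different directory.

```python
import shlex


def trace_commands(name, trace_dir="/tmp"):
    """Return the capture, fetch and analyse commands plus the summary path."""
    trace = f"{trace_dir}/{name}.trace"
    summary = f"{trace_dir}/{name}.summary"
    # Step 1: capture a trace on the Xen host (flags as given in the message).
    capture = ["xentrace", "-D", "-e", "all", "-S", "32", "-T", "30", trace]
    # Step 2: fetch the xenalyze sources (building them is not covered here).
    fetch = ["hg", "clone", "http://xenbits.xen.org/ext/xenalyze"]
    # Step 3: summarise the trace; stdout is meant to be redirected to `summary`.
    analyze = ["xenalyze", "--svm-mode", "-s", trace]
    return capture, fetch, analyze, summary


capture, fetch, analyze, summary = trace_commands("winxp")
print(shlex.join(capture))
print(shlex.join(fetch))
print(f"{shlex.join(analyze)} > {shlex.quote(summary)}")
```

Keeping the commands as argument lists means they could later be handed to `subprocess.run` without going through a shell, with the summary file opened explicitly for the redirected output.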

