[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Xen 4.3 development update / winxp AMD performance regression



On 04/25/2013 04:24 PM, Andres Lagar-Cavilla wrote:
> On Apr 25, 2013, at 10:00 AM, George Dunlap <george.dunlap@xxxxxxxxxxxxx> 
> wrote:
>
>> On 04/25/2013 02:51 PM, Pasi Kärkkäinen wrote:
>>> On Wed, Apr 03, 2013 at 11:34:13AM -0400, Andres Lagar-Cavilla wrote:
>>>> On Apr 3, 2013, at 6:53 AM, George Dunlap <george.dunlap@xxxxxxxxxxxxx> 
>>>> wrote:
>>>>
>>>>> On 03/04/13 08:27, Jan Beulich wrote:
>>>>>>>>> On 02.04.13 at 18:34, Tim Deegan <tim@xxxxxxx> wrote:
>>>>>>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote:
>>>>>>>>>>> On 02.04.13 at 16:07, George Dunlap <George.Dunlap@xxxxxxxxxxxxx> 
>>>>>>>>>>> wrote:
>>>>>>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7
>>>>>>>>>   owner: ?
>>>>>>>>>   Reference: http://marc.info/?l=xen-devel&m=135075376805215
>>>>>>>> This is supposedly fixed with the RTC changes Tim committed the
>>>>>>>> other day. Suravee, is that correct?
>>>>>>> This is a separate problem.  IIRC the AMD XP perf issue is caused by the
>>>>>>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking
>>>>>>> patches.  XP doesn't have 'lazy IRQL' or support for CR8, so it takes a
>>>>>>> _lot_ of vmexits for IRQL reads and writes.
>>>>>> Ah, okay, sorry for mixing this up. But how is this a regression
>>>>>> then?
>>>>> My sense, when I looked at this back whenever that there was much more to 
>>>>> this.  The XP IRQL updating is a problem, but it's made terribly worse by 
>>>>> the changset in question.  It seemed to me like the kind of thing that 
>>>>> would be caused by TLB or caches suddenly becoming much less effective.
>>>> The commit in question does not add p2m mutations, so it doesn't nuke the 
>>>> NPT/EPT TLBs. It introduces a spin lock in the hot path and that is the 
>>>> problem. Later in the 4.2 cycle we changed the common case to use an 
>>>> rwlock. Does the same perf degradation occur with tip of 4.2?
>>>>
>>> Adding Peter to CC who reported the original winxp performance 
>>> problem/regression on AMD.
>>>
>>> Peter: Can you try Xen 4.2.2 please and report if it has the performance 
>>> problem or not?
>> Do you want to compare 4.2.2 to 4.2.1, or 4.3?
>>
>> The changeset in question was included in the initial release of 4.2, so 
>> unless you think it's been fixed since, I would expect 4.2 to have this 
>> regression.
> I believe you will see this 4.2 onwards. 4.2 includes the rwlock 
> optimization. Nothing has been added to the tree in that regard recently.
>
> Andres
Bad news... It is very slow still. With 7 vcpus, it took very long to
get to the login screen, then I hit the login button at 10:30:30 and at
10.32:10 I can watch my icons starting to appear one by one very slowly.
When the icons are all there, I see a blue bar instead of the taskbar.
10:32:47 the taskbar looks normal finally, but systray is still empty. I
clicked the start menu at 10:33:40 (still empty systray). At 10:33:54,
the start menu opened. At 10:34:20, the first systray icon appeared. at
10:36 I managed to get Task manager loaded, and it shows 88-95% CPU
usage in 7 cpus, but doesn't show any processes using much. (xming using
16, System using 11, taskmgr.exe using 9, CCC.exe using 5, explorer and
services using 4%, etc.) xm top shows the domain at 646.9% CPU.


xentop - 10:39:37   Xen 4.2.2
2 domains: 2 running, 0 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
Mem: 16757960k total, 12768800k used, 3989160k free    CPUs: 8 @ 4499MHz
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k)
MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR 
VBD_RSECT  VBD_WSECT SSID
  Domain-0 -----r       1184   25.4    8344320   49.8    8388608     
50.1     8    0        0        0    0        0        0       
0          0          0    0
windowsxp2 -----r       3853  587.3    4197220   25.0    4198400     
25.1     7    1      392       20    1        0    17657     4661    
806510      58398    0

(about 8% of the dom0 stuff is the qemu-dm process, and the rest is
unrelated things)

And this was expected, since I already tested 4.2.1, and you said that
this fix should be in 4.2 onwards, so I would have already tested it in
4.2.1.


Here's xm vcpu-list
Name                                ID  VCPU   CPU State   Time(s) CPU
Affinity
Domain-0                             0     0     2   -b-     461.7 any cpu
Domain-0                             0     1     4   -b-     340.1 any cpu
Domain-0                             0     2     5   -b-     182.8 any cpu
Domain-0                             0     3     3   -b-      84.9 any cpu
Domain-0                             0     4     2   -b-      67.5 any cpu
Domain-0                             0     5     2   r--      62.5 any cpu
Domain-0                             0     6     3   -b-      44.3 any cpu
Domain-0                             0     7     2   -b-      46.5 any cpu
windowsxp2                           3     0     5   r--     755.4 any cpu
windowsxp2                           3     1     1   r--     688.1 any cpu
windowsxp2                           3     2     3   r--     702.6 any cpu
windowsxp2                           3     3     7   r--     723.4 any cpu
windowsxp2                           3     4     6   r--     724.7 any cpu
windowsxp2                           3     5     0   r--     725.0 any cpu
windowsxp2                           3     6     4   r--     821.3 any cpu


Here's dmesg just to see the version:

 __  __            _  _    ____    ____ 
 \ \/ /___ _ __   | || |  |___ \  |___ \
  \  // _ \ '_ \  | || |_   __) |   __) |
  /  \  __/ | | | |__   _| / __/ _ / __/
 /_/\_\___|_| |_|    |_|(_)_____(_)_____|
                                        
(XEN) Xen version 4.2.2 (root@site) (gcc (SUSE Linux) 4.7.1 20120723
[gcc-4_7-branch revision 189773]) Sun Apr 28 00:16:04 CEST 2013
(XEN) Latest ChangeSet: unavailable
(XEN) Bootloader: GRUB2 2.00
(XEN) Command line: dom0_mem=8192M,max:8192M iommu=1 loglvl=all
guest_loglvl=all


Here's the dmesg after domu boots:



(XEN) HVM2: Press F12 for boot menu.
(XEN) HVM2:
(XEN) HVM2: Booting from Hard Disk...
(XEN) HVM2: Booting from 0000:7c00
(XEN) HVM3: HVM Loader
(XEN) HVM3: Detected Xen v4.2.2
(XEN) HVM3: Xenbus rings @0xfeffc000, event channel 9
(XEN) HVM3: System requested ROMBIOS
(XEN) HVM3: CPU speed is 4500 MHz
(XEN) irq.c:270: Dom3 PCI link 0 changed 0 -> 5
(XEN) HVM3: PCI-ISA link 0 routed to IRQ5
(XEN) irq.c:270: Dom3 PCI link 1 changed 0 -> 10
(XEN) HVM3: PCI-ISA link 1 routed to IRQ10
(XEN) irq.c:270: Dom3 PCI link 2 changed 0 -> 11
(XEN) HVM3: PCI-ISA link 2 routed to IRQ11
(XEN) irq.c:270: Dom3 PCI link 3 changed 0 -> 5
(XEN) HVM3: PCI-ISA link 3 routed to IRQ5
(XEN) HVM3: pci dev 01:2 INTD->IRQ5
(XEN) HVM3: pci dev 01:3 INTA->IRQ10
(XEN) HVM3: pci dev 03:0 INTA->IRQ5
(XEN) HVM3: pci dev 04:0 INTA->IRQ5
(XEN) HVM3: pci dev 05:0 INTA->IRQ10
(XEN) HVM3: pci dev 02:0 bar 10 size 02000000: f0000008
(XEN) HVM3: pci dev 03:0 bar 14 size 01000000: f2000008
(XEN) HVM3: pci dev 02:0 bar 14 size 00001000: f3000000
(XEN) HVM3: pci dev 03:0 bar 10 size 00000100: 0000c001
(XEN) HVM3: pci dev 04:0 bar 10 size 00000100: 0000c101
(XEN) HVM3: pci dev 04:0 bar 14 size 00000100: f3001000
(XEN) HVM3: pci dev 05:0 bar 10 size 00000100: 0000c201
(XEN) HVM3: pci dev 01:2 bar 20 size 00000020: 0000c301
(XEN) HVM3: pci dev 01:1 bar 20 size 00000010: 0000c321
(XEN) HVM3: Multiprocessor initialisation:
(XEN) HVM3:  - CPU0 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8]
... done.
(XEN) HVM3:  - CPU1 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8]
... done.
(XEN) HVM3:  - CPU2 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8]
... done.
(XEN) HVM3:  - CPU3 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8]
... done.
(XEN) HVM3:  - CPU4 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8]
... done.
(XEN) HVM3:  - CPU5 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8]
... done.
(XEN) HVM3:  - CPU6 ... 48-bit phys ... fixed MTRRs ... var MTRRs [2/8]
... done.
(XEN) HVM3: Testing HVM environment:
(XEN) HVM3:  - REP INSB across page boundaries ... passed
(XEN) HVM3:  - GS base MSRs and SWAPGS ... passed
(XEN) HVM3: Passed 2 of 2 tests
(XEN) HVM3: Writing SMBIOS tables ...
(XEN) HVM3: Loading ROMBIOS ...
(XEN) HVM3: 12604 bytes of ROMBIOS high-memory extensions:
(XEN) HVM3:   Relocating to 0xfc001000-0xfc00413c ... done
(XEN) HVM3: Creating MP tables ...
(XEN) HVM3: Loading Cirrus VGABIOS ...
(XEN) HVM3: Loading PCI Option ROM ...
(XEN) HVM3:  - Manufacturer: http://ipxe.org
(XEN) HVM3:  - Product name: iPXE
(XEN) HVM3: Option ROMs:
(XEN) HVM3:  c0000-c8fff: VGA BIOS
(XEN) HVM3:  c9000-d8fff: Etherboot ROM
(XEN) HVM3: Loading ACPI ...
(XEN) HVM3: vm86 TSS at fc010200
(XEN) HVM3: BIOS map:
(XEN) HVM3:  f0000-fffff: Main BIOS
(XEN) HVM3: E820 table:
(XEN) HVM3:  [00]: 00000000:00000000 - 00000000:0009e000: RAM
(XEN) HVM3:  [01]: 00000000:0009e000 - 00000000:000a0000: RESERVED
(XEN) HVM3:  HOLE: 00000000:000a0000 - 00000000:000e0000
(XEN) HVM3:  [02]: 00000000:000e0000 - 00000000:00100000: RESERVED
(XEN) HVM3:  [03]: 00000000:00100000 - 00000000:f0000000: RAM
(XEN) HVM3:  HOLE: 00000000:f0000000 - 00000000:fc000000
(XEN) HVM3:  [04]: 00000000:fc000000 - 00000001:00000000: RESERVED
(XEN) HVM3:  [05]: 00000001:00000000 - 00000001:10000000: RAM
(XEN) HVM3: Invoking ROMBIOS ...
(XEN) HVM3: $Revision: 1.221 $ $Date: 2008/12/07 17:32:29 $
(XEN) stdvga.c:147:d3 entering stdvga and caching modes
(XEN) HVM3: VGABios $Id: vgabios.c,v 1.67 2008/01/27 09:44:12 vruppert Exp $
(XEN) HVM3: Bochs BIOS - build: 06/23/99
(XEN) HVM3: $Revision: 1.221 $ $Date: 2008/12/07 17:32:29 $
(XEN) HVM3: Options: apmbios pcibios eltorito PMM
(XEN) HVM3:
(XEN) HVM3: ata0-0: PCHS=16383/16/63 translation=lba LCHS=1024/255/63
(XEN) HVM3: ata0 master: QEMU HARDDISK ATA-7 Hard-Disk (40960 MBytes)
(XEN) HVM3: IDE time out
(XEN) HVM3:
(XEN) HVM3:
(XEN) HVM3:
(XEN) HVM3: Press F12 for boot menu.
(XEN) HVM3:
(XEN) HVM3: Booting from Hard Disk...
(XEN) HVM3: Booting from 0000:7c00
(XEN) HVM3: int13_harddisk: function 15, unmapped device for ELDL=81
(XEN) HVM3: *** int 15h function AX=e980, BX=007e not yet supported!
(XEN) irq.c:270: Dom3 PCI link 0 changed 5 -> 0
(XEN) irq.c:270: Dom3 PCI link 1 changed 10 -> 0
(XEN) irq.c:270: Dom3 PCI link 2 changed 11 -> 0
(XEN) irq.c:270: Dom3 PCI link 3 changed 5 -> 0
(XEN) grant_table.c:1237:d3 Expanding dom (3) grant table from (4) to
(32) frames.
(XEN) irq.c:375: Dom3 callback via changed to GSI 28
(XEN) stdvga.c:151:d3 leaving stdvga



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.