Re: [Xen-devel] Poor SMP performance pv_ops domU

On 05/19/2010 09:24 AM, John Morrison wrote:
> I've tried with various kernels today - pv_ops seems to only use 1 core out 
> of 8.
> PV spinlocks makes no difference.
> The thing that sticks out most is I cannot get the dom0 (xen-3.4.2) to show 
> more than about 99.7% cpu usage for any pv_ops kernel.
> #!/usr/bin/perl
> while () {}
> running 8 of these loads with nearly 800% cpu as shown in dom0
> running the same 8 in any pv_ops kernel only gets as high as about 99.7%

What tool are you using to show CPU use?

> Inside the pv and xenU kernels top -s show all 8 cores being used.

I tried to reproduce this:

   1. I created a 4 vcpu pvops PV domain (4 pcpu host)
   2. Confirmed that all 4 vcpus are present with "cat /proc/cpuinfo" in
      the domain
   3. Ran 4 instances of ``perl -e "while(){}"&'' in the domain
   4. "top" within the domain shows 99% overall user time, no stolen
      time, with the perl processes each using 99% cpu time
   5. In dom0, "watch -n 1 xl vcpu-list <domain>" shows all 4 vcpus are
      consuming 1 vcpu second per second
   6. Running a spin loop in dom0 makes top within the domain show
      16-25% stolen time
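
For anyone wanting to repeat steps 2 and 3, something like the following
might do. It is only a rough sketch, assuming a POSIX shell and perl in
the guest; count_vcpus and start_spinners are names made up for
illustration, not anything from the Xen tools.

```shell
#!/bin/sh
# Sketch only: count_vcpus and start_spinners are illustrative names.

# Count "processor" entries in a cpuinfo-style file (default: /proc/cpuinfo).
count_vcpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Start N busy loops in the background and print their PIDs, one per line.
# Output of each loop is redirected so that $(start_spinners N) does not
# block waiting on the background processes.
start_spinners() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        perl -e 'while(){}' >/dev/null 2>&1 &
        echo "$!"
        i=$((i + 1))
    done
}

# Usage (left commented out so sourcing this file has no side effects):
#   pids=$(start_spinners "$(count_vcpus)")
#   top          # check per-process CPU and the stolen-time ("st") figure
#   kill $pids
```

Starting one spinner per vcpu keeps the comparison simple: if every vcpu
is really being scheduled, each perl process should sit near 100%.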

Aside from top showing "99%" rather than ~400% as one might expect, it
all seems OK, and it looks like the vcpus are actually getting all the
CPU they're asking for.  I think the 99 vs 400 difference is just a
change in how the kernel shows its accounting (there's been a lot of
change in that area between .18 and .32, including a whole new
scheduler).
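
One way to take top's presentation out of the picture is to read the raw
per-cpu counters from /proc/stat and work out the percentages by hand.
A sketch, assuming the usual field order on each cpuN line (user nice
system idle iowait irq softirq steal); cpu_pct is a made-up helper name:

```shell
# cpu_pct BEFORE AFTER
# BEFORE and AFTER are two saved copies of /proc/stat.
# Prints busy% and steal% for each cpuN line over the interval.
cpu_pct() {
    awk '
        # First file: remember totals, idle+iowait and steal per cpu.
        NR == FNR && /^cpu[0-9]/ {
            t = 0
            for (i = 2; i <= 9; i++) t += $i
            total[$1] = t; idle[$1] = $5 + $6; steal[$1] = $9
            next
        }
        # Second file: print the deltas as percentages of elapsed ticks.
        NR != FNR && /^cpu[0-9]/ {
            t = 0
            for (i = 2; i <= 9; i++) t += $i
            dt = t - total[$1]
            if (dt <= 0) next
            di = ($5 + $6) - idle[$1]
            ds = $9 - steal[$1]
            printf "%s busy=%.0f%% steal=%.0f%%\n", $1, 100 * (dt - di - ds) / dt, 100 * ds / dt
        }
    ' "$1" "$2"
}

# Usage:
#   cat /proc/stat > /tmp/a; sleep 5; cat /proc/stat > /tmp/b
#   cpu_pct /tmp/a /tmp/b
```

With one spinner per vcpu, each cpuN line should come out close to 100%
busy if the vcpus are getting their time, whatever the summary line in
top says.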

If you're seeing a real performance regression between .18 and .32,
that's interesting, but it would be useful to make sure you're comparing
apples to apples; in particular, separate any performance change
inherent in Linux itself between .18 and .32 from the effect of pvops
vs xenU.

So, things to try:

    * make sure all the vcpus are actually enabled within your domain;
      if you're adding them after the domain has booted, you need to make
      sure they get hot-plugged properly
    * make sure you don't have any expensive debug options enabled in
      your kernel config
    * run your benchmark on the 2.6.32 kernel booted native and compare
      it to pvops running under xen
    * compare it with the Novell 2.6.32 non-pvops kernel
    * try pinning the vcpus to physical cpus to eliminate any Xen
      scheduler effects
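
A couple of these checks can be scripted. A rough sketch, assuming the
CPU hotplug files live under /sys/devices/system/cpu and the kernel
config is at /boot/config-$(uname -r) (both locations vary between
distros); offline_cpus and debug_options are made-up names:

```shell
# List CPUs whose sysfs "online" file reads 0, i.e. present but not
# brought online.  cpu0 often has no "online" file and is then skipped.
offline_cpus() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/online; do
        [ -f "$f" ] || continue
        if [ "$(cat "$f")" = "0" ]; then
            basename "$(dirname "$f")"
        fi
    done
}

# Print enabled CONFIG_*DEBUG* options from a kernel config file.
# Not every match is expensive; this only gives a list to review.
debug_options() {
    grep -E '^CONFIG_[A-Z0-9_]*DEBUG[A-Z0-9_]*=y' "${1:-/boot/config-$(uname -r)}"
}

# Bringing an offline vcpu online inside the guest (as root):
#   echo 1 > /sys/devices/system/cpu/cpu1/online
# Pinning vcpu N to pcpu N from dom0 (xm instead of xl on older toolstacks):
#   xl vcpu-pin <domain> 0 0
#   xl vcpu-pin <domain> 1 1
```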


