
Re: [Xen-devel] Poor SMP performance pv_ops domU



(Re-added cc: xen-devel)

On 05/19/2010 12:41 PM, John Morrison wrote:
> xentop for the cpu usage.
>
> We see only the performance of a single core in domU when running a pv_ops kernel.
> Rebooting domU with 2.6.18.8-xenU makes performance jump nearly eightfold.
>   

Could you reproduce my experiment?  If you look at the CPU time
accumulated by each vcpu, is it incrementing at less than 1 vcpu
second/second?
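
In case it helps, here's a rough way to sample that from dom0 (a sketch only: it assumes the xl toolstack, a domain named "mydomu", and that the cumulative CPU time is the sixth whitespace-separated column of "xl vcpu-list" output on your version):

```shell
# Sum the per-vcpu CPU time twice, one second apart; the difference is
# roughly "vcpu seconds accumulated per wall second".
# "mydomu" and the column layout are assumptions.
sum_time() { xl vcpu-list mydomu | awk 'NR > 1 { s += $6 } END { print s }'; }
a=$(sum_time)
sleep 1
b=$(sum_time)
# With N fully busy vcpus you'd hope to see a number close to N here:
echo "vcpu seconds per wall second: $(echo "$b - $a" | bc)"
```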

> Pinned all 8 CPUs - still the same results.
>
> Tried bare metal much better results.
>   

What do you mean by "much better"?  How does it compare to domu 2.6.18?

> We have seen this over 18 months on all pv kernels we try.
>
> It's not any specific kernel - all pv kernels we try have the same 
> performance impact.
>   

Do you mean pvops, or all PV Xen kernels?  How do the recent Novell
Xenlinux kernels perform?  Have you verified there are no expensive
debug options enabled?
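
For the debug-options check, something along these lines run inside the guest may help (a sketch; the config path and the particular option list are assumptions, and some kernels expose /proc/config.gz instead of a file under /boot):

```shell
# Look for a few config options known to be expensive at runtime;
# the list here is illustrative, not exhaustive.
grep -E 'CONFIG_(DEBUG_SLAB|DEBUG_PAGEALLOC|DEBUG_SPINLOCK|PROVE_LOCKING|DEBUG_LOCK_ALLOC|DEBUG_MUTEXES)=y' \
    "/boot/config-$(uname -r)" \
    || echo "none of the usual expensive debug options are set"
```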

BTW, is it a 32 or 64-bit guest?

    J

> John
>
> On 19 May 2010, at 18:44, Jeremy Fitzhardinge wrote:
>
>   
>> On 05/19/2010 09:24 AM, John Morrison wrote:
>>     
>>> I've tried with various kernels today - pv_ops seems to only use 1 core 
>>> out of 8.
>>>
>>> PV spinlocks makes no difference.
>>>
>>> The thing that sticks out most is I cannot get the dom0 (xen-3.4.2) to show 
>>> more than about 99.7% cpu usage for any pv_ops kernel.
>>>
>>> #!/usr/bin/perl
>>>
>>> while (1) {}
>>>
>>> running 8 of these loads 2.6.18.8-xenU with nearly 800% cpu as shown in dom0
>>> running the same 8 in any pv_ops kernel only gets as high as about 99.7%
>>>
>>>       
>> What tool are you using to show CPU use?
>>
>>     
>>> Inside the pv and xenU kernels, top shows all 8 cores being used.
>>>
>>>       
>> I tried to reproduce this:
>>
>>   1. I created a 4 vcpu pvops PV domain (4 pcpu host)
>>   2. Confirmed that all 4 vcpus are present with "cat /proc/cpuinfo" in
>>      the domain
>>   3. Ran 4 instances of ``perl -e "while(1){}" &'' in the domain
>>   4. "top" within the domain shows 99% overall user time, no stolen
>>      time, with the perl processes each using 99% cpu time
>>   5. in dom0 "watch -n 1 xl vcpu-list <domain>" shows all 4 vcpus are
>>      consuming 1 vcpu second per second
>>   6. running a spin loop in dom0 makes top within the domain show
>>      16-25% stolen time
>>
>> Aside from top showing "99%" rather than ~400% as one might expect, it
>> all seems OK, and it looks like the vcpus are actually getting all the
>> CPU they're asking for.  I think the 99 vs 400 difference is just a
>> change in how the kernel shows its accounting (since there's been a lot
>> of change in that area between .18 and .32, including a whole new
>> scheduler).
>>
>> If you're seeing a real performance regression between .18 and .32,
>> that's interesting, but it would be useful to make sure you're comparing
>> apples to apples; in particular, isolating any performance effect
>> inherent in Linux's performance change from .18 -> .32, compared to
>> pvops vs xenU.
>>
>> So, things to try:
>>
>>    * make sure all the vcpus are actually enabled within your domain;
>>      if you're adding them after the domain has booted, you need to make
>>      sure they get hot-plugged properly
>>    * make sure you don't have any expensive debug options enabled in
>>      your kernel config
>>    * run your benchmark on the 2.6.32 kernel booted native and compare
>>      it to pvops running under xen
>>    * compare it with the Novell 2.6.32 non-pvops kernel
>>    * try pinning the vcpus to physical cpus to eliminate any Xen
>>      scheduler effects
>>
>> Thanks,
>>    J
>>
>>     
>   
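
For what it's worth, the first and last items in that list might look something like this, run in the guest and in dom0 respectively (a sketch only: "mydomu", the 8-vcpu/8-pcpu layout, and the use of xm are assumptions; xl takes the same vcpu-pin syntax on newer toolstacks):

```shell
# In the guest: confirm all vcpus are online, not merely present.
grep -c ^processor /proc/cpuinfo          # should print 8
cat /sys/devices/system/cpu/cpu*/online   # hot-plugged cpus should read 1
                                          # (cpu0 often has no "online" file)

# In dom0: pin vcpu N of the domain to physical cpu N.
for v in 0 1 2 3 4 5 6 7; do
    xm vcpu-pin mydomu $v $v
done
xm vcpu-list mydomu   # affinity column should now show one pcpu per vcpu
```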


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
