Re: [Xen-devel] [PATCH RFC V4 0/5] kvm : Paravirt-spinlock support	for KVM guests
 
- To: Alexander Graf <agraf@xxxxxxx>
 
- From: Raghavendra K T <raghavendra.kt@xxxxxxxxxxxxxxxxxx>
 
- Date: Wed, 18 Jan 2012 00:06:30 +0530
 
- Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Greg Kroah-Hartman <gregkh@xxxxxxx>,	Gleb Natapov <gleb@xxxxxxxxxx>, linux-doc@xxxxxxxxxxxxxxx,	Peter Zijlstra <peterz@xxxxxxxxxxxxx>, Jan Kiszka <jan.kiszka@xxxxxxxxxxx>,	Srivatsa Vaddagiri <vatsa@xxxxxxxxxxxxxxxxxx>,	Randy Dunlap <rdunlap@xxxxxxxxxxxx>, Paul Mackerras <paulus@xxxxxxxxx>,	"H. Peter Anvin" <hpa@xxxxxxxxx>,	Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>,	Xen <xen-devel@xxxxxxxxxxxxxxxxxxx>,	Dave Jiang <dave.jiang@xxxxxxxxx>, KVM <kvm@xxxxxxxxxxxxxxx>,	Glauber Costa <glommer@xxxxxxxxxx>, X86 <x86@xxxxxxxxxx>,	Ingo Molnar <mingo@xxxxxxxxxx>, Avi Kivity <avi@xxxxxxxxxx>,	Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>,	Sasha Levin <levinsasha928@xxxxxxxxx>, Sedat Dilek <sedat.dilek@xxxxxxxxx>,	Thomas Gleixner <tglx@xxxxxxxxxxxxx>,	Virtualization <virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx>,	Marcelo Tosatti <mtosatti@xxxxxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>,	Dave Hansen <dave@xxxxxxxxxxxxxxxxxx>,	Suzuki Poulose <suzuki@xxxxxxxxxxxxxxxxxx>,	Rob Landley <rlandley@xxxxxxxxxxxxx>
 
- Delivery-date: Wed, 18 Jan 2012 10:35:30 +0000
 
- List-id: Xen developer discussion <xen-devel.lists.xensource.com>
 
 
 
On 01/17/2012 11:09 PM, Alexander Graf wrote:
 
On 17.01.2012, at 18:27, Raghavendra K T wrote:
 
On 01/17/2012 12:12 AM, Alexander Graf wrote:
 
On 16.01.2012, at 19:38, Raghavendra K T wrote:
 
On 01/16/2012 07:53 PM, Alexander Graf wrote:
 
On 16.01.2012, at 15:20, Srivatsa Vaddagiri wrote:
 
* Alexander Graf<agraf@xxxxxxx>    [2012-01-16 04:57:45]:
 
Speaking of which - have you benchmarked performance degradation of pv ticket 
locks on bare metal?
 
 
You mean, run kernel on bare metal with CONFIG_PARAVIRT_SPINLOCKS
enabled and compare how it performs with CONFIG_PARAVIRT_SPINLOCKS disabled for
some workload(s)?
 
 
Yup
 
In some sense, the 1x overcommit case results posted do measure the overhead
of (pv-)spinlocks, no? We don't see any overhead in that case, at least for
kernbench.
 
Result for Non PLE machine :
============================
 
 
[snip]
 
Kernbench:
               BASE                    BASE+patch
 
 
What is BASE really? Is BASE already with the PV spinlocks enabled? I'm having 
a hard time understanding which tree you're working against, since the 
prerequisites aren't upstream yet.
Alex
 
 
Sorry for the confusion, I think I was a little imprecise about the BASE.
The BASE is pre 3.2.0 + Jeremy's following patches:
xadd (https://lkml.org/lkml/2011/10/4/328)
x86/ticketlock (https://lkml.org/lkml/2011/10/12/496).
So this would have the ticketlock cleanups from Jeremy and
CONFIG_PARAVIRT_SPINLOCKS=y.
BASE+patch = pre 3.2.0 + Jeremy's above patches + the above V5 PV spinlock
series, with CONFIG_PARAVIRT_SPINLOCKS=y.
In both cases CONFIG_PARAVIRT_SPINLOCKS=y.
So let:
A. pre-3.2.0 with CONFIG_PARAVIRT_SPINLOCKS = n
B. pre-3.2.0 + Jeremy's above patches with CONFIG_PARAVIRT_SPINLOCKS = n
C. pre-3.2.0 + Jeremy's above patches with CONFIG_PARAVIRT_SPINLOCKS = y
D. pre-3.2.0 + Jeremy's above patches + V5 patches with CONFIG_PARAVIRT_SPINLOCKS = n
E. pre-3.2.0 + Jeremy's above patches + V5 patches with CONFIG_PARAVIRT_SPINLOCKS = y
Is it the performance of A vs E that you want? (Currently it is C vs E.)
 
 
Since D and E only matter with KVM in use, yes, I'm mostly interested in A, B 
and C :).
Alex
 
 
setup:
Native: IBM xSeries with Intel(R) Xeon(R) X5570 2.93GHz CPU, 8 cores, 64GB RAM
(16 cpus online)
Guest: a single guest with 8 VCPUs and 4GB RAM.
benchmark: kernbench -f -H -M -o 20
Here is the result:
Native Run
============
case A              case B             %improvement   case C            %improvement
56.1917 (2.57125)   56.035 (2.02439)   0.278867       56.27 (2.40401)   -0.139344
 
 
This looks a lot like statistical derivation. How often did you execute the 
test case? Did you make sure to have a clean base state every time?
Maybe it'd be a good idea to create a small in-kernel microbenchmark with a 
couple threads that take spinlocks, then do work for a specified number of 
cycles, then release them again and start anew. At the end of it, we can check 
how long the whole thing took for n runs. That would enable us to measure the 
worst case scenario.
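
(Not part of the posted series, just an illustration of the kind of in-kernel
microbenchmark Alex describes above: a rough, untested sketch in which a few
kernel threads contend on a single spinlock, hold it for a fixed delay to
simulate work, release it, and repeat; the total elapsed time is printed on
module unload. The module name, parameters and defaults below are made up.)

/* lockbench.c - hypothetical spinlock contention microbenchmark (sketch) */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
#include <linux/spinlock.h>
#include <linux/delay.h>
#include <linux/ktime.h>
#include <linux/atomic.h>

static int nr_threads = 4;                 /* contending kernel threads */
static int hold_ns = 1000;                 /* simulated work while holding the lock */
static unsigned long iterations = 100000;  /* lock/unlock pairs per thread */
module_param(nr_threads, int, 0444);
module_param(hold_ns, int, 0444);
module_param(iterations, ulong, 0444);

static DEFINE_SPINLOCK(bench_lock);
static atomic_t threads_done = ATOMIC_INIT(0);
static ktime_t start;

static int bench_thread(void *unused)
{
	unsigned long i;

	for (i = 0; i < iterations; i++) {
		spin_lock(&bench_lock);
		ndelay(hold_ns);           /* "do work for a specified number of cycles" */
		spin_unlock(&bench_lock);
	}
	atomic_inc(&threads_done);
	return 0;
}

static int __init bench_init(void)
{
	int i;

	start = ktime_get();
	for (i = 0; i < nr_threads; i++)
		kthread_run(bench_thread, NULL, "lockbench/%d", i);
	return 0;
}

static void __exit bench_exit(void)
{
	/* crude: only meaningful if unloaded after all threads report done */
	pr_info("lockbench: %d/%d threads done, elapsed %lld ns\n",
		atomic_read(&threads_done), nr_threads,
		ktime_to_ns(ktime_sub(ktime_get(), start)));
}

module_init(bench_init);
module_exit(bench_exit);
MODULE_LICENSE("GPL");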
 
 
It was a quick test: two iterations of kernbench (= 6 runs), and I had
ensured the caches were cleared with
echo "1" > /proc/sys/vm/drop_caches
and ccache -C. Yes, maybe I can run a test like the one you mentioned.
 
Guest Run
============
case A               case B             %improvement   case C  %improvement
166.999 (15.7613)    161.876 (14.4874)  3.06768        161.24 (12.6497)  3.44852
 
 
Is this the same machine? Why is the guest 3x slower?
Yes, the same non-PLE machine, but with all 16 cpus online. By 3x slower,
did you mean that case A is slower (pre-3.2.0 with CONFIG_PARAVIRT_SPINLOCKS = n)?
 
Alex
 
We do not see much overhead in the native run with CONFIG_PARAVIRT_SPINLOCKS = y.
 
 
 
 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
 
 
    