xen-devel

Re: [Xen-devel] Performance issues on dom0

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] Performance issues on dom0
From: Ozgur Akan <ozgurakan@xxxxxxxxx>
Date: Tue, 18 Aug 2009 12:03:42 -0400
Delivery-date: Tue, 18 Aug 2009 09:04:16 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <8D39BA5C-B123-4576-A836-32DCA92F8CA9@xxxxxxxxxxx>
References: <8D39BA5C-B123-4576-A836-32DCA92F8CA9@xxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx

Hi,

A few tests (such as Pipe Throughput and Pipe-based Context Switching) give very bad results under a Xen-enabled kernel. I measured similar results in my own benchmarks and I am not sure why.

I am also not sure how much these tests affect real server performance. I believe that running Apache (or any other service) and sending requests to it on a Xen-enabled kernel versus a vanilla kernel would be a more meaningful way to measure the performance difference.

Below is the performance comparison between the two, normalized so that the non-Xen kernel is 1.


Test                                    Non-Xen kernel   Xen kernel
Execl Throughput                                     1         0.28
File Copy 1024 bufsize 2000 maxblocks                1         0.76
File Copy 256 bufsize 500 maxblocks                  1         0.78
File Copy 4096 bufsize 8000 maxblocks                1         0.71
Pipe Throughput                                      1         0.25
Pipe-based Context Switching                         1         0.31
Process Creation                                     1         0.25
Shell Scripts (1 concurrent)                         1         0.36
Shell Scripts (16 concurrent)                        1         0.39
Shell Scripts (8 concurrent)                         1         0.39
System Call Overhead                                 1         0.53
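
The ratios above appear to line up with the raw UnixBench results quoted further down in this message (dom0 2.6.30.2 result divided by native result). A minimal Python sketch of that normalization follows, assuming that is how the column was derived; the names RESULTS and normalized are illustrative only and do not come from the thread.

```python
# Raw UnixBench results copied from the quoted message further down:
# test name -> (native 2.6.30.4 result, dom0 2.6.30.2-xen0 result)
RESULTS = {
    "Execl Throughput": (40751.3, 11593.1),
    "File Copy 1024 bufsize 2000 maxblocks": (395854.5, 299238.3),
    "File Copy 256 bufsize 500 maxblocks": (103076.1, 79926.1),
    "File Copy 4096 bufsize 8000 maxblocks": (1363078.3, 974072.8),
    "Pipe Throughput": (15770868.9, 3922840.5),
    "Pipe-based Context Switching": (3671269.3, 1125963.3),
    "Process Creation": (110360.2, 27045.1),
    "Shell Scripts (1 concurrent)": (65731.1, 23418.9),
    "Shell Scripts (16 concurrent)": (4916.5, 1939.5),
    "Shell Scripts (8 concurrent)": (9776.3, 3826.3),
    "System Call Overhead": (6840143.9, 3592086.9),
}


def normalized(results):
    """Return {test: dom0 / native}, rounded to two decimals (native == 1)."""
    return {
        name: round(dom0 / native, 2)
        for name, (native, dom0) in results.items()
    }


if __name__ == "__main__":
    for name, ratio in normalized(RESULTS).items():
        print(f"{name:<40}{1:>14}   {ratio:>10.2f}")
```

Every test here is a throughput figure (higher is better), so a ratio below 1 means the Xen kernel is slower on that test.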


regards,
OZ

2009/8/18 Frédéric VANNIÈRE <frederic@xxxxxxxxxxx>
Hello,

I'm benchmarking a brand-new Nehalem server and I've noticed performance problems when running a xen0 (dom0) kernel without any VMs.

Xen: 3.4.1
OS: Debian 5.0 amd64
HW: Supermicro Dual-Xeon Nehalem L5520 2.27 GHz, 24 GB memory, 16 CPUs (8 cores * 2 HT)
Native kernel: 2.6.30.4
dom0 kernel 1: 2.6.18-xen0
dom0 kernel 2: 2.6.30.2-xen0 (SUSE patches + config from http://x17.eu/xen)

Benchmarks:
  - tar xjf linux2.6.30.4.tar.bz2
  - build 2.6.30.4 with the default config and "make -j16"
  - filebench (mail)
  - unixbench -c 16 system

The 2.6.18-xen0 and 2.6.30.2-xen0 kernels perform very similarly (2.6.30 is a little faster), and the filesystem benchmark gives the same values for all kernels.


The dom0 has all the memory and all the vCPUs, no VMs are running, and each vCPU is pinned to a real CPU.

Linux elrond 2.6.30.2-xen0 #4 SMP Fri Aug 14 14:31:11 CEST 2009 x86_64 GNU/Linux

Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0 24147    16     r-----  23128.5
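
The message does not say how the pinning was done. One common way with the xm toolstack is the `xm vcpu-pin <domain> <vcpu> <cpus>` command. The sketch below only prints the commands for a one-to-one mapping; the helper name pin_commands and the assumption that vCPU n goes to physical CPU n are illustrative, not taken from the original setup.

```python
def pin_commands(domain="Domain-0", vcpus=16):
    """Build one `xm vcpu-pin` command per vCPU, pinning vCPU n to CPU n.

    The one-to-one mapping is an assumption; the original message only
    says that each vCPU is pinned to a real CPU.
    """
    return [f"xm vcpu-pin {domain} {n} {n}" for n in range(vcpus)]


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    print("\n".join(pin_commands()))
```

Printing rather than executing keeps the example harmless on a machine without Xen; the output can be reviewed and then pasted into a root shell on the dom0.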

1. tar xjf:
 - native: 13 seconds
 - dom0: 21 seconds
2. make -j16 linux:
 - native: 56 seconds
 - dom0: 67 seconds
3. filebench:
 - native: 4268 iops
 - dom0: 4219 iops
4. unixbench:
 - native: 5262
 - dom0: 2200 !!!! --> very bad (1810 with 2.6.18-xen0)
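
To put the four results on a common scale, the relative penalty of dom0 against native can be computed as a percentage. This is a small sketch using only the numbers listed above; the names BENCHMARKS and slowdown_percent are illustrative, and treating "seconds" as lower-is-better and "iops"/index as higher-is-better is the only assumption.

```python
# (name, native, dom0, which direction is better) from the list above.
BENCHMARKS = [
    ("tar xjf", 13, 21, "lower"),        # seconds
    ("make -j16", 56, 67, "lower"),      # seconds
    ("filebench", 4268, 4219, "higher"), # iops
    ("unixbench", 5262, 2200, "higher"), # index score
]


def slowdown_percent(native, dom0, better):
    """Percent by which dom0 is worse than native (positive = dom0 worse)."""
    if better == "lower":
        # Timings: extra time taken, relative to the native time.
        return (dom0 - native) / native * 100.0
    # Throughput or score: amount lost, relative to the native value.
    return (native - dom0) / native * 100.0


if __name__ == "__main__":
    for name, native, dom0, better in BENCHMARKS:
        pct = slowdown_percent(native, dom0, better)
        print(f"{name:<12} dom0 is {pct:5.1f}% worse than native")
```

Seen this way, filebench is essentially unchanged (about 1%), the kernel build loses about 20%, and tar and unixbench lose roughly 60% each.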

Any idea what the cause might be?

Regards,

===========  Unixbench : native ==============
Benchmark Run: Tue Aug 18 2009 13:24:36 - 13:51:10
16 CPUs in system; running 16 parallel copies of tests

Execl Throughput                             40751.3 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks       395854.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks         103076.1 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks      1363078.3 KBps  (30.0 s, 2 samples)
Pipe Throughput                           15770868.9 lps   (10.0 s, 7 samples)
Pipe-based Context Switching               3671269.3 lps   (10.0 s, 7 samples)
Process Creation                            110360.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                 65731.1 lpm   (60.0 s, 2 samples)
Shell Scripts (16 concurrent)                 4916.5 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                  9776.3 lpm   (60.0 s, 2 samples)
System Call Overhead                       6840143.9 lps   (10.0 s, 7 samples)

System Benchmarks Partial Index           BASELINE       RESULT     INDEX
Execl Throughput                              43.0      40751.3    9477.1
File Copy 1024 bufsize 2000 maxblocks       3960.0     395854.5     999.6
File Copy 256 bufsize 500 maxblocks         1655.0     103076.1     622.8
File Copy 4096 bufsize 8000 maxblocks       5800.0    1363078.3    2350.1
Pipe Throughput                            12440.0   15770868.9   12677.5
Pipe-based Context Switching                4000.0    3671269.3    9178.2
Process Creation                             126.0     110360.2    8758.7
Shell Scripts (1 concurrent)                  42.4      65731.1   15502.6
Shell Scripts (16 concurrent)                  ---       4916.5       ---
Shell Scripts (8 concurrent)                   6.0       9776.3   16293.8
System Call Overhead                       15000.0    6840143.9    4560.1
                                                                 ========
System Benchmarks Index Score (Partial Only)                       5262.1


===========  Unixbench : dom0 2.6.30.2 ==============
Benchmark Run: Tue Aug 18 2009 15:07:45 - 15:34:31
16 CPUs in system; running 16 parallel copies of tests

Execl Throughput                             11593.1 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks       299238.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          79926.1 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       974072.8 KBps  (30.0 s, 2 samples)
Pipe Throughput                            3922840.5 lps   (10.1 s, 7 samples)
Pipe-based Context Switching               1125963.3 lps   (10.0 s, 7 samples)
Process Creation                             27045.1 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                 23418.9 lpm   (60.0 s, 2 samples)
Shell Scripts (16 concurrent)                 1939.5 lpm   (60.2 s, 2 samples)
Shell Scripts (8 concurrent)                  3826.3 lpm   (60.1 s, 2 samples)
System Call Overhead                       3592086.9 lps   (10.1 s, 7 samples)

System Benchmarks Partial Index           BASELINE       RESULT     INDEX
Execl Throughput                              43.0      11593.1    2696.1
File Copy 1024 bufsize 2000 maxblocks       3960.0     299238.3     755.7
File Copy 256 bufsize 500 maxblocks         1655.0      79926.1     482.9
File Copy 4096 bufsize 8000 maxblocks       5800.0     974072.8    1679.4
Pipe Throughput                            12440.0    3922840.5    3153.4
Pipe-based Context Switching                4000.0    1125963.3    2814.9
Process Creation                             126.0      27045.1    2146.4
Shell Scripts (1 concurrent)                  42.4      23418.9    5523.3
Shell Scripts (16 concurrent)                  ---       1939.5       ---
Shell Scripts (8 concurrent)                   6.0       3826.3    6377.2
System Call Overhead                       15000.0    3592086.9    2394.7
                                                                 ========
System Benchmarks Index Score (Partial Only)                       2200.0
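
For readers unfamiliar with how these tables fit together: each INDEX value equals RESULT / BASELINE * 10, and the final score is consistent with the geometric mean of the per-test indices (tests without a baseline, here "Shell Scripts (16 concurrent)", are skipped). The following Python sketch reproduces both scores from the numbers above; the names TESTS, index and score are illustrative, and this is a reconstruction from the printed tables rather than UnixBench's own code.

```python
import math

# test name -> (baseline, native result, dom0 2.6.30.2 result),
# copied from the two index tables above. "Shell Scripts (16 concurrent)"
# has no baseline ("---") and is therefore left out.
TESTS = {
    "Execl Throughput": (43.0, 40751.3, 11593.1),
    "File Copy 1024 bufsize 2000 maxblocks": (3960.0, 395854.5, 299238.3),
    "File Copy 256 bufsize 500 maxblocks": (1655.0, 103076.1, 79926.1),
    "File Copy 4096 bufsize 8000 maxblocks": (5800.0, 1363078.3, 974072.8),
    "Pipe Throughput": (12440.0, 15770868.9, 3922840.5),
    "Pipe-based Context Switching": (4000.0, 3671269.3, 1125963.3),
    "Process Creation": (126.0, 110360.2, 27045.1),
    "Shell Scripts (1 concurrent)": (42.4, 65731.1, 23418.9),
    "Shell Scripts (8 concurrent)": (6.0, 9776.3, 3826.3),
    "System Call Overhead": (15000.0, 6840143.9, 3592086.9),
}


def index(result, baseline):
    """Per-test index: result relative to the baseline, scaled by 10."""
    return result / baseline * 10.0


def score(indices):
    """Geometric mean of the per-test indices."""
    indices = list(indices)
    return math.exp(sum(math.log(i) for i in indices) / len(indices))


if __name__ == "__main__":
    native = score(index(n, b) for b, n, _ in TESTS.values())
    dom0 = score(index(d, b) for b, _, d in TESTS.values())
    print(f"native: {native:.1f}  dom0: {dom0:.1f}  ratio: {dom0 / native:.2f}")
```

Because the score is a geometric mean, the tests that lose about 70-75% under dom0 (execl, pipe, process creation) weigh on the total as much as the file-copy tests that lose only 20-30%.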

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
