
Re: [Xen-devel] windows domU disk performance graph comparing hvm vs stubdom vs pv drivers



On Tue, Feb 23, 2010 at 01:14:47PM +0000, Stefano Stabellini wrote:
> On Mon, 22 Feb 2010, Keith Coleman wrote:
> > On Mon, Feb 22, 2010 at 12:27 PM, Stefano Stabellini
> > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> > > On Mon, 22 Feb 2010, Keith Coleman wrote:
> > >> On 2/22/10, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> > >> > On Fri, 19 Feb 2010, Keith Coleman wrote:
> > >> >> I am posting this to xen-devel instead of -users because it paints an
> > >> >> incomplete picture that shouldn't be the basis for deciding how to run
> > >> >> production systems.
> > >> >>
> > >> >> This graph shows the performance under a webserver disk IO workload at
> > >> >> different queue depths. It compares the 4 main IO methods for Windows
> > >> >> guests that will be available in the upcoming Xen 4.0.0 and 3.4.3
> > >> >> releases: pure HVM, stub domains, gplpv drivers, and xcp winpv
> > >> >> drivers.
> > >> >>
> > >> >> The gplpv and xcp winpv drivers have comparable performance with gplpv
> > >> >> being slightly faster. Both pv drivers are considerably faster than
> > >> >> pure HVM or stub domains. Stub domain performance was about even with
> > >> >> HVM, which is lower than we were expecting. We tried a different CPU
> > >> >> pinning in "Stubdom B" with little impact.
> > >> >>
> > >> >
> > >> > What disk backend are you using?
> > >>
> > >> phy, LV
> > >>
> > >
> > > That is strange, because in that configuration I get far better
> > > disk bandwidth with stubdoms than with qemu running in dom0.
> > >
> > 
> > What type of test are you doing?
> > 
> 
> These are the results I got a while ago, running a simple "dd if=/dev/zero
> of=file" for 10 seconds:
> 
> qemu in dom0: 25.1 MB/s
> qemu in a stubdom: 56.7 MB/s
>

For dd tests you might want to use "oflag=direct" so it does direct IO
instead of going through the domU kernel page cache. A longer test run would
also be good.
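For example, something along these lines (the file name, block size and count
are only illustrative):

  # Write 1 GiB with O_DIRECT so the domU page cache is bypassed;
  # a bigger count also gives a longer, steadier run than 10 seconds.
  dd if=/dev/zero of=/root/ddtest.img bs=1M count=1024 oflag=direct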
 
> 
> 
> I have just now run tiobench with "--size 256 --numruns 4 --threads 4",
> using a raw file as a backend:
> 
> 
> qemu in dom0, using blktap2, best run:
> 
>  File  Blk   Num                   Avg      Maximum      Lat%     Lat%    CPU
>  Size  Size  Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
> ------ ----- ---  ------ ------ --------- -----------  -------- -------- -----
>  256   4096    4   85.82 108.6%     0.615     1534.10   0.00000  0.00000    79
> 
> qemu in a stubdom, using phy on a loop device, best run:
> 
>  File  Blk   Num                   Avg      Maximum      Lat%     Lat%    CPU
>  Size  Size  Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
> ------ ----- ---  ------ ------ --------- -----------  -------- -------- -----
>  256   4096    4  130.49 163.8%     0.345     1459.94   0.00000  0.00000    80
> 
> 
> These results are for the "sequential reads" test and the rate is in
> megabytes per second.
> If I use phy on a loop device with qemu in dom0, I unexpectedly get much
> worse results.
> The same thing happens if I use tap:aio with qemu in a stubdom, but that is
> more or less expected, since blktap is never going to be as fast as blkback.
> 
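Just to be sure we are comparing the same setups, the backend variants you
mention would be specified roughly like this in the domU config (device names
and paths are only illustrative, and the blktap2 prefix may differ slightly
between versions):

  # phy: blkback directly on a block device (here a loop device over the raw file)
  disk = [ 'phy:/dev/loop0,hda,w' ]

  # tap:aio: the old blktap userspace backend on the raw file
  disk = [ 'tap:aio:/var/images/disk.img,hda,w' ]

  # tap2:aio: the blktap2 backend on the raw file
  disk = [ 'tap2:aio:/var/images/disk.img,hda,w' ]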

Hmm... what's the overall CPU usage difference, measured from the hypervisor?
"xm top" or similar.

-- Pasi


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

