xen-devel

Re: [Xen-devel] Odd blkdev throughput results

> > The big thing is that on network RX it is currently dom0 that does the
> > copy. In the CMP case this leaves the data in the shared cache ready to
> > be accessed by the guest. In the SMP case it doesn't help at all. In
> > netchannel2 we're moving the copy to the guest CPU, and trying to
> > eliminate it with smart hardware.
> >
> > Block IO doesn't require a copy at all.
>
> Well, not in blkback by itself, but certainly from the in-memory disk
> image. Unless I misunderstood Keir's recent post, page flipping is
> basically dead code, so I thought the numbers should at least point in
> roughly the same direction.

Blkback has always DMA-ed directly into guest memory when reading data from 
the disk drive (the normal use case), in which case there's no copy - I think 
that was Ian's point.  In contrast, the netback driver has to do a copy in 
the normal case.

If you're using a ramdisk then there must be a copy somewhere, although I'm 
not sure exactly where it happens!
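
My guess at the shape of it (again illustrative names only, not the actual 
ramdisk driver): with a ramdisk there is no DMA engine, so the "read" has to 
complete as a plain memcpy somewhere in dom0, along these lines:

#include <string.h>

#define SECTOR_SIZE 512

/* Hypothetical ramdisk read: copy from the ramdisk's backing memory into
 * the (grant-mapped) guest page, executed on the dom0 vCPU. */
void ramdisk_read(const char *rd_base, unsigned long sector,
                  unsigned int nsect, void *guest_page)
{
    /* This is the copy that competes with the domU for the shared L2
     * cache in the CMP case. */
    memcpy(guest_page, rd_base + sector * SECTOR_SIZE,
           (size_t)nsect * SECTOR_SIZE);
}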

Cheers,
Mark

> > > This is not my question. What strikes me is that for the blkdev
> > > interface, the CMP setup is 13% *slower* than SMP, at 661.99 MB/s.
> > >
> > > Now, any ideas? I'm mildly familiar with both netback and blkback, and
> > > I'd never expected something like that. Any hint appreciated.
> >
> > How stable are your results with hdparm? I've never really trusted it as
> > a benchmarking tool.
>
> So far, all the experiments I've done look fairly reasonable. Standard
> deviation is low, and since I've been tracing netback reads I'm fairly
> confident that the volume wasn't left in domU memory somewhere.
>
> I'm not so much interested in bio or physical disk performance as in the
> relative performance, i.e. how much can be squeezed through the buffer
> ring before and after applying some changes. It's hardly a physical disk
> benchmark, but it's simple and seems fine for that purpose.
>
> > The ramdisk isn't going to be able to DMA data into the domU's buffer on
> >  a read, so it will have to copy it.
>
> Right...
>
> > The hdparm running in domU probably
> >  doesn't actually look at any of the data it requests, so it stays local
> >  to the dom0 CPU's cache (unlike a real app).
>
> hdparm performs sequential 2 MB read()s over a 3 s period. It's not
> calling the block layer directly or anything like that. That should
> certainly hit domU's caches?
>
> > Doing all that copying
> >  in dom0 is going to beat up the domU in the shared cache in the CMP
> >  case, but won't affect it as much in the SMP case.
>
> Well, I could live with blaming L2 footprint. Just wanted to hear if
> someone has different explanations. And I would expect similar results
> on net RX then, but I may be mistaken.
>
> Furthermore, I need to apologize because I failed to use netperf
> correctly and ended up reporting the TX path in my original post :P. The
> real numbers are 885.43 (SMP) vs. 1295.46 (CMP), but the contrast with
> the blkdev reads stays the same.
>
> regards,
> daniel
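
For reference, the workload described above (sequential 2 MB read()s over a 
~3 s period) boils down to roughly the loop below.  This is my own stand-in, 
not hdparm itself - hdparm -t additionally flushes the buffer cache first, 
and /dev/xvda is just an example path.  Running it a handful of consecutive 
times gives a quick feel for run-to-run variance:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define CHUNK   (2 * 1024 * 1024)   /* 2 MB per read() */
#define SECONDS 3.0                  /* time budget */

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/xvda"; /* example path */
    char *buf = malloc(CHUNK);
    int fd = open(dev, O_RDONLY);
    if (fd < 0 || !buf) {
        perror("setup");
        return 1;
    }

    double start = now();
    unsigned long long bytes = 0;
    ssize_t n;

    /* Sequential 2 MB reads until the time budget is used up. */
    while (now() - start < SECONDS && (n = read(fd, buf, CHUNK)) > 0)
        bytes += (unsigned long long)n;

    double elapsed = now() - start;
    printf("%.2f MB/s\n", bytes / (1024.0 * 1024.0) / elapsed);

    close(fd);
    free(buf);
    return 0;
}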



-- 
Push Me Pull You - Distributed SCM tool (http://www.cl.cam.ac.uk/~maw48/pmpu/)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
