[Xen-devel] Re: poor domU VBD performance.
Ian Pratt <m+Ian.Pratt <at> cl.cam.ac.uk> writes:
>
> > I'll check the xen block driver to see if there's anything
> > else that sticks out.
> >
> > Jens Axboe
>
> Jens, I'd really appreciate this.
>
> The blkfront/blkback drivers have rather evolved over time, and I don't
> think any of the core team fully understand the block-layer differences
> between 2.4 and 2.6.
>
> There's also some junk left in there from when the backend was in Xen
> itself back in the days of 1.2, though Vincent has prepared a patch to
> clean this up and also make 'refreshing' of vbd's work (for size
> changes), and also allow the blkfront driver to import whole disks
> rather than partitions. We had this functionality on 2.4, but lost it in
> the move to 2.6.
>
> My bet is that the 2.6 backend is where the true performance
> bug lies. Using a 2.6 domU blkfront talking to a 2.4 dom0 blkback seems
> to give good performance under a wide variety of circumstances. Using a
> 2.6 dom0 is far more pernickety. I agree with Andrew: I suspect it's the
> work queue changes that are biting us when we don't have many outstanding
> requests.
>
> Thanks,
> Ian
>
I have done my simple dd test on hde1 with two different readahead settings:
256 sectors and 512 sectors.
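For reference, the test looks roughly like this (the blockdev and iostat
invocations are only meant as an illustration; the exact device and options
may differ):

    # set and verify the readahead window on the disk
    blockdev --setra 512 /dev/hde
    blockdev --getra /dev/hde
    # sequential read with dd, sampling extended iostat figures alongside
    dd if=/dev/hde1 of=/dev/null bs=1M count=1024 &
    iostat -x 5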
These are the results:
DOM0 readahead 512s
Device:     rrqm/s  wrqm/s     r/s   w/s     rsec/s  wsec/s     rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  svctm   %util
hde      115055.40    2.00  592.40  0.80  115647.80   22.40  57823.90  11.20    194.99      2.30   3.88   1.68   99.80
hda           0.00    0.00    0.00  0.00       0.00    0.00      0.00   0.00      0.00      0.00   0.00   0.00    0.00
avg-cpu:  %user   %nice  %system  %iowait   %idle
           0.20    0.00    31.60    14.20   54.00
DOMU readahead 512s
Device:     rrqm/s  wrqm/s       r/s   w/s     rsec/s  wsec/s     rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  svctm   %util
hda1          0.00    0.20      0.00  0.00       0.00    3.20      0.00   1.60      0.00      0.00   0.00   0.00    0.00
hde1     102301.40    0.00  11571.00  0.00  113868.80    0.00  56934.40   0.00      9.84     68.45   5.92   0.09  100.00
avg-cpu:  %user   %nice  %system  %iowait   %idle
           0.00    0.00    35.00    65.00    0.00
DOM0 readahead 256s
Device:     rrqm/s  wrqm/s     r/s   w/s    rsec/s  wsec/s     rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  svctm   %util
hde       28289.20    1.80  126.80  0.40  28416.00   17.60  14208.00   8.80    223.53      1.06   8.32   7.85   99.80
hda           0.00    0.00    0.00  0.00      0.00    0.00      0.00   0.00      0.00      0.00   0.00   0.00    0.00
avg-cpu:  %user   %nice  %system  %iowait   %idle
           0.20    0.00     1.60     5.60   92.60
DOMU readahead 256s
Device:     rrqm/s  wrqm/s      r/s   w/s    rsec/s  wsec/s     rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  svctm   %util
hda1          0.00    0.20     0.00  0.40      0.00    4.80      0.00   2.40     12.00      0.00   0.00   0.00    0.00
hde1      25085.60    0.00  3330.40  0.00  28416.00    0.00  14208.00   0.00      8.53     30.54   9.17   0.30  100.00
avg-cpu:  %user   %nice  %system  %iowait   %idle
           0.20    0.00     1.40    98.40    0.00
What surprises me is that the service time for the requests in DOM0 decreases
dramatically when readahead is increased from 256 to 512 sectors. If the output
of iostat is reliable, it tells me that requests in DOMU are assembled to about
8 to 10 sectors in size, while DOM0 merges them into about 200 sectors or even
more.
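Just as a sanity check of that reading of the numbers (plain shell arithmetic
on the figures above):

    # average request size = sectors read per second / read requests per second
    echo "scale=2; 115647.80 / 592.40"   | bc   # DOM0 hde,  readahead 512s -> ~195 sectors
    echo "scale=2; 113868.80 / 11571.00" | bc   # DOMU hde1, readahead 512s -> ~9.8 sectors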
Using a readahead of 256 sectors results in an average queue size of about 1,
while changing readahead to 512 sectors results in an average queue size of
slightly above 2 on DOM0. Service times in DOM0 with a readahead of 256 sectors
seem to be in the range of the typical seek time of a modern IDE disk, while
they are significantly lower with a readahead of 512 sectors.
As I have mentioned, this is the system with only one installed disk; that is
what causes the write activity on the disk. The two write requests per second
go to a different partition, and those result in four required seeks per
second. This should not be a reason for all requests to take roughly a seek
time as their service time.
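To spell out the arithmetic behind that (using the svctm and avgrq-sz figures
from the runs above; /2 converts sectors/s to KB/s, /1024 to MB/s):

    # readahead 256s: ~7.85 ms per request limits the disk to ~127 requests/s;
    # at ~223.53 sectors (~112 KB) per request that is ~14 MB/s:
    echo "scale=1; (1000 / 7.85) * 223.53 / 2 / 1024" | bc
    # readahead 512s: ~1.68 ms per request allows ~595 requests/s;
    # at ~194.99 sectors per request that is ~56 MB/s:
    echo "scale=1; (1000 / 1.68) * 194.99 / 2 / 1024" | bc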
I have done a number of further tests on various systems. In most cases I
failed to achieve service times below 8 msec in DOM0; the only counterexample
is reported above. It seems to me that at low readahead values the amount of
data requested from disk per request is simply the readahead amount. Each such
request takes about one seek time, and thus I get lower performance when I
work with small readahead values.
What I do not understand at all is why throughput collapses with large
readahead sizes.
I found in mm/readahead.c that the readahead size for a file is adjusted if
readahead is not efficient. I suspect that this mechanism might lead to
readahead being switched off for this file.
With readahead set to 2048 sectors, the product of avgqu-sz and avgrq-sz
reported by iostat drops to 4 to 5 physical pages.
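In other words, assuming 4 KB pages (8 sectors of 512 bytes each):

    echo "4 * 8"    | bc   # 4 pages in flight is only ~32 sectors (~16 KB)
    echo "2048 / 8" | bc   # while the 2048-sector readahead window is 256 pages (1 MB)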
Peter
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel