
Re: [Xen-devel] IO speed limited by size of IO request (for RBD driver)




On 22 May 2013, at 21:13, "Konrad Rzeszutek Wilk" <konrad.wilk@xxxxxxxxxx> 
wrote:

> On Wed, May 08, 2013 at 11:14:26AM +0000, Felipe Franciosi wrote:
>> Even though we didn't "prove" it properly, I think it is worth mentioning 
>> that this boils down to what we originally thought it was:
>> Steven's environment is writing to a filesystem in the guest. On top of 
>> that, it's using the guest's buffer cache to do the writes.
> 
> If he is using O_DIRECT it bypasses the cache in the guest.

Certainly, but the issues were when _not_ using O_DIRECT.
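
For context, a minimal sketch (not from the thread) of what the dd runs below
boil down to; the file name and sizes simply mirror the dd commands, and the
buffer alignment is the usual requirement for O_DIRECT:

/* Buffered vs. direct writes: with O_DIRECT each write() bypasses the
 * guest page cache and reaches blkfront more or less as issued; without
 * it, the guest's buffer cache and filesystem decide how the writes are
 * batched and flushed. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    size_t len = 1 << 20;                  /* 1 MiB blocks, as in dd bs=1M */
    void *buf;

    /* O_DIRECT requires aligned buffers, offsets and lengths. */
    if (posix_memalign(&buf, 4096, len))
        return 1;
    memset(buf, 0, len);

    int fd = open("output.zero", O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0)
        return 1;

    for (int i = 0; i < 2048; i++)         /* 2048 x 1 MiB = 2 GiB */
        if (write(fd, buf, len) != (ssize_t)len)
            return 1;

    close(fd);
    free(buf);
    return 0;
}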

F


> 
>> This means that we cannot (easily?) control how the cache and the fs are 
>> flushing these writes through blkfront/blkback.
>> 
>> In other words, it's very likely that it generates a workload that simply 
>> doesn't perform well on the "stock" PV protocol.
>> This is a good example of how indirect descriptors help (remember that Roger 
>> and I were struggling to find use cases where indirect descriptors showed a 
>> substantial gain).
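
For background, the gain from indirect descriptors comes from moving the
segment list out of the fixed-size ring slot. A very rough sketch of the idea
(field layout approximated, not copied verbatim from the Xen public headers):

#include <stdint.h>

typedef uint32_t grant_ref_t;
typedef uint64_t blkif_sector_t;

/* Sketch only: an indirect request carries grant references to extra
 * pages that hold the segment descriptors, instead of embedding the
 * segment array in the ring slot itself, so a single request can
 * describe far more than the stock limit of 11 segments. */
struct blkif_request_indirect_sketch {
    uint8_t         operation;      /* BLKIF_OP_INDIRECT                */
    uint8_t         indirect_op;    /* the real op: read or write       */
    uint16_t        nr_segments;    /* can be much larger than 11       */
    uint64_t        id;             /* echoed back in the response      */
    blkif_sector_t  sector_number;  /* start sector on the vbd          */
    grant_ref_t     indirect_grefs[8];  /* pages holding the segments   */
};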
>> 
>> Cheers,
>> Felipe
>> 
>> -----Original Message-----
>> From: Roger Pau Monne 
>> Sent: 08 May 2013 11:45
>> To: Steven Haigh
>> Cc: Felipe Franciosi; xen-devel@xxxxxxxxxxxxx
>> Subject: Re: IO speed limited by size of IO request (for RBD driver)
>> 
>> On 08/05/13 12:32, Steven Haigh wrote:
>>> On 8/05/2013 6:33 PM, Roger Pau Monné wrote:
>>>> On 08/05/13 10:20, Steven Haigh wrote:
>>>>> On 30/04/2013 8:07 PM, Felipe Franciosi wrote:
>>>>>> I noticed you copied your results from "dd", but I didn't see any 
>>>>>> conclusions drawn from the experiment.
>>>>>> 
>>>>>> Did I understand it wrong or now you have comparable performance on dom0 
>>>>>> and domU when using DIRECT?
>>>>>> 
>>>>>> domU:
>>>>>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
>>>>>> 2048+0 records in
>>>>>> 2048+0 records out
>>>>>> 2147483648 bytes (2.1 GB) copied, 25.4705 s, 84.3 MB/s
>>>>>> 
>>>>>> dom0:
>>>>>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
>>>>>> 2048+0 records in
>>>>>> 2048+0 records out
>>>>>> 2147483648 bytes (2.1 GB) copied, 24.8914 s, 86.3 MB/s
>>>>>> 
>>>>>> 
>>>>>> I think that if the performance differs when NOT using DIRECT, the issue 
>>>>>> must be related to the way your guest is flushing the cache. This must 
>>>>>> be generating a workload that doesn't perform well on Xen's PV protocol.
>>>>> 
>>>>> Just wondering if there is any further input on this... While DIRECT 
>>>>> writes are as good as can be expected, NON-DIRECT writes in certain 
>>>>> cases (specifically with an mdadm RAID in the Dom0) suffer about a 
>>>>> 50% loss in throughput...
>>>>> 
>>>>> The hard part is that this is the default mode of writing!
>>>> 
>>>> As another test with indirect descriptors, could you change 
>>>> xen_blkif_max_segments in xen-blkfront.c to 128 (it is 32 by 
>>>> default), recompile the DomU kernel and see if that helps?
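
(For reference, the change being asked for amounts to something like the
following in the patched drivers/block/xen-blkfront.c; exact names and
context depend on the indirect-descriptors series being tested:)

/* In drivers/block/xen-blkfront.c, which already pulls in the module
 * parameter machinery; sketch of the suggested test change: */
static unsigned int xen_blkif_max_segments = 128;   /* series default is 32 */
module_param_named(max, xen_blkif_max_segments, int, S_IRUGO);
MODULE_PARM_DESC(max, "Maximum number of segments in indirect requests");

(If the series keeps that module parameter, other values, such as the 64
requested further down, could likely be tried from the DomU kernel command
line with xen_blkfront.max=64 instead of recompiling.)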
>>> 
>>> Ok, here we go.... compiled as 3.8.0-2 with the above change. 3.8.0-2 
>>> is running on both the Dom0 and DomU.
>>> 
>>> # dd if=/dev/zero of=output.zero bs=1M count=2048
>>> 2048+0 records in
>>> 2048+0 records out
>>> 2147483648 bytes (2.1 GB) copied, 22.1703 s, 96.9 MB/s
>>> 
>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>            0.34    0.00   17.10    0.00    0.23   82.33
>>> 
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sdd             980.97 11936.47   53.11  429.78     4.00    48.77   223.81    12.75   26.10   2.11 101.79
>>> sdc             872.71 11957.87   45.98  435.67     3.55    49.30   224.71    13.77   28.43   2.11 101.49
>>> sde             949.26 11981.88   51.30  429.33     3.91    48.90   225.03    21.29   43.91   2.27 109.08
>>> sdf             915.52 11968.52   48.58  428.88     3.73    48.92   225.84    21.44   44.68   2.27 108.56
>>> md2               0.00     0.00    0.00 1155.61     0.00    97.51   172.80     0.00    0.00   0.00   0.00
>>> 
>>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
>>> 2048+0 records in
>>> 2048+0 records out
>>> 2147483648 bytes (2.1 GB) copied, 25.3708 s, 84.6 MB/s
>>> 
>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>            0.11    0.00   13.92    0.00    0.22   85.75
>>> 
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sdd               0.00 13986.08    0.00  263.20     0.00    55.76   433.87     0.43    1.63   1.07  28.27
>>> sdc             202.10 13741.55    6.52  256.57     0.81    54.77   432.65     0.50    1.88   1.25  32.78
>>> sde              47.96 11437.57    1.55  261.77     0.19    45.79   357.63     0.80    3.02   1.85  48.60
>>> sdf            2233.37 11756.13   71.93  191.38     8.99    46.80   433.90     1.49    5.66   3.27  86.15
>>> md2               0.00     0.00    0.00  731.93     0.00    91.49   256.00     0.00    0.00   0.00   0.00
>>> 
>>> Now this is pretty much exactly what I would expect the system to do: 
>>> ~96 MB/s buffered and ~85 MB/s direct.
>> 
>> I'm sorry to be such a PITA, but could you also try with 64? If we have to 
>> increase the maximum number of indirect descriptors, I would like to set it 
>> to the lowest value that provides good performance, to avoid using too much 
>> memory.
>> 
>>> So it turns out that xen_blkif_max_segments at 32 is a killer in the 
>>> DomU. Now it makes me wonder what we can do about this in kernels that 
>>> don't have your series of patches applied, and also about the 
>>> backend stuff in 3.8.x etc.
>> 
>> There isn't much we can do for kernels without indirect descriptors; 
>> there's no easy way to increase the number of segments in a request.
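
For background on why that is hard: in the stock protocol the segment array
is embedded in a fixed-size request that must fit in a shared-ring slot, so
raising the limit would need a coordinated change on both the frontend and
the backend. Roughly, simplified from xen/include/public/io/blkif.h:

#include <stdint.h>

typedef uint32_t grant_ref_t;
typedef uint64_t blkif_sector_t;
typedef uint16_t blkif_vdev_t;

/* The per-request limit is fixed at 11 segments, i.e. at most
 * 11 x 4 KiB = 44 KiB of data per ring request. */
#define BLKIF_MAX_SEGMENTS_PER_REQUEST 11

struct blkif_request_segment {
    grant_ref_t gref;                  /* granted frame holding the data  */
    uint8_t     first_sect, last_sect; /* sector range within that frame  */
};

struct blkif_request {
    uint8_t        operation;          /* BLKIF_OP_READ, BLKIF_OP_WRITE.. */
    uint8_t        nr_segments;        /* at most 11                      */
    blkif_vdev_t   handle;             /* virtual block device            */
    uint64_t       id;                 /* echoed back in the response     */
    blkif_sector_t sector_number;      /* start sector on the vbd         */
    struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
};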
>> 
>> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

