I've attached the event counting patch. It should make sense, but it was
written against XCP, so it may not apply entirely smoothly for you. If you
really want to play with it, and manage to find more interesting
storage/system/patch combinations than the one quoted below, that would
probably be really useful.
Daniel
On Wed, 2011-06-01 at 13:49 -0400, Daniel Stodden wrote:
> >
> > I think it would make sense to enhance the blkfront design to increment
> > req_prod as soon as it processes an I/O, while batching irq generation.
> > When blkfront and blkback are busy processing a continuous stream of I/O
> > requests, it would be great if the blkfront-blkback pipeline were able to
> > process them without generating unnecessary interrupts, while also
> > improving I/O latency.
>
>
> > Thoughts? Any historical context on why this might
> > be bad?
>
> Not that I know of. It's a good idea, and your assumptions are imho correct.
>
> The extreme case is a PUSH_REQUESTS_AND_CHECK_NOTIFY after each
> request. Even in this case, the majority of interrupts should be held
> off by looking at req_event. There are certainly variants, but I can
> show you some results for what you're asking, because I happened to try
> exactly that just last week.
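>
> For reference, the aggressive variant boils down to something like the
> sketch below. Treat it as a sketch only -- the names follow the Linux
> blkfront of that era rather than the literal patch:
>
>     int notify;
>
>     /* after placing a single request on the ring */
>     info->ring.req_prod_pvt++;
>
>     /* publish req_prod right away instead of once per batch; the
>      * macro compares against the backend's req_event, so notify is
>      * only set when the backend has asked to be woken */
>     RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&info->ring, notify);
>     if (notify)
>         notify_remote_via_irq(info->irq);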
>
> Next, consider adding a couple of event counters in the backend. And you
> need the patch above, of course.
>
> req_event: Frontend event received
> rsp_event: Frontend notification sent
> req_again: FINAL_CHECK indicated more_to_do.
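>
> To give a rough idea where those are counted (the field names here are
> hypothetical, the attached patch is the authoritative version):
>
>     /* whenever the frontend's event irq wakes the dispatcher */
>     blkif->st_req_event++;
>
>     /* after queuing responses */
>     RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&blkif->blk_rings.common, notify);
>     if (notify) {
>         blkif->st_rsp_event++;          /* notification actually sent */
>         notify_remote_via_irq(blkif->irq);
>     }
>
>     /* at the end of a dispatch pass */
>     RING_FINAL_CHECK_FOR_REQUESTS(&blkif->blk_rings.common, more_to_do);
>     if (more_to_do)
>         blkif->st_req_again++;          /* final check found more work */
>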
> boost-1, order=0:
>
> (This is the unmodified version)
>
> dd if=/dev/zero of=/dev/xvdc bs=1M count=1024
> 1073741824 bytes (1.1 GB) copied, 17.0105 s, 63.1 MB/s
> 1073741824 bytes (1.1 GB) copied, 17.7566 s, 60.5 MB/s
> 1073741824 bytes (1.1 GB) copied, 17.163 s, 62.6 MB/s
>
> rsp_event 6759
> req_event 6753
> req_again 16
>
> boost-2, order=0:
>
> (This was the aggressive one, one PUSH_NOTIFY per ring request).
>
> dd if=/dev/zero of=/dev/xvdc bs=1M count=1024
> 1073741824 bytes (1.1 GB) copied, 17.3208 s, 62.0 MB/s
> 1073741824 bytes (1.1 GB) copied, 17.4851 s, 61.4 MB/s
> 1073741824 bytes (1.1 GB) copied, 17.7333 s, 60.5 MB/s
>
> rsp_event 7122
> req_event 7141
> req_again 5497
>
>
> So the result is that even in the most aggressive case, the event load
> typically increases only moderately. Instead, the restored outer loop in
> the dispatcher just starts to play out.
>
> I'm not proposing this as the final solution; we might choose to be more
> careful and limit event emission to some stride instead.
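>
> Something along those lines might look like the following (entirely
> hypothetical, not part of the attached patch; 'unpushed' and the stride
> value are made up):
>
>     #define PUSH_STRIDE 4
>
>     /* inside the request submission loop */
>     if (++info->unpushed >= PUSH_STRIDE || RING_FULL(&info->ring)) {
>         int notify;
>
>         info->unpushed = 0;
>         RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&info->ring, notify);
>         if (notify)
>             notify_remote_via_irq(info->irq);
>     }
>     /* plus the usual unconditional push once the batch ends */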
>
> Don't be confused by the throughput values not going up; the problem I
> had with the array (iSCSI in this case) just turned out to be elsewhere.
> I'm pretty convinced there are workload/storage configurations which
> would benefit from this.
>
> In the case at hand, increasing the ring size was way more productive.
> At that point the queue depth multiplies as well, and I currently expect
> that the longer the ring gets, the more pressing the issue you describe
> will become.
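>
> (Back of the envelope, assuming the classic single-page layout: a 4k
> ring holds 32 blkif requests, each carrying up to 11 4k segments, so
> roughly 1.4MB can be in flight; every additional ring order doubles
> that.)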
>
> But I also think it needs some experiments and wants to be backed by
> numbers.
>
> Daniel
blkback-final-check-stats.diff
Description: Text Data