On 11/25/2010 11:30 AM, Ian Jackson wrote:
> Christoph Hellwig writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH] qemu and
> qemu-xen: support empty write barriers in xen_disk"):
>> On Wed, Nov 24, 2010 at 10:18:40AM -0800, Jeremy Fitzhardinge wrote:
>>> Linux wants is a useful thing to do and implement (especially since it
>>> amounts to standardising the ?BSD extension). I'm not sure of their
>>> precise semantics (esp WRT ordering), but I think its already OK.
>> The nice bit is that a pure flush does not imply any odering at all.
>> Which is how the current qemu driver implements the barrier requests
>> anyway, so that needs some fixing.
> Thanks for your comments. Does that mean, though, that Stefano's
> patch is actually making the situation worse, or simply that it isn't
> making it as good as it should be ?
The latter. There's a question over whether WRITE_BARRIER really
supports empty barriers, since it appears that none of the existing
backends implement it correctly - but on the other hand, the kernel
blkback code does *try* to implement it, even though it fails. This
change makes empty WRITE_BARRIERS work in qemu, which is helpful because
the upstream blkfront tries to use them.

But WRITE_BARRIER is fundamentally suboptimal for Linux's needs because
it is a fully ordered barrier operation. What Linux needs is a simple
FLUSH operation which just makes sure that previously completed writes
are fully flushed out of any caches and buffers and are really on
durable storage. It has no ordering requirements, so it doesn't prevent
subsequent writes from being handled while the flush is going on.
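
To make the ordering difference concrete, here is a toy sketch in Python. It is not taken from blkback or qemu; the `dispatchable` function and the operation names as strings are hypothetical, and the model assumes a backend that is otherwise free to issue pending requests in any order. It only shows which pending requests such a backend could start right now.

```python
def dispatchable(pending):
    """Return the indices of pending requests that may be issued now.

    `pending` lists the not-yet-completed requests in submission order.
    Hypothetical model: operations are the strings "WRITE", "FLUSH" and
    "WRITE_BARRIER".
    """
    ready = []
    for index, op in enumerate(pending):
        if op == "WRITE_BARRIER":
            # Fully ordered: the barrier may start only once everything
            # submitted before it has completed, and nothing submitted
            # after it may start until the barrier itself is done.
            if index == 0:
                ready.append(index)
            break
        # A WRITE or a FLUSH imposes no ordering on its neighbours. A
        # FLUSH only covers writes that had already completed when it
        # was issued, so later requests need not wait for it.
        ready.append(index)
    return ready


if __name__ == "__main__":
    print(dispatchable(["WRITE", "WRITE_BARRIER", "WRITE"]))
    print(dispatchable(["WRITE", "FLUSH", "WRITE"]))
```

In this model the barrier stalls the queue around it, while the flush leaves all three requests eligible at once, which is the performance argument made above.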
Christoph is therefore recommending that we add a specific FLUSH
operation to the protocol with these properties so that we can achieve
the best performance. But if the backend lacks FLUSH, we still need a
reliable WRITE_BARRIER.
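
The fallback described here could be sketched roughly as below. This is only an illustration in Python: the `CacheOp` enum, the `choose_cache_op` function and the `"flush"` / `"barrier"` feature keys are made-up names, not the actual protocol fields or any existing frontend code.

```python
from enum import Enum


class CacheOp(Enum):
    """Hypothetical names for the mechanism a frontend ends up using."""
    FLUSH = "flush"
    WRITE_BARRIER = "write_barrier"
    NONE = "none"


def choose_cache_op(backend_features):
    """Pick the cheapest operation the backend claims to support.

    `backend_features` is an assumed dict of booleans, for example
    {"flush": False, "barrier": True}; the keys are illustrative only.
    """
    if backend_features.get("flush", False):
        # Preferred: no ordering requirement, so later writes can proceed.
        return CacheOp.FLUSH
    if backend_features.get("barrier", False):
        # Fallback: stronger semantics than needed, but still pushes the
        # data out, provided empty barriers actually work in the backend.
        return CacheOp.WRITE_BARRIER
    # Neither is available: the guest cannot force writes to stable storage.
    return CacheOp.NONE


if __name__ == "__main__":
    print(choose_cache_op({"flush": False, "barrier": True}))
```

The sketch trusts whatever the backend advertises, so it does not help with a backend that advertises barriers and then fails the empty ones.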
(But it would be very sad if, in practice, most backends in the
field fail empty WRITE_BARRIER operations, leaving guests with no
mechanism to force writes to stable storage. In that case I guess we'll
need to look at some other hacky thing to try and make it work...)
J
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel