
Re: [Xen-devel] Shouldn't backend devices for VMX domain disks be opened with O_DIRECT?

Philip R. Auld wrote:


Rumor has it that on Thu, Feb 02, 2006 at 04:28:37PM -0600 Steve Dobbelstein wrote:
aliguori@xxxxxxxxxxxxxxxxxxxxxxx wrote on 02/02/2006 03:46:11 PM:

I would doubt it.  Since it's usually opening a file, and qemu-dm is
emulating a contiguous disk, you probably want the buffer cache to
reorder events.

I guess we're not usual since our backend is an LVM volume. :)

I can appreciate how writing to the buffer cache can speed up the response
to the I/O and make it more efficient in its writing to the backend device
by reordering events.  However, I'm still wondering if we have a data
corruption issue should dom0 crash before it writes the data in the buffer
cache to disk, data that the domain expects to be on the disk but won't be
there when the domain is restarted.
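
For readers who want to see the two behaviours side by side, here is a minimal
sketch (not qemu-dm's actual code) of opening a backend so that writes bypass
the dom0 page cache. The names open_backend, write_block and BLOCK are
hypothetical, the 4096-byte alignment is an assumption, and the fallback to
O_DSYNC is only there so the sketch still runs on filesystems that reject
O_DIRECT.

```python
import errno
import mmap
import os

BLOCK = 4096  # assumed alignment; real devices report their own logical block size


def open_backend(path):
    """Open path for read/write, bypassing the page cache when possible.

    Returns (fd, mode), where mode is "direct" or "dsync".
    """
    base = os.O_RDWR | os.O_CREAT
    direct = getattr(os, "O_DIRECT", 0)  # Linux-specific flag
    if direct:
        try:
            return os.open(path, base | direct, 0o600), "direct"
        except OSError as exc:
            # Some filesystems refuse O_DIRECT with EINVAL.
            if exc.errno != errno.EINVAL:
                raise
    # Fallback: writes still go through the cache, but each write call
    # returns only after the data has been flushed to the device.
    sync_flag = getattr(os, "O_DSYNC", os.O_SYNC)
    return os.open(path, base | sync_flag, 0o600), "dsync"


def write_block(fd, block_no, payload):
    """Write payload, zero-padded to one block, at the given block number."""
    if len(payload) > BLOCK:
        raise ValueError("payload larger than one block")
    # An anonymous mmap is page-aligned, which satisfies the buffer
    # alignment that O_DIRECT requires.
    buf = mmap.mmap(-1, BLOCK)
    try:
        buf.write(payload)  # the rest of the buffer stays zero-filled
        written = os.pwrite(fd, buf, block_no * BLOCK)
        if written != BLOCK:
            raise OSError(errno.EIO, "short write")
    finally:
        buf.close()
```

With a plain buffered open, the same write could return as soon as the data is
in dom0's memory, which is the window the question above is about.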

I agree. It sounds like a correctness problem. It's just like disks
with write caching enabled.
Referring to the original question, which has been quoted away, journaling doesn't require that data be written to disk per se, only that writes occur in a particular order. A journal is always recoverable provided that writes occur in the expected order. A buffer cache will have no effect on that order, so you're no more likely to have corruption than if you disabled the buffer cache.
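
The ordering property can be illustrated with a toy journal, sketched here
under stated assumptions: the BEGIN/COMMIT record format and the function names
are invented for illustration and do not correspond to any real filesystem's
journal, and records are assumed to contain no newlines. Each fsync acts as the
ordering point, so a COMMIT marker can never reach the disk ahead of the record
it covers.

```python
import os


def append_durably(fd, data):
    """Write all of data, then block until the kernel reports it flushed."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]
    os.fsync(fd)  # ordering point: later writes are issued only after this returns


def journaled_update(journal_path, record):
    """Append a record, then its commit marker, in that order."""
    fd = os.open(journal_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        append_durably(fd, b"BEGIN " + record + b"\n")
        append_durably(fd, b"COMMIT\n")
    finally:
        os.close(fd)


def committed_records(journal_path):
    """Recovery: return only the records that are followed by a COMMIT marker."""
    records = []
    pending = None
    with open(journal_path, "rb") as f:
        for line in f.read().split(b"\n"):
            if line.startswith(b"BEGIN "):
                pending = line[len(b"BEGIN "):]
            elif line == b"COMMIT" and pending is not None:
                records.append(pending)
                pending = None
    return records
```

A crash between the two writes leaves a BEGIN without a COMMIT, and recovery
simply ignores it. The record is lost but the journal stays consistent.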

You especially want the buffer cache if you have LVM partitions. Sectors on an LVM disk are not necessarily contiguous and can even span multiple disks. You definitely want the IO scheduler involved there.

If anything, what you really want (from a performance perspective) is to disable the buffer cache in the domU and leave it enabled in the dom0 (this is what the paravirtual drivers should be doing IIRC).

Does this address your corruption concerns?


Anthony Liguori

Are you seeing a performance improvement?  Should be easy to check.

It's more about correctness and data integrity than performance.



We just started doing the first runs of disk performance tests when we
noticed this behavior and thought we should bring it up on the list.  We
don't have enough data points to compare yet.  We'll post problems/issues
if/when we find them.

Steve D.

Xen-devel mailing list
