This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Shouldn't backend devices for VMX domain disks be opened with O_DIRECT?

To: Stephen Tweedie <sct@xxxxxxxxxx>
Subject: Re: [Xen-devel] Shouldn't backend devices for VMX domain disks be opened with O_DIRECT?
From: Anthony Liguori <aliguori@xxxxxxxxxx>
Date: Thu, 02 Feb 2006 20:50:28 -0600
Cc: Steve Dobbelstein <steved@xxxxxxxxxx>, "Philip R. Auld" <pauld@xxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Fri, 03 Feb 2006 03:00:44 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1138934528.4374.13.camel@xxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <43E27DA3.80405@xxxxxxxxxx> <OF4FC3AD2A.9B8EA7AB-ON06257109.007A4F76-06257109.007B7876@xxxxxxxxxx> <20060202224106.GC17266@xxxxxxxxxxxxxxxxxx> <43E29F27.10009@xxxxxxxxxx> <1138934528.4374.13.camel@xxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla Thunderbird 1.0.7 (X11/20051013)
Stephen Tweedie wrote:


> On Thu, 2006-02-02 at 18:09 -0600, Anthony Liguori wrote:
>
>> Referring to the original question, which has been quoted away, journaling doesn't require that data be written to disk per se, but that writes occur in a particular order. A journal is always recoverable provided that writes occur in the expected order.

> Sure... it's *internally* consistent, maybe.  But you need more than
> that.  You need guarantees that things are on disk, else external
> consistency guarantees will be broken.

Ok, this is certainly correct (but not the original point).

> Consider things like sendmail fsync()ing a spool file before telling the
> sender that the email has been accepted.  After that acknowledgement,
> the sender can delete the mail from its queues knowing that the
> recipient MTA definitely has the data, and even if it crashes, the mail
> won't be lost.  Databases frequently have similar consistency
> requirements.  If a power failure loses writes that you have told the
> domU have completed --- even if you maintain write ordering --- then you
> *are* putting application correctness at risk, there's no doubt about

Ok, this is a good argument for using O_SYNC.
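
For illustration, the fsync()-before-acknowledge pattern described above can be sketched roughly as follows. This is a minimal sketch, not sendmail's actual code: the function names, the "250 OK" reply string, and the file mode are all hypothetical, and it does not fsync() the containing directory, which a real MTA would also need to consider for a newly created file.

```python
import os


def spool_message(path: str, data: bytes) -> str:
    """Write data to path and acknowledge only after fsync() has returned.

    Hypothetical helper: if the write or the fsync() fails, an exception
    propagates and no acknowledgement is returned.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file's data to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)
    # Only now is it safe to tell the sender the message was accepted.
    return "250 OK"


def spool_message_osync(path: str, data: bytes) -> str:
    """Same idea, but with O_SYNC so each write() is itself synchronous."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)
    return "250 OK"
```

In both variants the acknowledgement is only produced after the kernel has reported the data as written through, which is the external consistency guarantee being discussed; whether the disk's own write cache honours that is a separate matter.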

> Fortunately, that's just what blkback is doing --- it's using submit_bio
> to submit the write IOs without waiting for completion, and is using the
> bio's bi_end_io callback to process the IO completion once it is hard on

Yup, the question here is about the device model, which doesn't use the block frontend/backend. Would O_DIRECT be more helpful than O_SYNC?
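
To make the two flags concrete, here is a rough sketch of how a userspace process might open a disk image with O_SYNC alone or with O_SYNC plus O_DIRECT. It is not the device model's code; write_image, aligned_buffer, and the 4096-byte BLOCK value are assumptions made for illustration. On Linux, O_DIRECT bypasses the page cache and generally requires aligned buffers and transfer sizes, and on its own it does not give the guarantees of O_SYNC, so the sketch keeps O_SYNC set in both cases.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on the device


def aligned_buffer(data: bytes) -> mmap.mmap:
    """Copy data into a page-aligned, zero-padded buffer of whole blocks."""
    size = max(BLOCK, -(-len(data) // BLOCK) * BLOCK)  # round up to BLOCK
    buf = mmap.mmap(-1, size)  # anonymous mappings start on a page boundary
    buf.write(data)
    return buf


def write_image(path: str, data: bytes, direct: bool = False) -> int:
    """Write data (padded to whole blocks) synchronously; return bytes written."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_SYNC
    if direct:
        # Linux-specific; some filesystems reject O_DIRECT with EINVAL.
        flags |= os.O_DIRECT
    buf = aligned_buffer(data)
    fd = os.open(path, flags, 0o600)
    try:
        return os.write(fd, buf)
    finally:
        os.close(fd)
        buf.close()
```

With direct=False the write goes through the page cache but still does not return until it has been written through; with direct=True the page cache is skipped as well, so the host does not keep a second cached copy of the guest's data.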


Anthony Liguori

