Re: [Xen-devel] [patch] barrier support for blk{front,back}



Ian Pratt wrote:
>> This patch adds support for barriers to blk{back,front} drivers.
> 
> It's good to see barrier support added.
> 
> Out of interest, what was your motivation for adding it?

Trying to fix some problems of loop-file backed virtual block devices.
For SLES10 we have a patch which adds a synchronous mode to the loop
driver (by opening the file with O_SYNC).  It solves the problem of loop
doing too much buffering and screwing up journaling filesystems, but is
dead slow.  When using barriers instead the performance should become
better without the risk of killing the filesystem by ignoring write
ordering.  There is also a patch in the queue (for mainline) which adds
barrier support to loop devices, attached below for reference.
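
To illustrate the trade-off in userspace terms, here is a rough sketch
(not the SLES10 patch and not the loop driver code).  It assumes a toy
"journal" made of one record followed by one commit marker; the names
journal_write(), journal_commit_osync() and journal_commit_barrier()
are made up for this example.  The O_SYNC variant waits for stable
storage on every write, while the barrier-like variant leaves writes
buffered and only flushes at the points where ordering matters.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying after short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: write a record, then a commit marker.
 * extra_flags is OR-ed into the open() flags (e.g. O_SYNC);
 * flush_at_barrier selects an explicit fsync() at each ordering point.
 */
static int journal_write(const char *path, int extra_flags,
                         int flush_at_barrier,
                         const char *record, const char *commit)
{
    int fd, ret = -1;

    assert(path && record && commit);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra_flags, 0600);
    if (fd < 0)
        return -1;

    if (write_all(fd, record, strlen(record)) < 0)
        goto out;
    /* Ordering point: the record must be stable before the commit. */
    if (flush_at_barrier && fsync(fd) < 0)
        goto out;
    if (write_all(fd, commit, strlen(commit)) < 0)
        goto out;
    /* Second ordering point: the commit itself must reach the disk. */
    if (flush_at_barrier && fsync(fd) < 0)
        goto out;
    ret = 0;
out:
    if (close(fd) < 0)
        ret = -1;
    return ret;
}

/* Synchronous mode: every write() waits for stable storage. */
int journal_commit_osync(const char *path, const char *record,
                         const char *commit)
{
    return journal_write(path, O_SYNC, 0, record, commit);
}

/* Barrier-like mode: buffered writes, flushed only at ordering points. */
int journal_commit_barrier(const char *path, const char *record,
                           const char *commit)
{
    return journal_write(path, 0, 1, record, commit);
}
```

Both variants return 0 on success and -1 on error.  With only two
writes the difference is invisible; the barrier-like variant pays off
when many buffered writes sit between two ordering points.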

> Which file systems use it, and do you see a worthwhile performance
> gain from the extra disk scheduling flexibility?

All journaling filesystems should be able to use them.  ext3 and
reiserfs do for sure, although barriers are not enabled by default; you
need the barrier=1 (ext3) and barrier=flush (reiserfs) mount options.  I
don't know what xfs and jfs are doing by default.

No benchmarks yet, sorry.  I finished the patch just the day before the
summit on my notebook, which is way too slow for serious performance
tests.  Besides that I simply had no time yet.  I can run some next week.

> We are going to have to think through what the impact of this would
> be in the live relocation block safety optimizations Andy Warfield 
> described at the summit. The simple thing is just to revert to
> stalling until the backend gives the all clear if there's a barrier
> in the queue.

Hmm, yes, the frontend driver had better make sure that there isn't
a barrier request in flight.  Doing that should also reduce the risk of
corrupting the filesystem in the (already unlikely) case that the writes
on the host the machine is migrated from end up on disk after the
ones resubmitted from the host the machine is migrated to.
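
The check discussed here could be sketched roughly as follows.  This is
only an illustration, not code from blkfront: struct inflight_state and
the three functions are hypothetical names, and the sketch assumes the
frontend can tell barrier requests apart from ordinary ones at submit
and completion time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for requests not yet acked by the backend. */
struct inflight_state {
    unsigned int requests;  /* ordinary reads and writes in flight */
    unsigned int barriers;  /* barrier requests in flight */
};

/* Account for a request handed to the backend. */
void inflight_submit(struct inflight_state *s, bool is_barrier)
{
    if (is_barrier)
        s->barriers++;
    else
        s->requests++;
}

/* Account for a completion reported by the backend. */
void inflight_complete(struct inflight_state *s, bool is_barrier)
{
    if (is_barrier) {
        assert(s->barriers > 0);
        s->barriers--;
    } else {
        assert(s->requests > 0);
        s->requests--;
    }
}

/*
 * Simple policy: if any barrier is still in flight, stall until the
 * backend gives the all clear instead of resubmitting on the new host.
 */
bool must_stall_before_migration(const struct inflight_state *s)
{
    return s->barriers > 0;
}
```

Ordinary requests in flight do not force a stall in this sketch; only a
pending barrier does.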

jetlagged greetings from europe,

  Gerd

-- 
Gerd Hoffmann <kraxel@xxxxxxx>
--- linux-2.6.16/drivers/block/loop.c~  2006-06-29 13:22:37.000000000 +0200
+++ linux-2.6.16/drivers/block/loop.c   2006-06-29 13:28:17.000000000 +0200
@@ -467,16 +467,58 @@
        return ret;
 }
 
+/*
+ * This is best effort. We really wouldn't know what to do with a returned
+ * error. This code is taken from the implementation of fsync.
+ */
+static int sync_file(struct file * file)  
+{
+       struct address_space *mapping;
+       int ret;
+
+       if (!file->f_op || !file->f_op->fsync)
+               return -EOPNOTSUPP;
+
+       mapping = file->f_mapping;
+
+       ret = filemap_fdatawrite(mapping);
+       if (!ret) {
+               /*
+                * We need to protect against concurrent writers,
+                * which could cause livelocks in fsync_buffers_list
+                */
+               mutex_lock(&mapping->host->i_mutex);
+               ret = file->f_op->fsync(file, file->f_dentry, 1);
+               mutex_unlock(&mapping->host->i_mutex);
+
+               filemap_fdatawait(mapping);
+       }
+
+       return ret;
+}
+
 static int do_bio_filebacked(struct loop_device *lo, struct bio *bio)
 {
        loff_t pos;
        int ret;
+       int sync = bio_sync(bio);
+       int barrier = bio_barrier(bio);
+
+       if (barrier) {
+               ret = sync_file(lo->lo_backing_file);
+               if (unlikely(ret))
+                       return ret;
+       }
 
        pos = ((loff_t) bio->bi_sector << 9) + lo->lo_offset;
        if (bio_rw(bio) == WRITE)
                ret = lo_send(lo, bio, lo->lo_blocksize, pos);
        else
                ret = lo_receive(lo, bio, lo->lo_blocksize, pos);
+
+       if ((barrier || sync) && !ret)
+               ret = sync_file(lo->lo_backing_file);
+
        return ret;
 }
 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

