| 
    
 [Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] xenfs: race condition on xenstore watch
 On 28/05/13 13:55, Konrad Rzeszutek Wilk wrote: On Wed, May 15, 2013 at 05:51:06PM +0100, Jonathan Davies wrote:Dear xen-devel, There's a race condition in xenfs (xenstore driver) that causes userspace utility xenstore-watch to crash. Normally, the userspace process gets an "OK" from xenfs and then the watch fires immediately after. Occasionally, this happens the other way around: the watch fires before the driver sends "OK", which confuses the xenstore-watch client. It seems to me that the client is within its rights to expect the "OK" first. Here's what is happening: The userspace process xenstore-watch writes to /proc/xen/xenbus with msg_type==XS_WATCH. This is handled by xenbus_write_watch which calls register_xenbus_watch with watch_fired as a callback *before* acquiring the reply_mutex and sending the synthesised "OK" reply. This gives a fast xenstore the opportunity to cause the watch_fired to run (and briefly grab the reply_mutex for itself) before the fake "OK" message is sent. Below, I've included a putative patch for pre-3.3 xenfs that fixes this problem. (It looks like the patch would also apply cleanly to 3.3-onwards xenbus_dev_frontend.c, but I haven't tried.) Any comments about whether this is a reasonable approach?It can't apply cleanly as the file moved :-( Well, it applies cleanly if you rename the affected file to drivers/xen/xenbus/xenbus_dev_frontend.c and can tolerate a 15-line offset... :-) A cursory glance at drivers/xen/xenbus/xenbus_dev.c suggests that it suffers from the same problem. Although I haven't haven't tested this, I'd expect that it requires a very similar solution. Jonathan Take the (non-reentrant) reply_mutex before calling register_xenbus_watch to prevent the watch_fired callback from writing anything until the "OK" has been sent.Should that perhaps be done inside the msg_type == XS_WATCH code (with a bool that would determine whether an reply_mutex has been taken?) As in, is there no need to take this mutex if msg_type != XS_WATCH? When msg_type != XS_WATCH, we only need to take the mutex briefly when sending the "OK". There's no issue with XS_UNWATCH because there's never anything further to be written. v2 of the patch won't hold the lock while doing the work for XS_UNWATCH. 
 No. I will give that a try. 
 Thanks for the feedback, Jonathan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel 
 
  | 
  
![]()  | 
            
         Lists.xenproject.org is hosted with RackSpace, monitoring our  |