[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/2] libxl/devd: fix a race with concurrent device addition/removal



On Wed, May 10, 2017 at 11:43:57AM +0100, Roger Pau Monne wrote:
> On Wed, May 10, 2017 at 11:32:45AM +0100, Ian Jackson wrote:
> > Roger Pau Monne writes ("[PATCH 1/2] libxl/devd: fix a race with concurrent 
> > device addition/removal"):
> > > Current code can free the libxl__device inside of the 
> > > libxl__ddomain_device
> > > before the addition has finished if a removal happens while an addition is
> > > still in process:
> > > 
> > >   backend_watch_callback
> > >             |
> > >             v
> > >        add_device
> > >             |                 backend_watch_callback
> > >     (async operation)                   |
> > >             |                           v
> > >             |                     remove_device
> > >             |                           |
> > >             |                           V
> > >             |                    device_complete
> > >             |                 (free libxl__device)
> > >             v
> > >      device_complete
> > >   (deref libxl__device)
> > > 
> > > Fix this by creating a temporary copy of the libxl__device, that's
> > > tracked by the GC of the nested async operation. This ensures that
> > > the libxl__device used by the async operations cannot be freed while
> > > being used.
> > 
> > Doesn't this arrange that the remove hotplug script will be invoked
> > while the add hotplug script is still running ?
> 
> That's indeed possible (either with the current code or with this patch),
> although unlikely. The async code called by remove_device will wait for the
> backend to switch to state 6, while the add_device code will wait for state 2
> IIRC (one can change those states to make them clash probably).
> 
> > Is that really desirable (or allowed!) ?
> 
> Hm, no, I don't think it's desirable at all. I still think this is better that
> the previous code (at lest it doesn't dereference libxl__device anymore), but
> clearly needs further improvements.
> 
> Also, it seems to me the same can happen even without driver domains, if a 
> user
> executes concurrent block-{attach/detach} operations, but maybe I'm missing
> something?
> 

There is a lot of locking for all the device add / remove code. See
libxl_internal.h L2588.

> Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.