
Re: [Xen-devel] xl shutdown --wait "racy"



On Wed, 2014-04-23 at 15:38 +0200, Sander Eikelenboom wrote:
> Thursday, April 17, 2014, 8:09:39 PM, you wrote:
> 
> > George Dunlap writes ("Re: [Xen-devel] xl shutdown --wait "racy""):
> >> On Wed, Apr 16, 2014 at 3:13 PM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
> >> > It is waiting for the domain to be shut down (state 's') not for the
> >> > domain to be destroyed. So it's doing what it said it would (I
> >> > appreciate you might not find this distinction helpful under the
> >> > circumstances...)
> >> 
> >> For any reasonable person's definition of "shutdown", it does *not*
> >> wait until it's shutdown.  "In the shutdown state" is not something
> >> anyone outside of Xen cares about: what they care about is being able
> >> to, for example, start the domain again (or start a domain that
> >> depends on resources currently held by the shutting down domain).
> 
> > Quite.  I think this is simply a bug and it should wait for the domain
> > to be destroyed.
> 
> It seems the shutdown is racy in another respect as well: on receiving an
> event, the code in "wait_for_domain_deaths" doesn't actually check whether
> the event relates to a domain it should wait for. It only checks a count
> (1 if you shut down only one domain, nb_domains - 1 if you shut down all).

It only calls libxl_evenable_domain_death for domains which it should
wait for, and by construction the number of deaths passed to
wait_for_domain_deaths is equal to the number of domains for which
libxl_evenable_domain_death was called.
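
For comparison, the stricter check suggested in the quoted text below
could look roughly like the following. This is only a sketch under
stated assumptions, not the xl code: struct death_event, mark_dead and
wait_for_listed_deaths are hypothetical names, the events come from a
plain array rather than from libxl, and the 64-domain limit is
arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an event carrying the domid it refers to;
 * this is not the libxl event type. */
struct death_event {
    uint32_t domid;
};

/* Mark `domid` as dead if it is in the wait set and still pending.
 * Returns true when the event matched a domain we were waiting for. */
static bool mark_dead(const uint32_t *wait_for, bool *dead, size_t n,
                      uint32_t domid)
{
    for (size_t i = 0; i < n; i++) {
        if (wait_for[i] == domid && !dead[i]) {
            dead[i] = true;
            return true;
        }
    }
    return false;
}

/* Consume events until every domid in `wait_for` has been seen once.
 * Returns the number of events consumed, or -1 if the events ran out
 * first. Unrelated or duplicate events are skipped, not counted. */
int wait_for_listed_deaths(const uint32_t *wait_for, size_t n,
                           const struct death_event *events,
                           size_t nevents)
{
    bool dead[64] = { false };
    size_t remaining = n;

    assert(n <= 64); /* arbitrary limit for this sketch */
    for (size_t i = 0; i < nevents; i++) {
        if (remaining == 0)
            return (int)i;
        if (mark_dead(wait_for, dead, n, events[i].domid))
            remaining--;
    }
    return remaining == 0 ? (int)nevents : -1;
}
```

The only difference from a bare counter is that a stray or repeated
event does not reduce the number of domains still being waited for.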

> To be less error-prone, it should ideally pass an array of the domids it
> needs to wait for through to wait_for_domain_deaths and check events
> against it.
> 
> > It's IMO a tolerable side-effect if this means that when the domain
> > shuts down in a way that causes it to be preserved (ie the daemonic xl
> > doesn't reap it) xl shutdown -w gets stuck.  (There should be an
> > option to restore the former behaviour but it should not be the
> > default.)
> 
> > Ideally xl would record something somewhere so that it would know
> > what's going on and could make "xl shutdown -w" fail in the default
> > case if it's going to wait "forever".
> 
> Isn't there a generic timeout function that could raise a timeout event
> when it is taking too long?

Nope. I suppose you could use alarm(2) or something but that would be
getting pretty hairy.
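
To give an idea of what the alarm(2) route involves, here is a rough
sketch. It is not xl code: read_byte_with_alarm and on_alarm are
hypothetical names, and a blocking read() on a file descriptor stands
in for the blocking wait for an event.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t timed_out;

static void on_alarm(int sig)
{
    (void)sig;
    timed_out = 1;
}

/* Block reading one byte from fd, giving up after `secs` seconds.
 * Returns 1 if a byte arrived, 0 on EOF, -1 on timeout or error
 * (errno is ETIMEDOUT for a timeout). */
int read_byte_with_alarm(int fd, unsigned int secs)
{
    struct sigaction sa, old;
    char c;
    ssize_t r;
    int saved_errno;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0; /* no SA_RESTART, so read() fails with EINTR */
    if (sigaction(SIGALRM, &sa, &old) < 0)
        return -1;

    timed_out = 0;
    alarm(secs);
    do {
        r = read(fd, &c, 1);
    } while (r < 0 && errno == EINTR && !timed_out);
    saved_errno = errno;

    /* Cancel any pending alarm and restore the previous handler. */
    alarm(0);
    sigaction(SIGALRM, &old, NULL);

    if (r < 0 && timed_out) {
        errno = ETIMEDOUT;
        return -1;
    }
    errno = saved_errno;
    return r < 0 ? -1 : (int)r;
}
```

Even this small version takes over the process-wide SIGALRM handler and
the single alarm timer, and it can still block forever if the signal
arrives between alarm() and the start of read().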

Ian.

