
Re: [Xen-devel] xl shutdown --wait "racy"



Wednesday, April 23, 2014, 3:42:32 PM, you wrote:

> On Wed, 2014-04-23 at 15:38 +0200, Sander Eikelenboom wrote:
>> Thursday, April 17, 2014, 8:09:39 PM, you wrote:
>> 
>> > George Dunlap writes ("Re: [Xen-devel] xl shutdown --wait "racy""):
>> >> On Wed, Apr 16, 2014 at 3:13 PM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
>> >> > It is waiting for the domain to be shutdown (state 's') not for the
>> >> > domain to be destroyed. So it's doing what it said it would (I
>> >> > appreciate you might not find this distinction helpful under the
>> >> > circumstances...)
>> >> 
>> >> For any reasonable person's definition of "shutdown", it does *not*
>> >> wait until it's shutdown.  "In the shutdown state" is not something
>> >> anyone outside of Xen cares about: what they care about is being able
>> >> to, for example, start the domain again (or start a domain that
>> >> depends on resources currently held by the shutting down domain).
>> 
>> > Quite.  I think this is simply a bug and it should wait for the domain
>> > to be destroyed.
>> 
>> It seems the shutdown is racy in another aspect as well: on receiving an
>> event, the code in "wait_for_domain_deaths" doesn't actually check whether
>> the event relates to a domain it should wait for. It only checks a count
>> (1 if you shut down only one domain, nb_domains - 1 if you shut down all).

> It only calls libxl_evenable_domain_death for domains which it should
> wait for, and by construction the number of deaths passed to
> wait_for_domain_deaths is equal to the number of domains for which
> libxl_evenable_domain_death was called.

Also, what if I run "xl shutdown --wait" twice in parallel from a script?

That would give an event for the guest that shuts down the fastest, but
wouldn't the "wait_for_domain_deaths" for the other guest also respond to
that event (and return too early)?


>> To be less error prone, it should ideally pass an array of the domids it
>> needs to wait for through to wait_for_domain_deaths and check each event
>> against it.
>> 
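
A minimal sketch of what I mean by checking against an array of domids.
This is not the xl code: struct wait_set and both helper functions are
hypothetical names made up for illustration, and the real
wait_for_domain_deaths would have to take the domid from the event it
receives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of xl or libxl: the set of domids that
 * one "shutdown --wait" invocation is still waiting for. */
struct wait_set {
    uint32_t *domids;   /* caller-owned array of pending domids */
    size_t    pending;  /* number of entries still outstanding */
};

/* Record one death event. Returns true if domid was in the set (and
 * removes it). Returns false for an unrelated domain, so the caller can
 * ignore the event instead of counting it. */
static bool wait_set_note_death(struct wait_set *ws, uint32_t domid)
{
    assert(ws != NULL);
    for (size_t i = 0; i < ws->pending; i++) {
        if (ws->domids[i] == domid) {
            /* Move the last pending entry into the freed slot. */
            ws->domids[i] = ws->domids[ws->pending - 1];
            ws->pending--;
            return true;
        }
    }
    return false;
}

/* True once every domid in the set has been seen. */
static bool wait_set_done(const struct wait_set *ws)
{
    assert(ws != NULL);
    return ws->pending == 0;
}
```

The wait loop would then call wait_set_note_death() for each event and
only stop once wait_set_done() is true, rather than decrementing a bare
counter for every event that arrives.
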
>> > It's IMO a tolerable side-effect if this means that when the domain
>> > shuts down in a way that causes it to be preserved (ie the daemonic xl
>> > doesn't reap it) xl shutdown -w gets stuck.  (There should be an
>> > option to restore the former behaviour but it should not be the
>> > default.)
>> 
>> > Ideally xl would record something somewhere so that it would know
>> > what's going on and could make "xl shutdown -w" fail in the default
>> > case if it's going to wait "forever".
>> 
>> Isn't there a generic timeout function that could raise a timeout event
>> when it is taking too long?

> Nope. I suppose you could use alarm(2) or something but that would be
> getting pretty hairy.
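
For reference, a rough sketch of what an alarm(2) based timeout around a
blocking call could look like. It is only an illustration under
assumptions: a plain read() stands in for whatever blocking wait xl
performs, read_byte_with_timeout and on_alarm are hypothetical names, and
the sketch does not show whether libxl's own wait would be interrupted
the same way.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t timed_out;

/* SIGALRM handler: only sets a flag, which is async-signal-safe. */
static void on_alarm(int sig)
{
    (void)sig;
    timed_out = 1;
}

/* Hypothetical helper: block reading one byte from fd, giving up after
 * `seconds`. Returns 1 if a byte arrived, 0 on timeout, -1 on error or
 * end of file. */
static int read_byte_with_timeout(int fd, unsigned seconds)
{
    struct sigaction sa, old;
    char byte;
    ssize_t n;
    int saved_errno;

    assert(seconds > 0);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: read() fails with EINTR */
    if (sigaction(SIGALRM, &sa, &old) < 0)
        return -1;

    timed_out = 0;
    alarm(seconds);             /* arm the timer */
    n = read(fd, &byte, 1);     /* the blocking wait */
    saved_errno = errno;
    alarm(0);                   /* disarm the timer */
    sigaction(SIGALRM, &old, NULL);

    if (n == 1)
        return 1;
    if (n < 0 && saved_errno == EINTR && timed_out)
        return 0;
    return -1;
}
```

Part of the hairiness is visible even here: SIGALRM and its handler are
process-wide, and if the alarm fires before read() is entered, the call
blocks with no timer left to interrupt it.
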

> Ian.






 

