[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC] libxl: use libxl_event_wait to process libxl events



On 11/23/2015 04:24 AM, Ian Jackson wrote:
> Jim Fehlig writes ("[PATCH RFC] libxl: use libxl_event_wait to process libxl 
> events"):
>> Prior to this patch, libxl events were delivered to libvirt via
>> the libxlDomainEventHandler callback registered with libxl.
>> Documenation in $xensrc/tools/libxl/libxl_event.h states that the
>> callback "may occur on any thread in which the application calls
>> libxl". This can result in deadlock since many of the libvirt
>> callees of libxl hold a lock on the virDomainObj they are working
>> on. When the callback is invoked, it attempts to find a virDomainObj
>> corresponding to the domain ID provided by libxl. Searching the
>> domain obj list results in locking each obj before checking if it is
>> active, and its ID equals the requested ID. Deadlock is possible
>> when attempting to lock an obj that is already locked further up
>> the call stack. Indeed, Max Ustermann recently reported an instance
>> of this deadlock
> This sounds like a very plausible failure mode.
>
>> https://www.redhat.com/archives/libvir-list/2015-November/msg00130.html
>>
>> This patch moves processing of libxl events to a thread, where
>> libxl_event_wait() is used to collect events. This allows processing
>> libxl events asynchronously in libvirt, avoiding the deadlock.
> The reasoning is sound and the remedy is IMO correct.  (However, you
> mean "this allows processing libxl events _synchronously_", since what
> you are doing is serialising them all into your main loop.)

Ah yes, poor choice of words. The last sentence should read:

This approach gives libvirt more control over event processing, ensuring object
locking constraints can be met.

But your comment exposes a flaw in the patch, one that had already been fixed in
the event_occurs approach. Shutting down a large memory domain can take
considerable time due to memory scrubbing, meanwhile stalling reception of other
events. The patch was a bit aggressive in removing libxlDomainShutdownThread().

>
> So:
>
> Acked-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
>
> NB that I have not reviewed the patch in detail.  I can do that if you
> like, although of course my knowledge of the innards of libvirt is not
> wonderful.
>
>> The only reservations I have about this patch come from comments
>> about libxl_event_wait in libxl_event.h
>>
>>   Like libxl_event_check but blocks if no suitable events are
>>   available, until some are.  Uses libxl_osevent_beforepoll/
>>   _afterpoll so may be inefficient if very many domains are being
>>   handled by a single program.
> If this turns out to be a problem in practice, we will improve libxl's
> data structures to not involve so many linear searches.  (In fact I
> think you are probably already calling synchronous libxl ao functions,
> which have the same performance properties, although this is not
> documented.)
>
> Given what you say above I don't think there is a reasonable
> alternative remedy.  So you should go ahead with this patch.

Thanks for your comments and ACK'ing the change . I'll submit a V2 that retains
handling of shutdown event in a thread.

Regards,
Jim


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.