
Re: [Xen-devel] [PATCH 4 of 5 V3] tools/libxl: Control network buffering in remus callbacks [and 1 more messages] [and 1 more messages]



On Tue, Nov 12, 2013 at 10:38 AM, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> wrote:
Shriram Rajagopalan writes ("Re: [PATCH 4 of 5 V3] tools/libxl: Control network buffering in remus callbacks [and 1 more messages] [and 1 more messages]"):
> The nested-ao patch makes sense for Remus, even without fixing this
> timeout issue.  I can modify my stuff accordingly. Probably create a
> nested-ao per iteration and drop it at the start of the next
> iteration.

Right.  Great.

> However, the timeout part is not convincing enough. For example,
> libxl__domain_suspend_common_callback [the version before your patch]
> has two 6-second wait loops in the worst case.
...
>  LOG(DEBUG, "wait for the guest to acknowledge suspend request");
>         watchdog = 60;
>         while (!strcmp(state, "suspend") && watchdog > 0) {
>             usleep(100000);
...
> and then once again
...
>         usleep(100000);

Oh dear.  That is very poor.

> Now I know where the 200ms overhead per checkpoint comes from.
>
> Shouldn't this also be made into an event loop?  Irrespective of
> whether it is invoked in Remus' context or normal
> suspend/resume/save/restore/migrate context.

Yes, you are entirely correct.

Both of these loops should be replaced with timeout/event/callback
approaches.
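
To illustrate the general idea, here is a minimal sketch that waits on a file descriptor with a timeout instead of sleeping in fixed 100ms steps. It is not libxl code: wait_for_event is a hypothetical helper, and the use of plain poll() on a notification fd is an assumption made for the example. A real fix would go through libxl's own event machinery rather than blocking the caller.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, not a libxl function.
 * Blocks until fd becomes readable or timeout_ms elapses.
 * Returns 1 if an event arrived, 0 on timeout, -1 on error. */
static int wait_for_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;
    char c;

    assert(fd >= 0);

    /* Sketch only: a retry after EINTR restarts the full timeout. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;   /* timed out, nothing happened */

    /* Consume one notification byte so the next wait blocks again. */
    if (read(fd, &c, 1) != 1)
        return -1;
    return 1;
}
```

Unlike the usleep loop, the waiter wakes as soon as the notification arrives, so the wait adds no fixed 100ms granularity, and the timeout is stated once rather than as a watchdog counter.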

Do you want to attempt this, or would you like me to do it?

I can take a crack at it.

Thanks
Shriram

 

