
Re: [Xen-devel] [PATCH] libxl: Increase device model startup timeout to 1min.



On Tue, 30 Jun 2015, Ian Jackson wrote:
> Anthony PERARD writes ("Re: [PATCH] libxl: Increase device model startup 
> timeout to 1min."):
> > On Mon, Jun 29, 2015 at 03:51:57PM +0100, Ian Campbell wrote:
> > > Nor does it really answer Ian's question in
> > > <21901.33163.547929.321814@xxxxxxxxxxxxxxxxxxxxxxxx> I think.
> > 
> > I only know what happens, not why it happens.
> > 
> > How could I investigate why qemu is taking so long after an mmap() syscall?
> > I guess that at that time, it is copying the library into memory.
> 
> Is the machine very busy at this time ?
> 
> If this is a stress test of some kind then I guess there are three
> possible views:
> 
>   * The number and nature of parallel operations done in the stress
>     test is unreasonable for the provided hardware:
>       => the timeout is fine

I don't know if it is our place to make this call.  Should we really be
deciding what is considered "reasonable"? I think not. Defining what is
reasonable and policies that match it is not a route I think we should
take in libxl.


>   * The number and nature of parallel operations is reasonable for the
>     provided hardware, and should not result in the toolstack domain
>     being overloaded to the point where ld.so on qemu takes so long
>       => the timeout is correct and there is an underlying bug
>          (perhaps in Linux)
> 
>   * The number and nature of parallel operations is reasonable for
>     the hardware, but might easily result in long delays in ld.so
>     assembling qemu - that is, people expect to deploy Xen and then
>     cause their toolstack domain to be massively `overloaded' and
>     very slow
>       => the timeout is too short
> 
> In the third case we probably need a general-purpose facility for
> adjusting timeouts in general.  People who do not expect to overload
> their toolstack domain should not be made to await massive timeout
> values designed for people who do.

But they are not actually made to wait for any timeout: the timeout
only bounds how long we wait to see whether QEMU started. Users only
wait out the full timeout in case of errors. Is that what you meant?
In the normal case, we could have a timeout of 1h and users wouldn't
notice the difference.
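
To illustrate the point, here is a minimal sketch in Python. It is not
libxl code: `wait_for_startup` and `make_ready_after` are made-up names,
and the polling loop is only a stand-in for whatever mechanism libxl
actually uses to learn that the device model is up. The assumption is
simply that the wait ends as soon as readiness is observed, and that the
timeout is only an upper bound.

```python
import time


def wait_for_startup(is_ready, timeout_s, poll_interval_s=0.01):
    """Poll is_ready() until it returns True or timeout_s elapses.

    Returns (started, elapsed_seconds). Hypothetical helper, not a libxl API.
    """
    start = time.monotonic()
    deadline = start + timeout_s
    while True:
        if is_ready():
            # Success path: return immediately, regardless of the timeout.
            return True, time.monotonic() - start
        if time.monotonic() >= deadline:
            # Error path: the only case where the full timeout is spent.
            return False, time.monotonic() - start
        time.sleep(poll_interval_s)


def make_ready_after(delay_s):
    """Hypothetical stand-in for a device model that is up after delay_s."""
    ready_at = time.monotonic() + delay_s
    return lambda: time.monotonic() >= ready_at


# Same startup delay, very different timeouts: both return after roughly
# the startup delay, not after the timeout.
ok_short, t_short = wait_for_startup(make_ready_after(0.05), timeout_s=1.0)
ok_long, t_long = wait_for_startup(make_ready_after(0.05), timeout_s=3600.0)

# A device model that never comes up: only here is the timeout waited out.
ok_fail, t_fail = wait_for_startup(lambda: False, timeout_s=0.2)
```

With these assumptions, the 1s and 1h timeouts behave identically when
startup succeeds. Only the failing call spends its whole timeout.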
