
Re: [Xen-devel] [XTF PATCH] xtf-runner: fix two synchronisation issues



On 29/07/16 14:12, Wei Liu wrote:
> On Fri, Jul 29, 2016 at 02:06:56PM +0100, Andrew Cooper wrote:
>> On 29/07/16 13:58, Wei Liu wrote:
>>> On Fri, Jul 29, 2016 at 01:43:42PM +0100, Andrew Cooper wrote:
>>>> On 29/07/16 13:07, Wei Liu wrote:
>>>>> There were two synchronisation issues for the old code:
>>>>>
>>>>> 1. There was no guarantee that guest console was ready before "xl
>>>>>    console" invocation.
>>>>> 2. There was no guarantee that runner wouldn't not exit before all test
>>> s/not//
>>>
>>>>>    guests were gone.
>>>> Sorry, but I can't parse this.
>>>>
>>>> The runner exiting before xl has torn down the guest is very
>>>> deliberate, because some part of hvm guests is terribly slow to tear
>>>> down; waiting synchronously for teardown tripled the wallclock time to
>>>> run a load of tests back-to-back.
>>>>
>>> Then you won't know if a guest is leaked or it is being slowly destroyed
>>> when a dead guest shows up in the snapshot of 'xl list'.
>>>
>>> Also consider that would make back-to-back tests that happen to have a
>>> guest that has the same name as the one in previous test fail.
>> test names are globally unique, so this isn't an issue.
>>
>> Also, the wait for `xl console` to complete shows that @releasedomain
>> has been fired for the domain.
>>
> Are you suggesting waiting for "xl console" only is good enough?

The "stdout, _ = console.communicate()" line waits for `xl console` to
exit.  (In fact, `xl` exec()'s `xenconsole`.)

As we never put a CTRL-] into stdin, `xl console` only exits when the
PTY closes, which is when xenconsoled decides the domain has terminated.

So, yes - I think this is safe.
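
To illustrate the blocking behaviour, here is a minimal sketch.  The
helper name read_console_until_exit is made up for this example, and a
short Python one-liner stands in for `xl console <domain>` so that it
runs anywhere; it is not the actual xtf-runner code.

```python
import subprocess
import sys

def read_console_until_exit(cmd):
    """Run cmd, capture its stdout, and return only after it has exited.

    Hypothetical helper for illustration, not part of xtf-runner.
    """
    console = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    # communicate() reads stdout until EOF and then waits for the child,
    # so it does not return while the child process is still running.
    stdout, _ = console.communicate()
    return console.returncode, stdout.decode()

# Stand-in for ["xl", "console", name]; assumed here purely for the demo.
rc, out = read_console_until_exit(
    [sys.executable, "-c", "print('Test result: SUCCESS')"])
```

By the time communicate() returns, the child has been reaped, so its
return code is available without a separate wait() call.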

>
>>> I don't think getting blocked for a few more seconds is a big issue.
>>> It is important to eliminate such race conditions so that osstest can
>>> work properly.
>> Amortising the teardown cost in the background is fine (which is what
>> your "for child in wait_list" ends up doing), but an extra few seconds
>> per test is very important to be avoided for manual use of XTF.
>>
> So let's default to not wait and add an option to wait. Osstest will use
> that option.

The code you currently have should be fine (in this respect).  This way,
you are tearing down a dying domain at the same time as starting the
next one up, which is fine.

What I want to avoid is waiting for complete teardown before starting
the next domain up.
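
Roughly, the pattern I am describing looks like the sketch below.  It is
only an approximation of the "for child in wait_list" idea: run_tests
and the (result_cmd, teardown_cmd) pairs are invented for illustration,
and small Python one-liners stand in for the real xl invocations.

```python
import subprocess
import sys

def run_tests(tests):
    """Run tests back-to-back, deferring the wait for slow teardown.

    tests is a list of (result_cmd, teardown_cmd) pairs; both are
    hypothetical stand-ins for the real per-test commands.
    """
    wait_list = []
    results = []
    for result_cmd, teardown_cmd in tests:
        # Synchronous part: block until this test has produced its result.
        results.append(subprocess.run(result_cmd).returncode)
        # Slow part: start it, but do not wait before the next test.
        wait_list.append(subprocess.Popen(teardown_cmd))
    # Reap all outstanding children once, after the last test has run.
    for child in wait_list:
        child.wait()
    return results

quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(0.2)"]
results = run_tests([(quick, slow), (quick, slow), (quick, slow)])
```

The teardown cost overlaps with the following tests, so the only
synchronous wait is the single pass over wait_list at the end.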

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

