
Re: [Xen-devel] stable trees (was: [xen-4.2-testing test] 58584: regressions)



Jan Beulich writes ("stable trees (was: [xen-4.2-testing test] 58584: regressions)"):
> Which leaves several options:
> - the problem was always there, but hidden by some factor in the
>   old osstest instance,

I think this is most likely.  The old system had much older hosts.

I think this is a race that we now happen to lose most of the time.

> - this is an infrastructure problem in the new osstest instance
>   (after all what makes the tests fail is a ping timing out, which can
>   have a variety of reasons),

I think this is very unlikely.  When we were investigating the FreeBSD
migration failure, I looked at this possibility in some depth.  I ran
a number of long-term ping tests between various infrastructure
machines and test boxes and saw nothing untoward (for example, no
unexpected packet loss).  (In the end the problem turned out to be a
race bug in the FreeBSD netfront, which would try to send the
gratuitous ARP before the backend was up.)
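
To illustrate the shape of that kind of race, here is a minimal sketch.
It is a toy model, not the FreeBSD netfront code: the `Backend` class,
the `announce_racy` and `announce_safe` functions and the timings are
all hypothetical, and the only assumption carried over from the real
bug is that a frame sent before the backend is up is lost.

```python
import threading


class Backend:
    """Hypothetical stand-in for a network backend (not real Xen code)."""

    def __init__(self):
        self.connected = threading.Event()
        self.received = []

    def deliver(self, frame):
        # Assumption: frames arriving before the backend is up are dropped.
        if self.connected.is_set():
            self.received.append(frame)
            return True
        return False


def announce_racy(backend):
    # Buggy ordering: send the gratuitous ARP without checking backend state.
    return backend.deliver("gratuitous-arp")


def announce_safe(backend, timeout=1.0):
    # Safer ordering: wait until the backend reports it is up, then send.
    if not backend.connected.wait(timeout):
        return False
    return backend.deliver("gratuitous-arp")


def demo():
    late = Backend()
    racy_ok = announce_racy(late)  # backend not up yet, so the frame is lost

    ready = Backend()
    # Bring the backend up shortly after the frontend starts waiting.
    threading.Timer(0.05, ready.connected.set).start()
    safe_ok = announce_safe(ready)
    return racy_ok, safe_ok
```

In this model the racy variant succeeds only if the backend happens to
be up first, which is why the outcome can flip with host speed or
build options.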

> - this is a build or runtime problem due to software differences
>   between the old and new instances (no idea whether exact same
>   package versions were used at the time of the switchover),

All the builds are done on hosts frequently reinstalled from Debian
upstream.  The compiler would change if Debian released an updated
package but not otherwise.  So the old and new build environments
would be very close to identical, apart from the hardware, hostnames,
etc.

> One aspect making me indeed consider the build (or less likely
> runtime) aspect is that we're seeing the frequent migration failures
> in the stable trees only - other than unstable they're all getting built
> with debug=n.

Races frequently come and go with that kind of change.

> While I agree that it wouldn't be nice to release 4.5.1 with these
> failures not understood, the current situation (with no-one having
> a real idea of what's going on, and apparently also no-one really
> trying to debug the issue - it being migration _and_ [apparently]
> qemuu specific I don't really feel qualified myself, leaving aside any
> time constraints, which certainly also apply to others) will lead to
> an indefinite stall on both this tree and the 4.4 one (4.4.3 would
> be due in about a month, i.e. normally I would send out a call for
> backport requests around now).

I think going ahead with 4.5.1 anyway would be a reasonable choice.

Stefano?

Ian.
