
Re: [Xen-devel] [xen-4.6-testing test] 65327: regressions - FAIL



Jan Beulich writes ("Re: [Xen-devel] [xen-4.6-testing test] 65327: regressions - FAIL"):
> And indeed I had suggested a force push a number of flights ago,
> but Ian had hoped it would eventually end up running on another
> host, thus allowing a push to happen.

I'm not sure which Ian this is and I can't find a record in email of
anyone having said that.  But, that seems like a rather forlorn hope.

It seems that this is a host-specific failure, which is reliably
reproducible.  I did a search in the database[1] to see if this test
ever passed on merlot, and it didn't.

> I don't know how sticky the stickiness of failed tests is, but I'm
> not getting the impression that such a host change is going to
> happen reliably within a couple of days at most.

The system tries to be as sticky as possible to avoid regressions
slipping through.

IMO the right justification for a push is that this test has never
passed on merlot.  The push gate only regards it as a regression
because the test once happened to run, and pass, on a different
machine for some reason.  The gate treats that as a baseline pass
which it thinks ought to be reproduced.
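
As a rough illustration of that distinction (a sketch only, not the
push gate's real code; the record type, the function names, the flight
numbers and the non-merlot host name are all made up for the example):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One historical run of a single test step (hypothetical record)."""
    flight: int
    host: str
    passed: bool


def is_regression_any_host(history, current):
    """Host-agnostic view: a failure counts as a regression if the
    test has ever passed, on any host at all."""
    return (not current.passed) and any(r.passed for r in history)


def has_ever_passed_on_host(history, host_prefix):
    """Host-aware view: has the test ever passed on a matching host?"""
    return any(r.passed for r in history if r.host.startswith(host_prefix))


# Hypothetical history: one pass elsewhere, only failures on merlot*.
history = [
    Result(flight=1, host="otherhost0", passed=True),
    Result(flight=2, host="merlot0", passed=False),
    Result(flight=3, host="merlot1", passed=False),
]
current = Result(flight=4, host="merlot0", passed=False)

looks_like_regression = is_regression_any_host(history, current)
passed_on_merlot = has_ever_passed_on_host(history, "merlot")
```

With this made-up history the host-agnostic check reports a
regression, while the host-aware check shows no pass on merlot* to
regress from, which is the argument for the force push.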

We can force push this in 4.6 and I will do so (based on 65327) after
sending this mail.

This will recur on other branches occasionally.  In general in
situations like this we have four options:
 1. Fix the underlying bug
 2. Force push each relevant tree each time this comes up
 3. Add this particular test to the allowable failures list
 4. Arrange to not run this test on merlot*

In this case: fixing the bug seems difficult (thanks to Wei for
investigating).  Selecting different hosts would be applicable if we
knew what the problem was (eg BIOS bug, CPU incompatibility, or
whatever), but doesn't seem relevant here.  Force pushing affected
trees will get annoying eventually.

I suggest we continue doing force pushes and mark the test as
non-blocking if it gets too annoying.

In the meantime I think we should continue to investigate the bug.  I
think it is likely that it is a race which we happen to lose on
merlot*.

Ian.

[1]

select * from steps
  join flights using (flight) join jobs using (flight,job)
 where job='test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm'
   and testid='guest-localmigrate/x10' and blessing='real'
   and (select val from runvars r
         where r.flight=flights.flight and r.job=jobs.job and name='host')
       like 'merlot%'
 order by flight desc;
 => 64 rows, all showing failure, on a variety of branches

select * from steps
  join flights using (flight) join jobs using (flight,job)
 where job='test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm'
   and testid='guest-localmigrate/x10' and blessing='real'
   and (select val from runvars r
         where r.flight=flights.flight and r.job=jobs.job and name='host')
       like 'merlot%'
   and steps.status='pass'
 order by flight desc;
 => 0 rows
