
Re: [Xen-devel] [xen-4.3-testing bisection] complete build-i386



xen.org writes ("[xen-4.3-testing bisection] complete build-i386"):
> branch xen-4.3-testing
> xen branch xen-4.3-testing
> job build-i386
> test xen-build
> 
> Tree: qemuu git://xenbits.xen.org/staging/qemu-upstream-4.3-testing.git
> Tree: xen git://xenbits.xen.org/xen.git
> 
> *** Found and reproduced problem changeset ***
> 
>   Bug is in tree:  xen git://xenbits.xen.org/xen.git
>   Bug introduced:  3dfb4018d4948588ff00e32f4ab12d4715bb8c5e
>   Bug not present: c8d233c644cbaba24b823ddd2394e4b4e07f7d9d
... 
>       Config.mk: Update QEMU_TAG and QEMU_UPSTREAM_REVISION for 4.3

This is because
 (a) qemu-xen-4.3-testing.git (the trad tree) had not had
     git-update-server-info run in it since the xen-4.3.0 tag was
     made.
 (b) Config.mk specifies an http url (rather than a git url) for
     the qemu trees.

I have fixed this and prodded the tester into starting another 4.3
test right away.
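
To illustrate why (a) and (b) combine badly: a client fetching over plain
("dumb") http finds out which refs exist from the static info/refs file in
the repository, and that file is only regenerated when
git-update-server-info runs.  A tag made afterwards is therefore invisible
to http clients even though it exists in the tree.  The following is a
minimal sketch of a check for that condition, not the fix that was actually
applied.  The function name and the sample data are hypothetical; the only
thing assumed about git is the usual "<object id><TAB><ref name>" layout of
info/refs.

```python
def refs_missing_from_info_refs(info_refs_text, expected_refs):
    """Return the expected ref names that info/refs does not advertise."""
    advertised = set()
    for line in info_refs_text.splitlines():
        # Each line is "<object id><TAB><ref name>".
        parts = line.split("\t", 1)
        if len(parts) != 2:
            continue
        name = parts[1].strip()
        # Annotated tags get a second, peeled entry ending in "^{}".
        if name.endswith("^{}"):
            name = name[:-3]
        advertised.add(name)
    return sorted(set(expected_refs) - advertised)


# Hypothetical stale info/refs: the branch is listed, but the tag that
# was made after the last git-update-server-info run is not.
stale = "0" * 40 + "\trefs/heads/master\n"
print(refs_missing_from_info_refs(
    stale, ["refs/heads/master", "refs/tags/xen-4.3.0"]))
```

Run against the real info/refs of a published tree, with the tags named in
Config.mk as expected_refs, a non-empty result would mean that an http
clone cannot see those tags yet.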


Post mortem:

The xen-4.3.0 tag in that tree was made some time last week.  Running
g-u-s-i is in the checklist.  I don't know why the checklist wasn't
followed properly (by me).  That was clearly a mistake.  Sorry.

I think, though, that our release process has become quite complex and
the single ad-hoc checklist file we (mostly I) are using is creaking
under the strain rather.

ACTION: I think we need to write down our actual release process from a
higher-level point of view, and have a separate file for technical
processes and runes at various stages.

The autotester didn't detect this situation before the release for the
following reasons:

1. There was an intermittent infrastructure problem which broke the
   autotester entirely for much of the last few weeks, for days at a
   time.  The cause of this infrastructure problem (an NFS lockup) has
   not been identified.  We hope to collect more data if it recurs.

   ACTION: We intend to migrate the many services (not just osstest)
   off the server which broke, into their own VMs.  But this is
   dependent on effort and hardware.

2. There was a problem with the way the autotester was trying to test
   4.3-testing, which turned out to be an overlooked hg-specific rune
   in a slightly ad-hoc piece of machinery (which grabs the qemu trad
   version out of a particular Xen's Config.mk).  For reasons which
   still aren't clear to me, this problem only manifested in the
   4.3-testing branch, which was created only recently.  (This bug is
   now fixed.)

3. Because of problem (1) the cron mails reporting (2) didn't arrive
   for several days.  So I didn't become aware of it.

4. I was ill in bed on Monday.

For all of these reasons, there were no autotester test results of 4.3
until today.  But we knew on Tuesday that the tests were missing.

I'm not sure why we didn't detect this problem in any of the ad-hoc
test builds we did after changing Config.mk.

I'm not sure about all the test builds we did, but I know about the ones I
did as part of the tarball process.  They wouldn't detect this
situation because the tarball is made to include copies of the checked
out qemu code; thus a test build of the resulting tarball doesn't need
to check out qemu.  (And building the tarball itself is done from
runes in the checklist which do not involve cloning a fresh qemu tree,
so you wouldn't see it then.)

If someone did an ad-hoc test build after changing Config.mk and it
did not show the failure, it is possible that this is because of one of:

(a) They didn't clean their tree _entirely thoroughly_.
    "make clean" is not sufficient because it doesn't remove cloned
    subdirectories.  "git-clean -x -d -f" is not sufficient because
    it recurses into the cloned subdirectories, discovers that these
    are separate git repositories, and stops.

    "make distclean" plus "git-clean -x -d -f" should be sufficient.
    In my test just now it is sufficient for qemu; however, there is a
    bug in the build system which means that it leaves
    tools/firmware/seabios-dir-remote/ behind.

    This is not a particularly nice situation but arguably it is
    something that someone doing a final test build would have to
    know.  ACTION: We should perhaps write it down.  Where ?

(b) Their arrangements for building things involve using a local
    .config (or similar) which specifies explicitly which qemu to use.

    Doing this for a final release check build after editing Config.mk
    would arguably be simply a mistake.

(c) Their arrangements for building things involve using a local
    .config or environment setting to tell the Xen build system to use
    the git protocol rather than http.

    It is not clear that we could regard this as unequivocally a
    mistake.
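
As a rough aid for case (a), here is a sketch of a check that could be run
after cleaning a tree and before a final test build.  It lists
subdirectories which still carry their own .git, i.e. clones that the
cleaning steps left in place and that the build would reuse instead of
cloning afresh.  The function name is hypothetical and this is not part of
the Xen build system; it only assumes that a leftover clone can be
recognised by a .git entry below the top level.

```python
import os


def leftover_git_checkouts(root):
    """Return paths below root (relative to it) that contain their own .git."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath != root and (".git" in dirnames or ".git" in filenames):
            found.append(os.path.relpath(dirpath, root))
            dirnames[:] = []  # do not descend into the nested checkout
            continue
        # Skip the top-level tree's own .git metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
    return sorted(found)


if __name__ == "__main__":
    # Run from the top of a supposedly clean tree; any output is a
    # leftover clone that a fresh checkout would not have.
    for path in leftover_git_checkouts("."):
        print(path)
```

An empty result suggests the next build will have to clone every external
tree again, which is the situation a final release check wants to
exercise.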


Ian.
