Xen project Mailing List

Re: [Xen-devel] [qemu-upstream-unstable test] 21375: regressions - FAIL

From: Anthony PERARD <anthony.perard@xxxxxxxxxx>

Date: Mon, 18 Nov 2013 17:18:45 +0000

Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, "xen.org" <ian.jackson@xxxxxxxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>

Delivery-date: Mon, 18 Nov 2013 17:19:06 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Wed, Nov 06, 2013 at 05:22:29PM +0000, Anthony PERARD wrote:
> On Fri, Nov 01, 2013 at 03:46:36PM +0000, Anthony PERARD wrote:
> > On Fri, Nov 01, 2013 at 12:06:51PM +0000, Ian Campbell wrote:
> > > On Fri, 2013-11-01 at 11:58 +0000, Anthony PERARD wrote:
> > > > On Fri, Nov 01, 2013 at 10:43:16AM +0000, Ian Campbell wrote:
> > > > > On Fri, 2013-11-01 at 10:38 +0000, xen.org wrote:
> > > > > > flight 21375 qemu-upstream-unstable real [real]
> > > > > > http://www.chiark.greenend.org.uk/~xensrcts/logs/21375/
> > > > > >
> > > > > > Regressions :-(
> > > > > >
> > > > > > Tests which did not succeed and are blocking,
> > > > > > including tests which could not be run:
> > > > > >  test-amd64-i386-qemuu-rhel6hvm-intel  7 redhat-install  fail REGR. vs. 20054
> > > > >
> > > > > Anthony, have you made any progress on this? It's been failing for
> > > > > ages now...
> > > >
> > > > Yes, it looks like the bug is triggered during a VESA resolution change. I
> > > > have tried the vgabios blob that we use for qemu-traditional and
> > > > it works fine. But with the vgabios blob provided by qemu, it does not
> > > > work... I'm still not sure what the bug is, but I'm getting closer to
> > > > it.
> > >
> > > Yay!
> > >
> > > > Also, this happens only on an Intel machine; on an AMD machine,
> > > > everything works like a charm.
> > > >
> > > > More detail, if anyone wants to know:
> > > > It looks like syslinux is doing an int 10h call to set the video
> > > > mode that never returns:
> > > > Int 0x10, with AX=0x4F02
> > >
> > > This looks like it might be handled by SeaBIOS vgasrc/vbe.c:vbe_104f00 ?
> > > There seem to be a few changes in upstream seabios since the version
> > > referenced in xen.git:Config.mk. Many of them are cleanups/code motion
> > > but a few look worth investigating.
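For reference, AX=0x4F02 is the VBE "Set VBE Mode" function, and the requested mode is passed in BX. The thread never says which BX value syslinux uses, so the value in the sketch below is purely illustrative, and the helper and type names are made up for this note. Assuming the bit layout from the VBE specification, the mode word decodes roughly like this:

```python
from typing import NamedTuple


class VbeSetMode(NamedTuple):
    mode: int           # bits 0-8: VBE mode number
    custom_crtc: bool   # bit 11: use caller-supplied CRTC timings
    linear_fb: bool     # bit 14: linear (flat) framebuffer instead of windowed
    keep_memory: bool   # bit 15: set means "do not clear display memory"


def decode_vbe_set_mode(bx: int) -> VbeSetMode:
    """Decode the BX argument of INT 10h, AX=4F02h (hypothetical helper)."""
    if not 0 <= bx <= 0xFFFF:
        raise ValueError("BX is a 16-bit register")
    return VbeSetMode(
        mode=bx & 0x01FF,
        custom_crtc=bool(bx & (1 << 11)),
        linear_fb=bool(bx & (1 << 14)),
        keep_memory=bool(bx & (1 << 15)),
    )


# Illustrative BX value only; it is NOT taken from this thread.
example = decode_vbe_set_mode(0x4118)
print(example)
```

With bit 15 clear, as in the illustrative value, the caller is asking the video BIOS to clear display memory as part of the mode set. Whether that is the request syslinux actually makes here is an assumption, not something established in this thread.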
> >
> > I've been able to get things working by applying a patch to vgabios
> > that is in the xen tree: a0e7ccf6864c196906d58b54cd0996b4dbc1b022
> > This patch allows the framebuffer to be cleared much faster.
> >
> > But it does not really help me to understand why the guest freezes. A
> > couple more printfs might.
>
> I finally managed to get a better understanding of the issue.
>
> So, the vgabios blob provided by QEMU has a routine to clear the video
> RAM that takes a few seconds to run. That gives QEMU enough time to try
> to refresh its display, which means there will be a call to
> xc_hvm_track_dirty_vram(). If the function is called while the vgabios
> routine is running, then the guest is lost.
>
> The issue appears only on an Intel machine with an HVM guest using EPT.
> Having the guest use shadow works fine. So I'm going to investigate
> the track_dirty code in Xen.
>
> The vgabios routine is called by syslinux with an Int 0x10. I tried to
> get some debug prints after the call, either from the guest serial or
> by using the Xen debug ioport, but nothing ever appeared, and gdbsx only
> gave me a weird IP which does not appear to point to any useful code
> (it's all zeros).

Another update: we had the idea of trying this on an earlier version of
Xen, and it turns out that Xen 4.3 works fine. One bisect later, this
commit turned up:

commit 86781624f8df1d50eb4185cfc2ddce926798f7aa
    x86_emulate: PUSH <mem> must read source operand just once

    ... for the case of accessing MMIO.

So after this commit, syslinux stops working correctly with the latest
version of QEMU. This happens when QEMU calls track_dirty_vram.

I also used xentrace/xenalyze to try to grab more information about
the issue. It did not really help, but it tells me that the guest is
stuck on a specific instruction (xentrace shows a vmexit EPT_VIOLATION
over and over).
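To make the "same vmexit over and over" observation concrete, here is a rough sketch of how such a loop could be spotted in a decoded trace. It assumes the trace has already been reduced to (exit reason, guest IP) pairs; that record shape, the function name and the threshold are all assumptions for illustration, not the real xentrace or xenalyze output format.

```python
from typing import Iterable, Optional, Tuple

# Assumed record shape: (vmexit reason, guest instruction pointer).
Exit = Tuple[str, int]


def find_stuck_exit(exits: Iterable[Exit], threshold: int = 1000) -> Optional[Exit]:
    """Return the first (reason, ip) pair seen `threshold` times in a row, else None."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    previous: Optional[Exit] = None
    run = 0
    for current in exits:
        if current == previous:
            run += 1
        else:
            # A different exit breaks the streak; start counting again.
            previous, run = current, 1
        if run >= threshold:
            return current
    return None


# Made-up sample data: one unrelated exit, then the same exit five times.
trace = [("IO_INSTRUCTION", 0xA100)] + [("EPT_VIOLATION", 0xA143)] * 5
print(find_stuck_exit(trace, threshold=5))
```

Counting only consecutive repeats is deliberate: a guest that makes progress will interleave other exits, whereas a guest re-executing one faulting instruction produces an unbroken run.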
This is where the guest is stuck:

   0xa126:      mov    %eax,%cr0
   0xa129:      ljmp   $0xf2e,$0xa12e
   0xa130:      mov    $0x26,%dl
   0xa132:      or     %bh,(%eax)
   0xa134:      movzww %sp,%sp
   0xa138:      mov    %edx,%ds
   0xa13a:      mov    %edx,%es
   0xa13c:      mov    %edx,%fs
   0xa13e:      mov    %edx,%gs
   0xa140:      jmp    *%ebx
   0xa142:      pushf
=> 0xa143:      lcall  *%cs:(%si)
   0xa147:      mov    $0x0,%ch

Before trying earlier versions of Xen, I tried to understand what
went wrong on the Xen side. It turns out that, in the track_dirty_vram
hypercall, a call to hap_enable_log_dirty() is all that is needed to
break the guest.

Jan, any idea what the issue is?

Regards,

--
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
