
RE: [Xen-devel] xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre) (AMD only)


  • To: "Tim Deegan" <Tim.Deegan@xxxxxxxxxx>
  • From: "James Harper" <james.harper@xxxxxxxxxxxxxxxx>
  • Date: Wed, 26 Jan 2011 00:35:40 +1100
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 25 Jan 2011 05:36:34 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-index: Acu8fidrVyEsQw+7S0mpezPgYnIrHQABbS9gAABRwLAAA6k28A==
  • Thread-topic: [Xen-devel] xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre) (AMD only)

I put some printfs around the restore of registers in
hvm_load_cpu_ctxt: one before, announcing what the register is about to
be set to, then the normal set, then a read-back that prints what the
register now contains (which should be what it was set to). The values
don't match for fs, gs, tr, and ldtr. The values written do match what
xen-hvmctx tells me before the save is done, so the save is working and
it's the restore that is broken.
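The same comparison can be done on two xen-hvmctx dumps rather than by
eye. Below is a minimal sketch, not a tested tool: the function names
are made up for illustration, and it assumes each segment line has the
form "name selector (base + limit / attributes)" as in the dumps quoted
further down. The field order (selector, base, limit, attributes) is my
reading of that output, not something confirmed against the xen-hvmctx
source.

```python
import re

# Matches segment lines such as:
#   fs 0x0000003b (0x000000007ffdc000 + 0x00000fff / 0x004f3)
# "sysenter cs 0x..." is not matched because no "(" follows the selector.
SEG_RE = re.compile(
    r"\b(cs|ds|es|fs|gs|ss|tr|ldtr)\s+"
    r"(0x[0-9a-fA-F]+)\s+"
    r"\((0x[0-9a-fA-F]+)\s+\+\s+(0x[0-9a-fA-F]+)\s+/\s+(0x[0-9a-fA-F]+)\)"
)


def parse_segments(dump_text):
    """Return {name: (selector, base, limit, attributes)} for one dump.

    The tuple layout is an assumption based on the printed format.
    """
    segments = {}
    for match in SEG_RE.finditer(dump_text):
        name = match.group(1)
        segments[name] = tuple(int(field, 16) for field in match.groups()[1:])
    return segments


def changed_segments(before_text, after_text):
    """Return {name: (before, after)} for segments present in both dumps
    whose values differ."""
    before = parse_segments(before_text)
    after = parse_segments(after_text)
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }
```

Feeding it the contents of the "before" and "after" files (for example
changed_segments(open("before").read(), open("after").read())) lists
only the segment registers that changed, leaving out the expected timer
and TSC noise.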

So the problem is somewhere past hvm_set_segment_register and, because
it's AMD-only, probably in or beyond svm_set_segment_register. The first
thing I notice in that routine is that there is a special case for
exactly those four registers, although all it seems to do is call
svm_sync_vmcb before and svm_vmload after setting them. I don't know
what those two do, though.

I'll investigate further tomorrow, assuming nobody fixes it while I'm
asleep :)

James


> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-
> bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of James Harper
> Sent: Tuesday, 25 January 2011 22:52
> To: Tim Deegan
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-devel] xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
> 
> On Intel we get a much more believable result:
> 
> # diff -u before after
> --- before      2011-01-25 22:40:32.861270619 +1100
> +++ after       2011-01-25 22:43:31.665271154 +1100
> @@ -1,4 +1,4 @@
> -HVM save record for domain 24
> +HVM save record for domain 25
>  Entry 0: type 1 instance 0, length 24
>       Header: magic 0x54381286, version 1
>               Xen changeset 0
> @@ -34,7 +34,7 @@
>        MSR flags 0x0000000000000000  lstar 0x0000000000000000
>             star 0x0000000000000000  cstar 0x0000000000000000
>           sfmask 0x0000000000000000   efer 0x0000000000000800
> -            tsc 0x0000002a2056c07e
> +            tsc 0x0000007e31c3b3f3
>            event 0x00000000 error 0x00000000
>      FPU:    fcw 0x027f fsw 0x0000
>              ftw 0x00 (0x00) fop 0x0000
> @@ -185,11 +185,11 @@
>                 rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
>                 mode 0xff, bcd 0, gate 0x1
>  Entry 11: type 11 instance 0, length 16
> -    RTC: regs 0x32 0x00 0x40 0x00 0x22 0x00 0x02 0x25
> +    RTC: regs 0x29 0x00 0x43 0x00 0x22 0x00 0x02 0x25
>                0x01 0x11 0x2a 0x42 0x00 0x80, index 0x0c
>  Entry 12: type 12 instance 0, length 1048
>      HPET: capability 0xf424008086a201 config 0
> -          isr 0 counter 0x19b07289b
> +          isr 0 counter 0x394b04f80
>            timer0 config 0xf0000000000030 cmp 0
>            timer0 period 0 fsb 0
>            timer1 config 0xf0000000000030 cmp 0
> 
> just a few counters changed.
> 
> I retried under AMD and got the same result as last time so it's
> definitely broken.
> 
> Are there any tools to analyse the save file? If I can see what numbers
> are in there, I should be able to tell whether it's the save or the
> restore that's broken...
> 
> Thanks
> 
> James
> 
> > -----Original Message-----
> > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-
> > bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of James Harper
> > Sent: Tuesday, 25 January 2011 22:38
> > To: Tim Deegan
> > Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: RE: [Xen-devel] xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
> >
> > > I'm trying to set it up here as well but I'm away from the office and
> > > getting the VGA console as far as my screen is proving tricky.
> > >
> > > Can you try:
> > >  - xl pause <domid>
> > >  - xen-hvmctx <domid> >before
> > >  - xl save <domid> save-file
> > >  - xl restore -p save-file
> > >  - xl list
> > >  - xen-hvmctx <new-domid> >after
> > >  - diff -u before after
> > >
> > > There should be a few differences to do with timers and TSCs but there
> > > might be some other smoking gun.  Of course it's possible that some
> > > piece of state got added that didn't get into the save/restore code at
> > > all.  It's also possible that some vital piece of memory isn't getting
> > > saved properly but that's less likely to be AMD-specific.
> > >
> >
> > I was able to remove the 'is domain running?' check from xend and
> > complete your request using xm.
> >
> > # diff -u before after
> > --- before      2011-01-25 22:27:51.064451527 +1100
> > +++ after       2011-01-25 22:33:25.724619490 +1100
> > @@ -1,4 +1,4 @@
> > -HVM save record for domain 53
> > +HVM save record for domain 54
> >  Entry 0: type 1 instance 0, length 24
> >       Header: magic 0x54381286, version 1
> >               Xen changeset 0
> > @@ -22,11 +22,11 @@
> >               cs 0x0000001b (0x0000000000000000 + 0xffffffff / 0x00cfb)
> >               ds 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
> >               es 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
> > -             fs 0x0000003b (0x000000007ffdc000 + 0x00000fff / 0x004f3)
> > -             gs 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00000)
> > +             fs 0x00000000 (0x00007f18bcbc6700 + 0xffffffff / 0x00000)
> > +             gs 0x00000000 (0xffff880028038000 + 0xffffffff / 0x00000)
> >               ss 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
> > -             tr 0x00000028 (0x0000000080157000 + 0x000020ab / 0x0008b)
> > -           ldtr 0x00000000 (0x0000000000000000 + 0x00000000 / 0x00000)
> > +             tr 0x0000e040 (0xffff82c480263a80 + 0x00000067 / 0x0008b)
> > +           ldtr 0x00000000 (0x0000000000000000 + 0x0000ffff / 0x00000)
> >             itdr            (0x0000000081fff400 + 0x000007ff)
> >             gdtr            (0x0000000081fff000 + 0x000003ff)
> >      sysenter cs 0x00000000  eip 0x0000000000000000  esp 0x0000000000000000
> > @@ -34,7 +34,7 @@
> >        MSR flags 0xffffffffffffffff  lstar 0x0000000000000000
> >             star 0x0000000000000000  cstar 0x0000000000000000
> >           sfmask 0x0000000000000000   efer 0x0000000000000800
> > -            tsc 0x00000018cad69045
> > +            tsc 0x0000008fe39fad26
> >            event 0x00000000 error 0x00000000
> >      FPU:    fcw 0x037f fsw 0x0000
> >              ftw 0x00 (0x00) fop 0x0000
> > @@ -71,7 +71,7 @@
> >                 (0x00000000000000000000000000000000)
> >                 (0x00000000000000000000000000000000)
> >  Entry 2: type 3 instance 0, length 8
> > -    PIC: IRQ base 0x30, irr 0, imr 0xff, isr 0
> > +    PIC: IRQ base 0x30, irr 0x2, imr 0xff, isr 0
> >           init_state 0, priority_add 0, readsel_isr 0, poll 0
> >           auto_eoi 0, rotate_on_auto_eoi 0
> >           special_fully_nested_mode 0, special_mask_mode 0
> > @@ -153,8 +153,8 @@
> >            0x01c0: 0x0000000000000004   0x01d0: 0x0000000000000000
> >            0x01e0: 0x0000000000000000   0x01f0: 0x0000000000000000
> >            0x0200: 0x0000000000000000   0x0210: 0x0000000000000000
> > -          0x0220: 0x0000000000000000   0x0230: 0x0000000000040000
> > -          0x0240: 0x0000000000000004   0x0250: 0x0000000000000000
> > +          0x0220: 0x0000000000000000   0x0230: 0x0000000000060000
> > +          0x0240: 0x0000000000000006   0x0250: 0x0000000000000000
> >            0x0260: 0x0000000000000000   0x0270: 0x0000000000000000
> >            0x0280: 0x0000000000000000   0x0290: 0x0000000000000000
> >            0x02a0: 0x0000000000000000   0x02b0: 0x0000000000000000
> > @@ -171,7 +171,7 @@
> >  Entry 7: type 7 instance 0, length 16
> >      PCI IRQs: 0x00000000000100800000000000000000
> >  Entry 8: type 8 instance 0, length 8
> > -    ISA IRQs: 0x0001
> > +    ISA IRQs: 0x0003
> >  Entry 9: type 9 instance 0, length 8
> >      PCI LINK: 0 0 0 0
> >  Entry 10: type 10 instance 0, length 56
> > @@ -185,11 +185,11 @@
> >                 rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
> >                 mode 0xff, bcd 0, gate 0x1
> >  Entry 11: type 11 instance 0, length 16
> > -    RTC: regs 0x48 0x00 0x27 0x00 0x22 0x00 0x02 0x25
> > +    RTC: regs 0x23 0x00 0x33 0x00 0x22 0x00 0x02 0x25
> >                0x01 0x11 0x2a 0x42 0x00 0x80, index 0x0c
> >  Entry 12: type 12 instance 0, length 1048
> >      HPET: capability 0xf424008086a201 config 0
> > -          isr 0 counter 0x1308081db
> > +          isr 0 counter 0x55323282f
> >            timer0 config 0xf0000000000030 cmp 0
> >            timer0 period 0 fsb 0
> >            timer1 config 0xf0000000000030 cmp 0
> > @@ -200,8 +200,8 @@
> >      ACPI PM: TMR_VAL 0xd9f446d, PM1a_STS 0x0, PM1a_EN 0x321
> >  Entry 14: type 14 instance 0, length 240
> >      MTRR: PAT 0x7010600070106, cap 0x508, default 0xc06
> > -          var 0 0x00000000f0000000 0x000000fff8000800
> > -          var 1 0x00000000f8000000 0x000000fffc000800
> > +          var 0 0x00000000f0000000 0x0000000000000000
> > +          var 1 0x00000000f8000000 0x0000000000000000
> >            var 2 0x0000000000000000 0x0000000000000000
> >            var 3 0x0000000000000000 0x0000000000000000
> >            var 4 0x0000000000000000 0x0000000000000000
> >
> > I don't really know a lot of what I'm looking at, but those CPU
> > registers shouldn't be different, should they? It almost looks like the
> > CPU was allowed to run for a few cycles at some point even though it was
> > paused.
> >
> > I'll try the same on the Intel box and see what happens.
> >
> > Thanks
> >
> > James
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-devel


 

