[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] xsave=0 workaround needed on 3.2 kernels with Xen 4.1 or Xen-unstable.



On Mon, May 07, 2012 at 08:15:44AM +0100, Jan Beulich wrote:
> >>> On 04.05.12 at 21:30, AP <apxeng@xxxxxxxxx> wrote:
> > From the above I realized that X86_CR4_OSXSAVE was never getting set
> > in v->arch.pv_vcpu.ctrlreg[4].
> 
> Yes, that was the observation in the previous thread too, but the
> reporter didn't seem interested in continuing on from there.
> 
> 
> > So I tried the following patch:
> >
> > diff -r 5a0d60bb536b xen/arch/x86/domain.c
> > --- a/xen/arch/x86/domain.c Fri Apr 27 21:10:59 2012 -0700
> > +++ b/xen/arch/x86/domain.c Fri May 04 12:23:57 2012 -0700
> > @@ -691,8 +691,6 @@ unsigned long pv_guest_cr4_fixup(const s
> >          hv_cr4_mask &= ~X86_CR4_DE;
> >      if ( cpu_has_fsgsbase && !is_pv_32bit_domain(v->domain) )
> >          hv_cr4_mask &= ~X86_CR4_FSGSBASE;
> > -    if ( xsave_enabled(v) )
> > -        hv_cr4_mask &= ~X86_CR4_OSXSAVE;
> > 
> >      if ( (guest_cr4 & hv_cr4_mask) != (hv_cr4 & hv_cr4_mask) )
> >          gdprintk(XENLOG_WARNING,
> > diff -r 5a0d60bb536b xen/include/asm-x86/domain.h
> > --- a/xen/include/asm-x86/domain.h  Fri Apr 27 21:10:59 2012 -0700
> > +++ b/xen/include/asm-x86/domain.h  Fri May 04 12:23:57 2012 -0700
> > @@ -530,7 +530,7 @@ unsigned long pv_guest_cr4_fixup(const s
> >       & ~X86_CR4_DE)
> >  #define real_cr4_to_pv_guest_cr4(c)                         \
> >      ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD        \
> > -             | X86_CR4_OSXSAVE | X86_CR4_SMEP))
> > +             | X86_CR4_SMEP))
> > 
> >  void domain_cpuid(struct domain *d,
> >                    unsigned int  input,
> 
> No, this is specifically the wrong thing. From what we know so far
> (i.e. the outcome of the above printing you added) the problem in
> in the Dom0 kernel (in it never setting CR4.OSXSAVE prior to
> attempting XSETBV). What your patch efectively does is take away
> control from the guest kernels to control the (virtual) CR4 flag...
> 
> > That allowed the system to boot successfully though I did see the
> > following message:
> > (XEN) domain.c:698:d0 Attempt to change CR4 flags 00042660 -> 00002660
> 
> ... which is what this message is telling you.
> 
> > Not sure if the above patch is right fix but I hope it was at least
> > helpful in pointing at where the problem might be.
> > 
> > BTW, I see the same invalid op issue with Xen 4.1.2 if I boot with xsave=1.
> 
> Sure, as it's a kernel problem. It's the kernel that needs logging added,
> to find out why the CR4 write supposedly happening immediately
> prior to the XSETBV (set_in_cr4(X86_CR4_OSXSAVE)) doesn't actually
> happen, or doesn't set the flag. Perhaps something fishy going on
> with the paravirt ops patching, since the disassembly of the opcode
> bytes shown with the oops message are indicating that the right
> thing is being attempted:
> 
> ff 14 25 10 33 c1 81  callq  *0xffffffff81c13310 [so xen_read_cr4 which is 
> native_read_cr4]
> 48 89 c7              mov    %rax,%rdi
> 48 81 cf 00 00 04 00  or     $0x40000,%rdi
>                              ^^^^^^^^
> ff 14 25 18 33 c1 81  callq  *0xffffffff81c13318 [so xen_write_cr4] - which 
> is filtering X86_CR4_PGE and X86_CR4_PSE]
> 48 8b 05 0d 15 db 00  mov    0xdb150d(%rip),%rax
> 31 c9                 xor    %ecx,%ecx
> 48 89 c2              mov    %rax,%rdx
> 48 c1 ea 20           shr    $0x20,%rdx
> 0f 01 d1              xsetbv 
> 5d                    pop    %rbp
> c3                    retq   
> 
> The primary thing that strikes me as odd is that both calls are still
> indirect ones, even though I thought that they should get replaced
> by direct ones (or even the actual instruction, namely in the
> read_cr4() case) upon first use. Konrad, Jeremy - am I wrong here?

They do get replaced (during runtime - and this is done by the
alternative_instructions). Is this output from objdump or straight
from the memory (so using the Xen debugger?).


> 
> And the dumped %rdi value indicates that bit 18 did _not_ get set.

That would imply that xen_write_cr4, which is just mov to cr4
is getting trapped but somehow the hypervisor isn't setting the
rdi value? Or maybe the the native_write_cr4 ends up
filtering in the wrong order?

   0xffffffff8102e650 <xen_write_cr4>:  push   %rbp
   0xffffffff8102e651 <xen_write_cr4+1>:        and    $0x6f,%dil
   0xffffffff8102e655 <xen_write_cr4+5>:        mov    %rsp,%rbp
   0xffffffff8102e658 <xen_write_cr4+8>:        mov    %rdi,%cr4
   0xffffffff8102e65b <xen_write_cr4+11>:       leaveq
   0xffffffff8102e65c <xen_write_cr4+12>:       retq


> 
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.