
Re: [Xen-devel] [PATCH V3 1/6] x86/xsaves: enable xsaves/xrstors for pv guest



On 07/08/15 09:00, Shuai Ruan wrote:
>
>>> +                    goto skip;
>>> +                }
>>> +
>>> +                if ( !guest_kernel_mode(v, regs) || (regs->edi & 0x3f) )
>> What does edi have to do with xsaves?  Only edx:eax are special
>> according to the manual.
>>
> regs->edi is the guest_linear_address

Why so?  xsaves takes an unconditional memory parameter, not a pointer
in %rdi.  (regs->edi is only correct for ins because the pointer is
architecturally required to be in %rdi; outs uses %rsi.)

There is currently nothing in emulate_privileged_op() which does ModRM
or SIB decoding for memory references.  xsaves/xrstors would be the
first such operations.
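
For illustration, even a minimal memory-operand decode (hypothetical
sketch, untested, not Xen code; gpr() is an assumed helper returning
the value of general purpose register 'reg', and REX prefixes, segment
overrides and the mod == 0 special cases such as %rip-relative
addressing and SIB base == 5 are deliberately elided) has to cover
ModRM, SIB and displacements:

    #include <stdint.h>

    static unsigned long decode_mem_ea(const uint8_t *insn,
                                       unsigned long (*gpr)(unsigned int reg))
    {
        uint8_t modrm = *insn++;
        unsigned int mod = modrm >> 6, rm = modrm & 7;
        unsigned long ea;

        /* mod == 3 is a register operand, which is #UD for xsaves/xrstors. */

        if ( rm == 4 )                     /* SIB byte follows. */
        {
            uint8_t sib = *insn++;
            unsigned int scale = sib >> 6, index = (sib >> 3) & 7;

            ea = gpr(sib & 7);             /* base */
            if ( index != 4 )              /* index == 4 means "no index". */
                ea += gpr(index) << scale;
        }
        else
            ea = gpr(rm);

        if ( mod == 1 )                    /* disp8 */
            ea += (int8_t)insn[0];
        else if ( mod == 2 )               /* disp32 */
            ea += (int32_t)(insn[0] | (insn[1] << 8) | (insn[2] << 16) |
                            ((uint32_t)insn[3] << 24));

        return ea;
    }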

I am also not sure that adding arbitrary memory decode here is sensible.

In an ideal world, we would split what is currently x86_emulate() into
3 stages.

Stage 1 does straight instruction decode to some internal representation.

Stage 2 does an audit to see whether the decoded instruction is
plausible for the reason why an emulation was needed.  We have had a
number of security issues with emulation in the past where guests cause
one instruction to trap for emulation, then rewrite the instruction to
be something else, and exploit a bug in the emulator.

Stage 3 performs the actions required for emulation.
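
In rough outline (every name below is hypothetical; nothing like this
exists today):

    #include <stdint.h>

    struct x86_insn {
        uint8_t       opcode;   /* primary opcode byte               */
        unsigned int  modrm;    /* ModRM byte, if any                */
        unsigned long ea;       /* decoded effective address, if any */
    };

    enum emul_reason { EMUL_PRIV_OP, EMUL_SHADOW, EMUL_INTROSPECTION };

    /* Stage 1: raw instruction bytes -> internal representation. */
    int x86_decode(const uint8_t *bytes, unsigned int len,
                   struct x86_insn *insn);

    /* Stage 2: is this instruction plausible for why we trapped? */
    int x86_audit(const struct x86_insn *insn, enum emul_reason why);

    /* Stage 3: carry out the side effects of the instruction. */
    int x86_execute(struct x86_insn *insn /* , vcpu/emulation context */);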

Currently, x86_emulate() is limited to instructions which might
legitimately fault for emulation, but with the advent of VM
introspection, this is proving to be insufficient.  With my x86
maintainer's hat on, I would like to avoid the current situation we have
with multiple bits of code doing x86 instruction decode and emulation
(which are all different).

I think the 3-stage approach above caters suitably to all use cases,
but it is a large project in itself.  It allows the introspection people to
have a full and complete x86 emulation infrastructure, while also
preventing areas like shadow paging from being opened up to
potential vulnerabilities in unrelated areas of the x86 architecture.

I would even go so far as to say that it is probably ok not to support
xsaves/xrstors in PV guests until something along the above lines is
sorted.  The first feature in XSS is processor trace, which a PV guest
couldn't use anyway.  I suspect the same applies to most of the other
XSS features, or they wouldn't need to be privileged in the first place.

>
>>> +
>>> +                if ( !cpu_has_xsaves || !(v->arch.pv_vcpu.ctrlreg[4] &
>>> +                                          X86_CR4_OSXSAVE))
>>> +                {
>>> +                    do_guest_trap(TRAP_invalid_op, regs, 0);
>>> +                    goto skip;
>>> +                }
>>> +
>>> +                if ( v->arch.pv_vcpu.ctrlreg[0] & X86_CR0_TS )
>>> +                {
>>> +                    do_guest_trap(TRAP_nmi, regs, 0);
>>> +                    goto skip;
>>> +                }
>>> +
>>> +                if ( !guest_kernel_mode(v, regs) || (regs->edi & 0x3f) )
>>> +                    goto fail;
>>> +
>>> +                if ( (rc = copy_from_user(&guest_xsave_area, (void *)regs->edi,
>>> +                                          sizeof(struct xsave_struct))) != 0 )
>>> +                {
>>> +                    propagate_page_fault(regs->edi +
>>> +                                    sizeof(struct xsave_struct) - rc, 0);
>>> +                    goto skip;
>> Surely you just need the xstate_bv and xcomp_bv?
>>
> I will dig into the SDM to see whether I am missing some checks.

What I mean by this is that xstate_bv and xcomp_bv are all that you are
checking, so you just need two uint64_t's, rather than a full xsave_struct.
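
i.e. something like this (untested sketch; the XSAVE header sits at
byte offset 512 of the save area):

    struct {
        uint64_t xstate_bv;
        uint64_t xcomp_bv;
    } hdr;

    /* Copy just the two header qwords rather than the whole save area. */
    if ( copy_from_user(&hdr, (void *)(regs->edi + 512), sizeof(hdr)) != 0 )
    {
        /* propagate the pagefault, as the current code does */
    }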

>
>>>  
>>>          default:
>>> diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
>>> index 98310f3..de94ac1 100644
>>> --- a/xen/arch/x86/x86_64/mm.c
>>> +++ b/xen/arch/x86/x86_64/mm.c
>>> @@ -48,6 +48,58 @@ l2_pgentry_t __section(".bss.page_aligned") l2_bootmap[L2_PAGETABLE_ENTRIES];
>>>  
>>>  l2_pgentry_t *compat_idle_pg_table_l2;
>>>  
>>> +unsigned long do_page_walk_mfn(struct vcpu *v, unsigned long addr)
>> What is this function?  Why is it useful?  Something like this belongs
>> in its own patch along with a description of why it is being introduced.
>>
> The function is used for getting the mfn corresponding to a guest
> linear address.  Is there another existing function I can use that
> does the same thing?  Can you give me a suggestion?

do_page_walk() and use virt_to_mfn() on the result?  (I am just
guessing, but something along those lines ought to work.)
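
i.e. (still guessing; untested):

    void *va = do_page_walk(v, addr);     /* maps the guest page */
    unsigned long mfn = virt_to_mfn(va);  /* recover the mfn     */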

>
>>> +{
>>> +    asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
>>> +                    : "=m" (*ptr)
>>> +                    : "a" (lmask), "d" (hmask), "D" (ptr) );
>>> +}
>>> +
>>> +void xrstors(uint32_t lmask, uint32_t hmask, struct xsave_struct *ptr)
>>> +{
>>> +    asm volatile ( "1: .byte 0x48,0x0f,0xc7,0x1f\n"
>>> +                   ".section .fixup,\"ax\"      \n"
>>> +                   "2: mov %5,%%ecx             \n"
>>> +                   "   xor %1,%1                \n"
>>> +                   "   rep stosb                \n"
>>> +                   "   lea %2,%0                \n"
>>> +                   "   mov %3,%1                \n"
>>> +                   "   jmp 1b                   \n"
>>> +                   ".previous                   \n"
>>> +                   _ASM_EXTABLE(1b, 2b)
>>> +                   : "+&D" (ptr), "+&a" (lmask)
>>> +                   : "m" (*ptr), "g" (lmask), "d" (hmask),
>>> +                     "m" (xsave_cntxt_size)
>>> +                   : "ecx" );
>>> +}
>>> +
>> Neither of these two helpers has anything like sufficient checking
>> to be safely used on guest state.
>>
> Basic checking is done before these two helpers are called.

But this isn't the only place where they are used.
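
Even a thin checked wrapper (sketch only, untested; xfeature_mask is
the host's supported feature mask) would get the architectural
invariants enforced for every caller:

    static void xrstors_checked(uint32_t lmask, uint32_t hmask,
                                struct xsave_struct *ptr)
    {
        uint64_t mask = ((uint64_t)hmask << 32) | lmask;

        /* Refuse feature bits the host doesn't know about. */
        ASSERT(!(mask & ~xfeature_mask));

        /* The save area is architecturally required to be 64-byte aligned. */
        ASSERT(!((unsigned long)ptr & 0x3f));

        xrstors(lmask, hmask, ptr);
    }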

~Andrew
