
Re: [PATCH v8 09/12] x86emul: support FXSAVE/FXRSTOR



On 08.05.2020 21:31, Andrew Cooper wrote:
> On 05/05/2020 09:16, Jan Beulich wrote:
>> @@ -8125,6 +8154,47 @@ x86_emulate(
>>      case X86EMUL_OPC(0x0f, 0xae): case X86EMUL_OPC_66(0x0f, 0xae): /* Grp15 */
>>          switch ( modrm_reg & 7 )
>>          {
>> +#if !defined(X86EMUL_NO_FPU) || !defined(X86EMUL_NO_MMX) || \
>> +    !defined(X86EMUL_NO_SIMD)
>> +        case 0: /* fxsave */
>> +        case 1: /* fxrstor */
>> +            generate_exception_if(vex.pfx, EXC_UD);
>> +            vcpu_must_have(fxsr);
>> +            generate_exception_if(ea.type != OP_MEM, EXC_UD);
>> +            generate_exception_if(!is_aligned(ea.mem.seg, ea.mem.off, 16,
>> +                                              ctxt, ops),
>> +                                  EXC_GP, 0);
>> +            fail_if(!ops->blk);
>> +            op_bytes =
>> +#ifdef __x86_64__
>> +                !mode_64bit() ? offsetof(struct x86_fxsr, xmm[8]) :
>> +#endif
>> +                sizeof(struct x86_fxsr);
>> +            if ( amd_like(ctxt) )
>> +            {
>> +                if ( !ops->read_cr ||
>> +                     ops->read_cr(4, &cr4, ctxt) != X86EMUL_OKAY )
>> +                    cr4 = X86_CR4_OSFXSR;
> 
> Why do we want to assume OSFXSR in the case of a read_cr() failure,
> rather than bailing on the entire instruction?

I prefer to assume "normal" operation over failing in such
cases. We have a few similar examples elsewhere. I'll add
a comment to this effect.
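
Perhaps along these lines (exact wording still to be settled):

                /*
                 * Assume the "normal" (CR4.OSFXSR set) case when CR4
                 * can't be read, rather than failing the entire insn;
                 * there are a few similar examples elsewhere.
                 */
                if ( !ops->read_cr ||
                     ops->read_cr(4, &cr4, ctxt) != X86EMUL_OKAY )
                    cr4 = X86_CR4_OSFXSR;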

>> @@ -11819,6 +11891,77 @@ int x86_emul_blk(
>>  
>>  #endif /* X86EMUL_NO_FPU */
>>  
>> +#if !defined(X86EMUL_NO_FPU) || !defined(X86EMUL_NO_MMX) || \
>> +    !defined(X86EMUL_NO_SIMD)
>> +
>> +    case blk_fxrstor:
>> +    {
>> +        struct x86_fxsr *fxsr = FXSAVE_AREA;
>> +
>> +        ASSERT(!data);
>> +        ASSERT(bytes == sizeof(*fxsr));
>> +        ASSERT(state->op_bytes <= bytes);
>> +
>> +        if ( state->op_bytes < sizeof(*fxsr) )
>> +        {
>> +            if ( state->rex_prefix & REX_W )
>> +            {
>> +                /*
>> +                 * The only way to force fxsaveq on a wide range of gas
>> +                 * versions. On older versions the rex64 prefix works only if
>> +                 * we force an addressing mode that doesn't require extended
>> +                 * registers.
>> +                 */
>> +                asm volatile ( ".byte 0x48; fxsave (%1)"
>> +                               : "=m" (*fxsr) : "R" (fxsr) );
>> +            }
>> +            else
>> +                asm volatile ( "fxsave %0" : "=m" (*fxsr) );
>> +        }
>> +
>> +        memcpy(fxsr, ptr, min(state->op_bytes,
>> +                              (unsigned int)offsetof(struct x86_fxsr, _)));
>> +        memset(fxsr->_, 0, sizeof(*fxsr) - offsetof(struct x86_fxsr, _));
> 
> I'm completely lost trying to follow what's going on here.  Why are we
> constructing something different to what the guest gave us?

This part of the structure may not get written by FXSAVE. Hence
I'd prefer to assume it has unknown contents (which we shouldn't
leak) over assuming this would never get written (and hence
remain zeroed). Furthermore we mean to pass this to FXRSTOR,
which we know can raise #GP in principle. While this is a legacy
insn and hence unlikely to change in behavior, it seems safer
to have well known values in at least the reserved range.

I'll add an abbreviated variant of this as a comment.
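
Roughly along these lines (to be shortened as appropriate):

        /*
         * The '_' portion of the structure may not get written by
         * FXSAVE, i.e. would hold unknown contents which we shouldn't
         * leak. Zero it, also to hand FXRSTOR (which can raise #GP in
         * principle) well defined values in the reserved range.
         */
        memcpy(fxsr, ptr, min(state->op_bytes,
                              (unsigned int)offsetof(struct x86_fxsr, _)));
        memset(fxsr->_, 0, sizeof(*fxsr) - offsetof(struct x86_fxsr, _));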

>> +
>> +        generate_exception_if(fxsr->mxcsr & ~mxcsr_mask, EXC_GP, 0);
>> +
>> +        if ( state->rex_prefix & REX_W )
>> +        {
>> +            /* See above for why operand/constraints are this way. */
>> +            asm volatile ( ".byte 0x48; fxrstor (%0)"
>> +                           :: "R" (fxsr), "m" (*fxsr) );
>> +        }
>> +        else
>> +            asm volatile ( "fxrstor %0" :: "m" (*fxsr) );
>> +        break;
>> +    }
>> +
>> +    case blk_fxsave:
>> +    {
>> +        struct x86_fxsr *fxsr = FXSAVE_AREA;
>> +
>> +        ASSERT(!data);
>> +        ASSERT(bytes == sizeof(*fxsr));
>> +        ASSERT(state->op_bytes <= bytes);
>> +
>> +        if ( state->rex_prefix & REX_W )
>> +        {
>> +            /* See above for why operand/constraint are this way. */
>> +            asm volatile ( ".byte 0x48; fxsave (%0)"
>> +                           :: "R" (state->op_bytes < sizeof(*fxsr) ? fxsr : 
>> ptr)
>> +                           : "memory" );
>> +        }
>> +        else
>> +            asm volatile ( "fxsave (%0)"
>> +                           :: "r" (state->op_bytes < sizeof(*fxsr) ? fxsr : 
>> ptr)
>> +                           : "memory" );
>> +        if ( state->op_bytes < sizeof(*fxsr) )
>> +            memcpy(ptr, fxsr, state->op_bytes);
> 
> I think this logic would be clearer to follow with:
> 
> void *buf = state->op_bytes < sizeof(*fxsr) ? fxsr : ptr;
> 
> ...
> 
> if ( buf != ptr )
>     memcpy(ptr, fxsr, state->op_bytes);
> 
> This more clearly highlights the "we either fxsave'd straight into the
> destination pointer, or into a local buffer if we only want a subset"
> property.

Ah, yes, and by having buf (or really the repurposed fxsr) be of
proper type I can then also avoid the memory clobbers, making
the asm()s more similar to the ones used for FXRSTOR emulation.
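
I.e. perhaps (just a sketch for now, using your buf naming but
giving it struct x86_fxsr * type):

    case blk_fxsave:
    {
        struct x86_fxsr *fxsr = FXSAVE_AREA;
        /* FXSAVE into the local area only if we want just a subset. */
        struct x86_fxsr *buf =
            state->op_bytes < sizeof(*fxsr) ? fxsr : ptr;

        ASSERT(!data);
        ASSERT(bytes == sizeof(*fxsr));
        ASSERT(state->op_bytes <= bytes);

        if ( state->rex_prefix & REX_W )
        {
            /* See above for why operand/constraints are this way. */
            asm volatile ( ".byte 0x48; fxsave (%1)"
                           : "=m" (*buf) : "R" (buf) );
        }
        else
            asm volatile ( "fxsave %0" : "=m" (*buf) );

        if ( buf != ptr )
            memcpy(ptr, buf, state->op_bytes);
        break;
    }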

>> --- a/xen/arch/x86/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate.c
>> @@ -42,6 +42,8 @@
>>      }                                                      \
>>  })
>>  
>> +#define FXSAVE_AREA current->arch.fpu_ctxt
> 
> How safe is this?  Don't we already use this buffer to recover the old
> state in the case of an exception?

For a memory write fault after having updated register state
already, yes. But that can't be the case here. Nevertheless,
being made to look at this again turned up a bug: We need to
set state->fpu_ctrl in order to keep {,hvmemul_}put_fpu()
from trying to replace FIP/FOP/FDP.
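
I.e. something along the lines of (exact placement still to be
verified)

            state->fpu_ctrl = true;

in the new Grp15 fxsave/fxrstor handling, alongside the ops->blk()
invocation.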

Jan
