[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v10 1/9] x86emul: address x86_insn_is_mem_{access, write}() omissions



On 29.05.2020 17:03, Andrew Cooper wrote:
> On 29/05/2020 14:29, Jan Beulich wrote:
>> On 29.05.2020 14:18, Andrew Cooper wrote:
>>> On 25/05/2020 15:26, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>>> @@ -11474,25 +11474,87 @@ x86_insn_operand_ea(const struct x86_emu
>>>>      return state->ea.mem.off;
>>>>  }
>>>>  
>>>> +/*
>>>> + * This function means to return 'true' for all supported insns with 
>>>> explicit
>>>> + * accesses to memory.  This means also insns which don't have an explicit
>>>> + * memory operand (like POP), but it does not mean e.g. segment selector
>>>> + * loads, where the descriptor table access is considered an implicit one.
>>>> + */
>>>>  bool
>>>>  x86_insn_is_mem_access(const struct x86_emulate_state *state,
>>>>                         const struct x86_emulate_ctxt *ctxt)
>>>>  {
>>>> +    if ( mode_64bit() && state->not_64bit )
>>>> +        return false;
>>> Is this path actually used?
>> Yes, it is. It's only x86_emulate() which has
>>
>>     generate_exception_if(state->not_64bit && mode_64bit(), EXC_UD);
>>
>> right now.
> 
> Oh.  That is a bit awkward.
> 
>>> state->not_64bit ought to fail instruction
>>> decode, at which point we wouldn't have a valid state to be used here.
>> x86_decode() currently doesn't have much raising of #UD at all, I
>> think. If it wasn't like this, the not_64bit field wouldn't be
>> needed - it's used only to communicate from decode to execute.
>> We're not entirely consistent with this though, seeing in
>> x86_decode_onebyte(), a few lines below the block of case labels
>> setting that field,
>>
>>     case 0x9a: /* call (far, absolute) */
>>     case 0xea: /* jmp (far, absolute) */
>>         generate_exception_if(mode_64bit(), EXC_UD);
> 
> This is because there is no legitimate way to determine the end of the
> instruction.
> 
> Most of the not_64bit instructions do have a well-defined end, even if
> they aren't permitted for use.

I wouldn't bet on that. Just look at the re-use of opcode D6
(salc in 32-bit) by L1OM for an extreme example. There's nothing
we can say on how any of the reserved opcodes may get re-used.
Prior to AVX / AVX512 we'd have estimated C4, C5, and 62 wrongly
as well (but that's true independent of execution mode).

>> We could certainly drop the field and raise #UD during decode
>> already, but don't you think we then should do so for all
>> encodings that ultimately lead to #UD, e.g. also for AVX insns
>> without AVX available to the guest? This would make
>> x86_decode() quite a bit larger, as it would then also need to
>> have a giant switch() (or something else that's suitable to
>> cover all cases).
> 
> I think there is a semantic difference between "we can't parse the
> instruction", and "we can parse it, but it's not legitimate in this
> context".
> 
> I don't think our exact split is correct, but I don't think moving all
> #UD checking into x86_decode() is correct either.

Do you have any clear, sufficiently simple rule in mind for where
we could draw the boundary?

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.