Xen project Mailing List

Re: [PATCH v10 1/9] x86emul: address x86_insn_is_mem_{access, write}() omissions

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Fri, 29 May 2020 17:19:58 +0200

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>

Delivery-date: Fri, 29 May 2020 15:20:14 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29.05.2020 17:03, Andrew Cooper wrote: > On 29/05/2020 14:29, Jan Beulich wrote: >> On 29.05.2020 14:18, Andrew Cooper wrote: >>> On 25/05/2020 15:26, Jan Beulich wrote: >>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c >>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c >>>> @@ -11474,25 +11474,87 @@ x86_insn_operand_ea(const struct x86_emu >>>> return state->ea.mem.off; >>>> } >>>> >>>> +/* >>>> + * This function means to return 'true' for all supported insns with >>>> explicit >>>> + * accesses to memory. This means also insns which don't have an explicit >>>> + * memory operand (like POP), but it does not mean e.g. segment selector >>>> + * loads, where the descriptor table access is considered an implicit one. >>>> + */ >>>> bool >>>> x86_insn_is_mem_access(const struct x86_emulate_state *state, >>>> const struct x86_emulate_ctxt *ctxt) >>>> { >>>> + if ( mode_64bit() && state->not_64bit ) >>>> + return false; >>> Is this path actually used? >> Yes, it is. It's only x86_emulate() which has >> >> generate_exception_if(state->not_64bit && mode_64bit(), EXC_UD); >> >> right now. > > Oh. That is a bit awkward. > >>> state->not_64bit ought to fail instruction >>> decode, at which point we wouldn't have a valid state to be used here. >> x86_decode() currently doesn't have much raising of #UD at all, I >> think. If it wasn't like this, the not_64bit field wouldn't be >> needed - it's used only to communicate from decode to execute. >> We're not entirely consistent with this though, seeing in >> x86_decode_onebyte(), a few lines below the block of case labels >> setting that field, >> >> case 0x9a: /* call (far, absolute) */ >> case 0xea: /* jmp (far, absolute) */ >> generate_exception_if(mode_64bit(), EXC_UD); > > This is because there is no legitimate way to determine the end of the > instruction. > > Most of the not_64bit instructions do have a well-defined end, even if > they aren't permitted for use. I wouldn't bet on that. Just look at the re-use of opcode D6 (salc in 32-bit) by L1OM for an extreme example. There's nothing we can say on how any of the reserved opcodes may get re-used. Prior to AVX / AVX512 we'd have estimated C4, C5, and 62 wrongly as well (but that's true independent of execution mode). >> We could certainly drop the field and raise #UD during decode >> already, but don't you think we then should do so for all >> encodings that ultimately lead to #UD, e.g. also for AVX insns >> without AVX available to the guest? This would make >> x86_decode() quite a bit larger, as it would then also need to >> have a giant switch() (or something else that's suitable to >> cover all cases). > > I think there is a semantic difference between "we can't parse the > instruction", and "we can parse it, but it's not legitimate in this > context". > > I don't think our exact split is correct, but I don't think moving all > #UD checking into x86_decode() is correct either. Do you have any clear, sufficiently simple rule in mind for where we could draw the boundary? Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.