Xen project Mailing List

RE: [PATCH v3] x86/HVM: more consistently set I/O completion

To: "'Jan Beulich'" <jbeulich@xxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: Paul Durrant <xadimgnik@xxxxxxxxx>

Date: Fri, 4 Sep 2020 17:17:04 +0100

Cc: "'Andrew Cooper'" <andrew.cooper3@xxxxxxxxxx>, "'Wei Liu'" <wl@xxxxxxx>, 'Roger Pau Monné' <roger.pau@xxxxxxxxxx>, "'Jun Nakajima'" <jun.nakajima@xxxxxxxxx>, "'Kevin Tian'" <kevin.tian@xxxxxxxxx>, "'George Dunlap'" <George.Dunlap@xxxxxxxxxxxxx>

Delivery-date: Fri, 04 Sep 2020 16:16:39 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-index: AQKLGnh8NTAXxwELiAxasezQUK7x/KfvK7uA

> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 27 August 2020 08:09
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Wei Liu <wl@xxxxxxx>;
>  Roger Pau Monné <roger.pau@xxxxxxxxxx>; Paul Durrant <paul@xxxxxxx>;
>  Jun Nakajima <jun.nakajima@xxxxxxxxx>; Kevin Tian <kevin.tian@xxxxxxxxx>;
>  George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> Subject: [PATCH v3] x86/HVM: more consistently set I/O completion
>
> Doing this just in hvm_emulate_one_insn() is not enough.
> hvm_ud_intercept() and hvm_emulate_one_vm_event() can get invoked for
> insns requiring one or more continuations, and at least in principle
> hvm_emulate_one_mmio() could, too. Without proper setting of the field,
> handle_hvm_io_completion() will do nothing completion-wise, and in
> particular the missing re-invocation of the insn emulation paths will
> lead to emulation caching not getting disabled in due course, causing
> the ASSERT() in {svm,vmx}_vmenter_helper() to trigger.
>
> Reported-by: Don Slutz <don.slutz@xxxxxxxxx>
>
> Similar considerations go for the clearing of vio->mmio_access, which
> gets moved as well.
>
> Additionally all updating of vio->mmio_* now gets done dependent upon
> the new completion value, rather than hvm_ioreq_needs_completion()'s
> return value. This is because it is the completion chosen which controls
> what path will be taken when handling the completion, not the simple
> boolean return value. In particular, PIO completion doesn't involve
> going through the insn emulator, and hence emulator state ought to get
> cleared early (or it won't get cleared at all).
>
> The new logic, besides allowing for a caller override for the
> continuation type to be set (for VMX real mode emulation), will also
> avoid setting an MMIO completion when a simpler PIO one will do. This
> is a minor optimization only as a side effect - the behavior is strictly
> needed at least for hvm_ud_intercept(), as only memory accesses can
> successfully complete through handle_mmio(). Care of course needs to be
> taken to correctly deal with "mixed" insns (doing both MMIO and PIO at
> the same time, i.e. INS/OUTS). For this, hvmemul_validate() now latches
> whether the insn being emulated is a memory access, as this information
> is no longer easily available at the point where we want to consume it.
>
> Note that the presence of non-NULL .validate fields in the two ops
> structures in hvm_emulate_one_mmio() was really necessary even before
> the changes here: Without this, passing non-NULL as middle argument to
> hvm_emulate_init_once() is meaningless.
>
> The restrictions on when the #UD intercept gets actually enabled are why
> it was decided that this is not a security issue:
> - the "hvm_fep" option to enable its use is a debugging option only,
> - for the cross-vendor case is considered experimental, even if
>   unfortunately SUPPORT.md doesn't have an explicit statement about
>   this.
> The other two affected functions are
> - hvm_emulate_one_vm_event(), used for introspection,
> - hvm_emulate_one_mmio(), used for Dom0 only,
> which aren't qualifying this as needing an XSA either.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> Tested-by: Don Slutz <don.slutz@xxxxxxxxx>
> ---
> v3: Add comment ahead of _hvm_emulate_one(). Add parentheses in a
>     conditional expr. Justify why this does not need an XSA.
> v2: Make updating of vio->mmio_* fields fully driven by the new
>     completion value.
> ---
> I further think that the entire tail of _hvm_emulate_one() (everything
> past the code changed/added there by this patch) wants skipping in case
> a completion is needed, at the very least for the mmio and realmode
> cases, where we know we'll come back here.
>
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1683,9 +1683,11 @@ static int hvmemul_validate(
>      const struct x86_emulate_state *state,
>      struct x86_emulate_ctxt *ctxt)
>  {
> -    const struct hvm_emulate_ctxt *hvmemul_ctxt =
> +    struct hvm_emulate_ctxt *hvmemul_ctxt =
>          container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
>
> +    hvmemul_ctxt->is_mem_access = x86_insn_is_mem_access(state, ctxt);
> +
>      return !hvmemul_ctxt->validate || hvmemul_ctxt->validate(state, ctxt)
>             ? X86EMUL_OKAY : X86EMUL_UNHANDLEABLE;
>  }
> @@ -2610,8 +2612,13 @@ static const struct x86_emulate_ops hvm_
>      .vmfunc = hvmemul_vmfunc,
>  };
>
> +/*
> + * Note that passing HVMIO_no_completion into this function serves as kind
> + * of (but not fully) an "auto select completion" indicator.
> + */
>  static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
> -    const struct x86_emulate_ops *ops)
> +    const struct x86_emulate_ops *ops,
> +    enum hvm_io_completion completion)
>  {
>      const struct cpu_user_regs *regs = hvmemul_ctxt->ctxt.regs;
>      struct vcpu *curr = current;
> @@ -2642,16 +2649,31 @@ static int _hvm_emulate_one(struct hvm_e
>          rc = X86EMUL_RETRY;
>
>      if ( !hvm_ioreq_needs_completion(&vio->io_req) )
> +        completion = HVMIO_no_completion;

The comment doesn't mention that passing in something other than
HVMIO_no_completion could get overridden. Is that intentional?

> +    else if ( completion == HVMIO_no_completion )
> +        completion = (vio->io_req.type != IOREQ_TYPE_PIO ||
> +                      hvmemul_ctxt->is_mem_access) ? HVMIO_mmio_completion
> +                                                   : HVMIO_pio_completion;
> +
> +    switch ( vio->io_completion = completion )

I thought we tended to avoid assignments in control flow statements.

>      {
> +    case HVMIO_no_completion:
> +    case HVMIO_pio_completion:
>          vio->mmio_cache_count = 0;
>          vio->mmio_insn_bytes = 0;
> +        vio->mmio_access = (struct npfec){};
>          hvmemul_cache_disable(curr);
> -    }
> -    else
> -    {
> +        break;
> +
> +    case HVMIO_mmio_completion:
> +    case HVMIO_realmode_completion:
>          BUILD_BUG_ON(sizeof(vio->mmio_insn) < sizeof(hvmemul_ctxt->insn_buf));
>          vio->mmio_insn_bytes = hvmemul_ctxt->insn_buf_bytes;
>          memcpy(vio->mmio_insn, hvmemul_ctxt->insn_buf, vio->mmio_insn_bytes);
> +        break;
> +
> +    default:
> +        ASSERT_UNREACHABLE();
>      }
>
>      if ( hvmemul_ctxt->ctxt.retire.singlestep )
> @@ -2692,9 +2714,10 @@ static int _hvm_emulate_one(struct hvm_e
>  }
>
>  int hvm_emulate_one(
> -    struct hvm_emulate_ctxt *hvmemul_ctxt)
> +    struct hvm_emulate_ctxt *hvmemul_ctxt,
> +    enum hvm_io_completion completion)
>  {
> -    return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops);
> +    return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops, completion);
>  }
>
>  int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
> @@ -2703,11 +2726,13 @@ int hvm_emulate_one_mmio(unsigned long m
>          .read = x86emul_unhandleable_rw,
>          .insn_fetch = hvmemul_insn_fetch,
>          .write = mmcfg_intercept_write,
> +        .validate = hvmemul_validate,
>      };
>      static const struct x86_emulate_ops hvm_ro_emulate_ops_mmio = {
>          .read = x86emul_unhandleable_rw,
>          .insn_fetch = hvmemul_insn_fetch,
>          .write = mmio_ro_emulated_write,
> +        .validate = hvmemul_validate,
>      };
>      struct mmio_ro_emulate_ctxt mmio_ro_ctxt = { .cr2 = gla };
>      struct hvm_emulate_ctxt ctxt;
> @@ -2727,8 +2752,8 @@ int hvm_emulate_one_mmio(unsigned long m
>      hvm_emulate_init_once(&ctxt, x86_insn_is_mem_write,
>                            guest_cpu_user_regs());
>      ctxt.ctxt.data = &mmio_ro_ctxt;
> -    rc = _hvm_emulate_one(&ctxt, ops);
> -    switch ( rc )
> +
> +    switch ( rc = _hvm_emulate_one(&ctxt, ops, HVMIO_no_completion) )

Again, why move the assignment into the switch statement?

>      {
>      case X86EMUL_UNHANDLEABLE:
>      case X86EMUL_UNIMPLEMENTED:
> @@ -2755,7 +2780,8 @@ void hvm_emulate_one_vm_event(enum emul_
>      switch ( kind )
>      {
>      case EMUL_KIND_NOWRITE:
> -        rc = _hvm_emulate_one(&ctx, &hvm_emulate_ops_no_write);
> +        rc = _hvm_emulate_one(&ctx, &hvm_emulate_ops_no_write,
> +                              HVMIO_no_completion);
>          break;
>      case EMUL_KIND_SET_CONTEXT_INSN: {
>          struct vcpu *curr = current;
> @@ -2776,7 +2802,7 @@ void hvm_emulate_one_vm_event(enum emul_
>          /* Fall-through */
>      default:
>          ctx.set_context = (kind == EMUL_KIND_SET_CONTEXT_DATA);
> -        rc = hvm_emulate_one(&ctx);
> +        rc = hvm_emulate_one(&ctx, HVMIO_no_completion);
>      }
>
>      switch ( rc )
> @@ -2874,6 +2900,8 @@ void hvm_emulate_init_per_insn(
>                                          pfec, NULL) == HVMTRANS_okay) ?
>              sizeof(hvmemul_ctxt->insn_buf) : 0;
>      }
> +
> +    hvmemul_ctxt->is_mem_access = false;
>  }
>
>  void hvm_emulate_writeback(
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -3798,7 +3798,7 @@ void hvm_ud_intercept(struct cpu_user_re
>          return;
>      }
>
> -    switch ( hvm_emulate_one(&ctxt) )
> +    switch ( hvm_emulate_one(&ctxt, HVMIO_no_completion) )
>      {
>      case X86EMUL_UNHANDLEABLE:
>      case X86EMUL_UNIMPLEMENTED:
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -81,20 +81,11 @@ void send_invalidate_req(void)
>  bool hvm_emulate_one_insn(hvm_emulate_validate_t *validate, const char *descr)
>  {
>      struct hvm_emulate_ctxt ctxt;
> -    struct vcpu *curr = current;
> -    struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
>      int rc;
>
>      hvm_emulate_init_once(&ctxt, validate, guest_cpu_user_regs());
>
> -    rc = hvm_emulate_one(&ctxt);
> -
> -    if ( hvm_ioreq_needs_completion(&vio->io_req) )
> -        vio->io_completion = HVMIO_mmio_completion;
> -    else
> -        vio->mmio_access = (struct npfec){};
> -
> -    switch ( rc )
> +    switch ( rc = hvm_emulate_one(&ctxt, HVMIO_no_completion) )
>      {
>      case X86EMUL_UNHANDLEABLE:
>          hvm_dump_emulation_state(XENLOG_G_WARNING, descr, &ctxt, rc);
> --- a/xen/arch/x86/hvm/vmx/realmode.c
> +++ b/xen/arch/x86/hvm/vmx/realmode.c
> @@ -97,15 +97,11 @@ static void realmode_deliver_exception(
>  void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
>  {
>      struct vcpu *curr = current;
> -    struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
>      int rc;
>
>      perfc_incr(realmode_emulations);
>
> -    rc = hvm_emulate_one(hvmemul_ctxt);
> -
> -    if ( hvm_ioreq_needs_completion(&vio->io_req) )
> -        vio->io_completion = HVMIO_realmode_completion;
> +    rc = hvm_emulate_one(hvmemul_ctxt, HVMIO_realmode_completion);

Ok, I guess the override of completion is intentional to deal with this case.
Perhaps expand the comment above _hvm_emulate_one() then.

>
>      if ( rc == X86EMUL_UNHANDLEABLE )
>      {
> --- a/xen/include/asm-x86/hvm/emulate.h
> +++ b/xen/include/asm-x86/hvm/emulate.h
> @@ -48,6 +48,8 @@ struct hvm_emulate_ctxt {
>
>      uint32_t intr_shadow;
>
> +    bool is_mem_access;
> +

Whilst you mention in the commit comment why this is added, I don't see any
consumer of it in this patch. Will that come later?

  Paul

>      bool_t set_context;
>  };
>
> @@ -62,7 +64,8 @@ bool __nonnull(1, 2) hvm_emulate_one_ins
>      hvm_emulate_validate_t *validate,
>      const char *descr);
>  int hvm_emulate_one(
> -    struct hvm_emulate_ctxt *hvmemul_ctxt);
> +    struct hvm_emulate_ctxt *hvmemul_ctxt,
> +    enum hvm_io_completion completion);
>  void hvm_emulate_one_vm_event(enum emul_kind kind,
>      unsigned int trapnr,
>      unsigned int errcode);
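
For readers following the thread, the completion selection in the quoted hunk of _hvm_emulate_one() can be restated as a small standalone C sketch. This is not the Xen code: struct vio_sketch, select_completion() and record_completion() are hypothetical names, the enums are simplified stand-ins for the ones named in the patch, and the emulator-state clearing is reduced to a single field. It is only meant to show the three-way decision (nothing pending, caller override, auto-select between MMIO and PIO), with the assignment kept as its own statement rather than inside the switch, as the review comment suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-ins for the Xen enums named in the patch. */
enum hvm_io_completion {
    HVMIO_no_completion,
    HVMIO_mmio_completion,
    HVMIO_pio_completion,
    HVMIO_realmode_completion,
};

enum ioreq_type { IOREQ_TYPE_PIO, IOREQ_TYPE_COPY };

/* Hypothetical, heavily reduced view of the per-vCPU I/O state. */
struct vio_sketch {
    enum hvm_io_completion io_completion;
    unsigned int mmio_insn_bytes;
    unsigned char mmio_insn[16];
};

/*
 * Mirrors the if/else-if chain in the quoted hunk:
 *  - no pending request: always "no completion", even if the caller
 *    asked for something else;
 *  - caller passed a specific completion (e.g. realmode): keep it;
 *  - otherwise auto-select: PIO only for a pure port access, MMIO for
 *    anything that touches memory (including INS/OUTS).
 */
static enum hvm_io_completion select_completion(
    bool needs_completion, enum ioreq_type type, bool is_mem_access,
    enum hvm_io_completion requested)
{
    if ( !needs_completion )
        return HVMIO_no_completion;

    if ( requested != HVMIO_no_completion )
        return requested;

    return (type != IOREQ_TYPE_PIO || is_mem_access)
           ? HVMIO_mmio_completion : HVMIO_pio_completion;
}

/* Record the choice, then act on it; the assignment precedes the switch. */
static void record_completion(struct vio_sketch *vio,
                              enum hvm_io_completion completion,
                              const unsigned char *insn, unsigned int len)
{
    vio->io_completion = completion;

    switch ( completion )
    {
    case HVMIO_no_completion:
    case HVMIO_pio_completion:
        /* No trip back through the emulator: drop cached insn bytes now. */
        vio->mmio_insn_bytes = 0;
        break;

    case HVMIO_mmio_completion:
    case HVMIO_realmode_completion:
        /* The emulator will be re-entered: keep the insn bytes for it. */
        assert(len <= sizeof(vio->mmio_insn));
        vio->mmio_insn_bytes = len;
        memcpy(vio->mmio_insn, insn, len);
        break;

    default:
        assert(!"unreachable");
    }
}
```

With these stand-ins, a pending PIO request for an instruction that was latched as a memory access (the INS/OUTS case) selects HVMIO_mmio_completion, while the same request for a plain port access selects HVMIO_pio_completion.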
