Xen project Mailing List

Re: [Xen-devel] [PATCH for-4.13] x86/vmx: always sync PIR to IRR before vmentry

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Mon, 18 Nov 2019 15:00:00 +0100

Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Joe Jin <joe.jin@xxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Mon, 18 Nov 2019 14:00:14 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 18.11.2019 14:46, Roger Pau Monné wrote: > On Mon, Nov 18, 2019 at 01:01:58PM +0100, Jan Beulich wrote: >> On 18.11.2019 11:16, Roger Pau Monne wrote: >>> When using posted interrupts on Intel hardware it's possible that the >>> vCPU resumes execution with a stale local APIC IRR register because >>> depending on the interrupts to be injected vlapic_has_pending_irq >>> might not be called, and thus PIR won't be synced into IRR. >>> >>> Fix this by making sure PIR is always synced to IRR in vmx_intr_assist >>> regardless of what interrupts are pending. >> >> For this part, did you consider pulling ahead to the beginning >> of hvm_vcpu_has_pending_irq() its call to vlapic_has_pending_irq()? > > I assumed the order in hvm_vcpu_has_pending_irq is there for a reason. > I could indeed move vlapic_has_pending_irq to the top, but then either > the result is discarded if for example a NMI is pending injection > (in which case there's no need to go through all the logic in > vlapic_has_pending_irq), or we invert the priority of event > injection. Changing the order of events injected is not an option afaict. The pointless processing done is a valid concern, yet the suggestion was specifically to have (part of) this processing to occur early. The discarding of the result, in turn, is not a problem afaict, as a subsequent call will return the same result (unless a higher priority interrupt has surfaced in the meantime). > I have to admit I have doubts about the code in > hvm_vcpu_has_pending_irq. I'm not sure what's the motivation for > example to give priority to HVMIRQ_callback_vector over other vectors > from the lapic. I vaguely recall there being a reason, but I guess it would take some git archaeology to find out. >> I ask because ... >> >>> --- a/xen/arch/x86/hvm/vmx/intr.c >>> +++ b/xen/arch/x86/hvm/vmx/intr.c >>> @@ -232,6 +232,14 @@ void vmx_intr_assist(void) >>> enum hvm_intblk intblk; >>> int pt_vector; >>> >>> + if ( cpu_has_vmx_posted_intr_processing ) >>> + /* >>> + * Always force PIR to be synced to IRR before vmentry, this is >>> also >>> + * done by vlapic_has_pending_irq but it's possible other pending >>> + * interrupts prevent the execution of that function. >>> + */ >>> + vmx_sync_pir_to_irr(v); >> >> ... this addition looks more like papering over some issue than >> actually taking care of it. > > Xen needs to make sure PIR is synced to IRR before entering > non-root mode. I could place the call somewhere else, or alternatively > Xen could disable interrupts, send a self-ipi with the posted vector > and enter non-root mode. That should IMO force a resync of PIR to IRR > when resuming vCPU execution, but is overly complicated. Indeed, further complicating things can't be the goal. But finding the most suitable place to make the call might still be worthwhile. >> Then again I wonder whether the PIR->IRR sync is actually >> legitimate to perform when v != current. > > IMO this is fine as long as the vCPU is not running, as in that case > the hardware is not in control of IRR. Here and ... >> If it's not, then there >> might be a wider set of problems (see e.g. >> hvm_local_events_need_delivery()). But of course the adjustment >> to hvm_vcpu_has_pending_irq() could also be to make the call >> early only when v == current. > > I don't think we should be that restrictive, v == current || > !vcpu_runable(v) ought to be safe. I've also forgot to send my > pre-patch to introduce an assert to that effect: > > https://lists.xenproject.org/archives/html/xen-devel/2019-11/msg00635.html > >> A last question is that on the consequences of overly aggressive >> sync-ing - that'll harm performance, but shouldn't affect >> correctness if I'm not mistaken. > > That's correct, as long as the vcpu is the current one or it's not > running. ... here I continue to be worried of races: Any check for a vCPU to be non-running (or non-runnable) is stale the moment you inspect the result of the check. Unless, of course, you suppress scheduling (actions potentially making a vCPU runnable) during that time window. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.