[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH for-4.13] x86/vmx: always sync PIR to IRR before vmentry



On 18.11.2019 14:46, Roger Pau Monné  wrote:
> On Mon, Nov 18, 2019 at 01:01:58PM +0100, Jan Beulich wrote:
>> On 18.11.2019 11:16, Roger Pau Monne wrote:
>>> When using posted interrupts on Intel hardware it's possible that the
>>> vCPU resumes execution with a stale local APIC IRR register because
>>> depending on the interrupts to be injected vlapic_has_pending_irq
>>> might not be called, and thus PIR won't be synced into IRR.
>>>
>>> Fix this by making sure PIR is always synced to IRR in vmx_intr_assist
>>> regardless of what interrupts are pending.
>>
>> For this part, did you consider pulling ahead to the beginning
>> of hvm_vcpu_has_pending_irq() its call to vlapic_has_pending_irq()?
> 
> I assumed the order in hvm_vcpu_has_pending_irq is there for a reason.
> I could indeed move vlapic_has_pending_irq to the top, but then either
> the result is discarded if for example a NMI is pending injection
> (in which case there's no need to go through all the logic in
> vlapic_has_pending_irq), or we invert the priority of event
> injection.

Changing the order of events injected is not an option afaict. The
pointless processing done is a valid concern, yet the suggestion
was specifically to have (part of) this processing to occur early.
The discarding of the result, in turn, is not a problem afaict, as
a subsequent call will return the same result (unless a higher
priority interrupt has surfaced in the meantime).

> I have to admit I have doubts about the code in
> hvm_vcpu_has_pending_irq. I'm not sure what's the motivation for
> example to give priority to HVMIRQ_callback_vector over other vectors
> from the lapic.

I vaguely recall there being a reason, but I guess it would take
some git archaeology to find out.

>> I ask because ...
>>
>>> --- a/xen/arch/x86/hvm/vmx/intr.c
>>> +++ b/xen/arch/x86/hvm/vmx/intr.c
>>> @@ -232,6 +232,14 @@ void vmx_intr_assist(void)
>>>      enum hvm_intblk intblk;
>>>      int pt_vector;
>>>  
>>> +    if ( cpu_has_vmx_posted_intr_processing )
>>> +        /*
>>> +         * Always force PIR to be synced to IRR before vmentry, this is 
>>> also
>>> +         * done by vlapic_has_pending_irq but it's possible other pending
>>> +         * interrupts prevent the execution of that function.
>>> +         */
>>> +        vmx_sync_pir_to_irr(v);
>>
>> ... this addition looks more like papering over some issue than
>> actually taking care of it.
> 
> Xen needs to make sure PIR is synced to IRR before entering
> non-root mode. I could place the call somewhere else, or alternatively
> Xen could disable interrupts, send a self-ipi with the posted vector
> and enter non-root mode. That should IMO force a resync of PIR to IRR
> when resuming vCPU execution, but is overly complicated.

Indeed, further complicating things can't be the goal. But
finding the most suitable place to make the call might still be
worthwhile.

>> Then again I wonder whether the PIR->IRR sync is actually
>> legitimate to perform when v != current.
> 
> IMO this is fine as long as the vCPU is not running, as in that case
> the hardware is not in control of IRR.

Here and ...

>> If it's not, then there
>> might be a wider set of problems (see e.g.
>> hvm_local_events_need_delivery()). But of course the adjustment
>> to hvm_vcpu_has_pending_irq() could also be to make the call
>> early only when v == current.
> 
> I don't think we should be that restrictive, v == current ||
> !vcpu_runable(v) ought to be safe. I've also forgot to send my
> pre-patch to introduce an assert to that effect:
> 
> https://lists.xenproject.org/archives/html/xen-devel/2019-11/msg00635.html
> 
>> A last question is that on the consequences of overly aggressive
>> sync-ing - that'll harm performance, but shouldn't affect
>> correctness if I'm not mistaken.
> 
> That's correct, as long as the vcpu is the current one or it's not
> running.

... here I continue to be worried of races: Any check for a vCPU to
be non-running (or non-runnable) is stale the moment you inspect the
result of the check. Unless, of course, you suppress scheduling
(actions potentially making a vCPU runnable) during that time window.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.