
Re: [Xen-devel] VPMU interrupt unreliability



On Mon, Jul 24, 2017 at 8:07 AM, Boris Ostrovsky
<boris.ostrovsky@xxxxxxxxxx> wrote:
>
>>> One thing I noticed is that the workaround doesn't appear to be
>>> complete: it is only checking PMC0 status and not other counters (fixed
>>> or architectural). Of course, without knowing what the actual problem
>>> was it's hard to say whether this was intentional.
>> handle_pmc_quirk appears to loop through all the counters ...
>
> Right, I didn't notice that it is shifting MSR_CORE_PERF_GLOBAL_STATUS
> value one by one and so it is looking at all bits.
>
>>
>>>> 2. Intercepting MSR loads for counters that have the workaround
>>>> applied and giving the guest the correct counter value.
>>>
>>> We'd have to keep track of whether the counter has been reset (by the
>>> quirk) since the last MSR write.
>> Yes.
>>
>>>> 3. Or perhaps even changing the workaround to disable the PMI on that
>>>> counter until the guest acks via GLOBAL_OVF_CTRL, assuming that works
>>>> on the relevant hardware.
>>> MSR_CORE_PERF_GLOBAL_OVF_CTRL is written immediately after the quirk
>>> runs (in core2_vpmu_do_interrupt()) so we already do this, don't we?
>> I'm suggesting waiting until the *guest* writes to the (virtualized)
>> GLOBAL_OVF_CTRL.
>
> Wouldn't it be better to wait until the counter is reloaded?

Maybe!  I haven't thought it through much.  It's still not clear to
me whether MSR_CORE_PERF_GLOBAL_OVF_CTRL actually controls the
interrupt in any way, or whether it just resets the bits in
MSR_CORE_PERF_GLOBAL_STATUS and acking the interrupt on the APIC is
all that's required to re-enable it.
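
For reference, the bit-by-bit scan of the status value discussed above
can be sketched roughly as follows.  This is only a model under stated
assumptions, not the Xen code: overflowed_counters(), nr_gp, nr_fixed
and FIXED_CTR_SHIFT are names made up for illustration, and nothing
here reads or writes a real MSR.  It assumes the architectural layout
in which general-purpose counter overflow bits start at bit 0 and
fixed-counter overflow bits start at bit 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; this models the scan only. */
#define FIXED_CTR_SHIFT 32  /* assumed: fixed-counter bits start at bit 32 */

/*
 * Return a mask of the counters whose overflow bit is set in a
 * GLOBAL_STATUS-style value.  Bits 0..nr_gp-1 of the result stand for
 * general-purpose counters, the next nr_fixed bits for fixed counters.
 */
static uint64_t overflowed_counters(uint64_t status,
                                    unsigned int nr_gp,
                                    unsigned int nr_fixed)
{
    uint64_t found = 0;
    uint64_t val;
    unsigned int i;

    assert(nr_gp <= 32 && nr_fixed <= 32);

    /* Shift the low half right one bit per general-purpose counter. */
    val = status;
    for (i = 0; i < nr_gp; i++, val >>= 1)
        if (val & 1)
            found |= (uint64_t)1 << i;

    /* Same walk over the fixed-counter bits in the high half. */
    val = status >> FIXED_CTR_SHIFT;
    for (i = 0; i < nr_fixed; i++, val >>= 1)
        if (val & 1)
            found |= (uint64_t)1 << (nr_gp + i);

    return found;
}
```

A per-counter "reset by the quirk since the last MSR write" flag, as
in option 2, could be kept in a mask of the same shape.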

- Kyle

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel