[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Regression with vcpu runstate info and XEN_RUNSTATE_UPDATE



On 23.09.2019 11:42, Jürgen Groß wrote:
> On 16.09.19 17:44, Jan Beulich wrote:
>> On 16.09.2019 16:50, Andrew Cooper wrote:
>>> After a complicated investigation, it turns out that c/s 2529c850ea48
>>> broke xc_vcpu_getinfo().
>>>
>>> The bug looks as if it is in vcpu_runstate_get(), which doesn't account
>>> for XEN_RUNSTATE_UPDATE and calculating a wildly inappropriate delta.
>>> Ultimately, the result of XEN_DOMCTL_getvcpuinfo ends up very
>>> occasionally with op->u.getvcpuinfo.cpu_time being wrong by 1 << 63.
>>>
>>> Given some of the callers of vcpu_runstate_get(), I don't think it is
>>> reasonable to pause the VCPU while reading the runstate info.  However,
>>> it is also unclear whether waiting for XEN_RUNSTATE_UPDATE to drop in
>>> vcpu_runstate_get() is safe either.
>>
>> First and foremost I'm wondering whether simply masking off
>> XEN_RUNSTATE_UPDATE in vcpu_runstate_get() wouldn't be an
>> option. The assumption of the feature as a whole is for the
>> high bit to never be set in an actual time value, after all.
>>
>> The other option I'd see is for vcpu_runstate_get() to gain
>> a boolean return type by which it would indicate to
>> interested callers whether the latching of the values
>> happened while an update was in progress elsewhere. Callers
>> needing to consume the potentially incorrect result could
>> then choose to wait or schedule a hypercall continuation.
>>
>> The 3rd option (less desirable imo not the least because it
>> would require touching all callers) would be for the function
>> to gain a parameter telling it whether to spin until
>> XEN_RUNSTATE_UPDATE is observed clear.
> 
> And the 4th option would be to let vcpu_runstate_get() operate on a
> local runstate copy in order to avoid setting XEN_RUNSTATE_UPDATE in
> the "official" runstate info of the vcpu.

But it already does - first thing it does is a memcpy() from the
"official" instance to a caller provided buffer. (What is
confusing, at least to me, is that the lock gets dropped last,
when everything after the memcpy() already acts on the copy only.)

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.