[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH for-4.12 V3] x86/altp2m: fix HVMOP_altp2m_set_domain_state race

>>> On 11.02.19 at 18:21, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> On 2/11/19 6:59 PM, Jan Beulich wrote:
>>>> Thanks for noticing, actually this appears to invalidate the whole 
>>>> purpose of the patch (I should have tested this more before sumbitting 
>>>> V3, sorry).
>>>> The whole point of the new boolean is to have p2m assigned an altp2m 
>>>> regardless of altp2m_active() (hence the change) - which now no longer 
>>>> happens. I got carried away with this change.
>>>> The fact that this is so easy to get wrong is the reason why I've 
>>>> preferred the domain_pause() solution. There appears to be no clean way 
>>>> to fix this otherwise, and if this is so easy to misunderstand it'll 
>>>> break just as easily with further changes.
>>>> I suppose I could just pass the bool along to p2m_get_altp2m() (and 
>>>> indeed remove the if())...
>>> I think the best that can be done here is to check if altp2m_active() 
>>> early in p2m_switch_vcpu_altp2m_by_id() and 
>>> p2m_switch_domain_altp2m_by_id(), then bail if it's not active. Since 
>>> these are only called by HVMOP_altp2m_* (and thus serialized by the 
>>> domain lock), it should be enough WRT HVMOP_altp2m_set_domain_state.
>> I'm confused: Where do you see the domain lock used there?
>> Plus I can't see p2m_switch_vcpu_altp2m_by_id() called for
>> any HVMOP_altp2m_* at all. One of the actual callers is guarded
>> by altp2m_active(), but the other isn't.
> do_altp2m_op() does d = rcu_lock_domain_by_any_id(a.domain);, and
> unlocks it before it exits.

But that's not the "domain lock". This is just making sure the domain
won't disappear behind your back.

> But you're right, p2m_switch_vcpu_altp2m_by_id() is not called for any
> HVMOP_altp2m_*, I've misread that. Hence, I believe both callers
> ofp2m_switch_vcpu_altp2m_by_id() may race with HVMOP_altp2m_*.
> Would you like me to add the altp2m_active() check in both
> p2m_switch_domain_altp2m_by_id() and p2m_switch_vcpu_altp2m_by_id(), and
> make it harder to race (it still won't be impossible, since the bool may
> become false between the check and the actual function logic for
> p2m_switch_vcpu_altp2m_by_id(), as you've noticed)?

Well, altp2m being Tech Preview, any partial fix _could_ do in
principle. But personally I dislike such half hearted approaches,
and hence I'd recommend to address the issue in full, i.e.
eliminate race potential altogether. In the end you're going to
be bitten more than me by not getting this into proper shape.


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.