[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH for-4.12 V3] x86/altp2m: fix HVMOP_altp2m_set_domain_state race

On 2/11/19 6:59 PM, Jan Beulich wrote:
>>> Thanks for noticing, actually this appears to invalidate the whole 
>>> purpose of the patch (I should have tested this more before sumbitting 
>>> V3, sorry).
>>> The whole point of the new boolean is to have p2m assigned an altp2m 
>>> regardless of altp2m_active() (hence the change) - which now no longer 
>>> happens. I got carried away with this change.
>>> The fact that this is so easy to get wrong is the reason why I've 
>>> preferred the domain_pause() solution. There appears to be no clean way 
>>> to fix this otherwise, and if this is so easy to misunderstand it'll 
>>> break just as easily with further changes.
>>> I suppose I could just pass the bool along to p2m_get_altp2m() (and 
>>> indeed remove the if())...
>> I think the best that can be done here is to check if altp2m_active() 
>> early in p2m_switch_vcpu_altp2m_by_id() and 
>> p2m_switch_domain_altp2m_by_id(), then bail if it's not active. Since 
>> these are only called by HVMOP_altp2m_* (and thus serialized by the 
>> domain lock), it should be enough WRT HVMOP_altp2m_set_domain_state.
> I'm confused: Where do you see the domain lock used there?
> Plus I can't see p2m_switch_vcpu_altp2m_by_id() called for
> any HVMOP_altp2m_* at all. One of the actual callers is guarded
> by altp2m_active(), but the other isn't.

do_altp2m_op() does d = rcu_lock_domain_by_any_id(a.domain);, and
unlocks it before it exits.

But you're right, p2m_switch_vcpu_altp2m_by_id() is not called for any
HVMOP_altp2m_*, I've misread that. Hence, I believe both callers
ofp2m_switch_vcpu_altp2m_by_id() may race with HVMOP_altp2m_*.

Would you like me to add the altp2m_active() check in both
p2m_switch_domain_altp2m_by_id() and p2m_switch_vcpu_altp2m_by_id(), and
make it harder to race (it still won't be impossible, since the bool may
become false between the check and the actual function logic for
p2m_switch_vcpu_altp2m_by_id(), as you've noticed)?

>> I see no other way out of this (aside from the domain_pause() fix).
> If only that one would have been a complete fix, rather than just a
> partial one.

Agreed, but that one at least clearly fixes the external case, whereas
this doesn't seem to cover all corner cases for any situation.


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.