[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V5] x86/hvm: fix domain crash when CR3 has the noflush bit set



On 02/27/2018 04:19 PM, Razvan Cojocaru wrote:
> On 02/27/2018 05:53 PM, George Dunlap wrote:
>> On 02/23/2018 07:29 AM, Razvan Cojocaru wrote:
>>> On 02/23/2018 06:53 AM, Tian, Kevin wrote:
>>>>> From: Razvan Cojocaru [mailto:rcojocaru@xxxxxxxxxxxxxxx]
>>>>> Sent: Friday, February 16, 2018 6:22 PM
>>>>>
>>>>> The emulation layers of Xen lack PCID support, and as we only offer
>>>>> PCID to HAP guests, all writes to CR3 are handled by hardware,
>>>>> except when introspection is involved. Consequently, trying to set
>>>>> CR3 when the noflush bit is set in hvm_set_cr3() leads to domain
>>>>> crashes. The workaround is to clear the noflush bit in
>>>>> hvm_set_cr3(). CR3 values in hvm_monitor_cr() are also sanitized.
>>>>> Additionally, a bool parameter now propagates to
>>>>> {svm,vmx}_update_guest_cr(), so that no flushes occur when
>>>>> the bit was set.
>>>>
>>>> Above message is not very clear for people who didn't follow
>>>> previous discussions, e.g. why lacking PCID support in emulation 
>>>> layer would lead to domain crash? and why noflush trick can 
>>>> avoid the situation? Can you help elaborate it?
>>>
>>> Lacking PCID support in the emulation layer creates two different way of
>>> handling the NOFLUSH being set: one is in hardware, and this happens for
>>> everything except the introspection case, and one in the emulation layer
>>> (this happens when an introspection agent asks Xen to emulate an
>>> instruction when it replies to an EPT fault vm_event).
>>>
>>> The checks in place expected the guest state to be correct with regard
>>> to handling the bit being set "the hardware way", but the emulation
>>> layer was, previous to this patch, completely ignoring the NOFLUSH bit.
>>> Hence, there was a difference between the expected domain state and the
>>> actual domain state, which translated into a domain crash.
>>
>> Just getting up to speed on this -- is it the case that the hardware
>> will automatically clear this bit; so because it wasn't being cleared in
>> hvm_set_cr3() during emulation, one of the checks in Xen (on exiting the
>> emulator?) was failing due to the bit being set, and then crashing the
>> domain?
> 
> Yes, I believe that that is what happens. The check is done on VMENTRY,
> this is the original email reporting the issue showing it:
> 
> https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg02275.html
> 
>> In which case you're not so much fixing a domain crash by providing a
>> workaround, as actually implementing a bit of functionality.
>>
>> If so I think the commit message could use expanding.  What about
>> something like the following (assuming I haven't misunderstood what's
>> going on):
>>
>> ---
>> In hardware, when PCID support is enabled and the X86_CR3_NOFLUSH bit is
>> set when writing a CR3 value, the hardware will clear that that bit and
>> change the CR3 without flushing the TLB.  hvm_set_cr3(), however, was
>> ignoring this bit; the result was that post-emulation checks detected an
>> invalid CR3 value and crashed the domain.
>>
>> Handle X86_CR3_NOFLUSH in hvm_set_cr3() by:
>> 1. Clearing the bit
>> 2. Passing a "noflush" flag to lower-level cr3 setting functions to
>> indicate that a flush should not be performed.
>>
>> Also clear X86_CR3_NOFLUSH when reporting CR3 monitored CR3 writes.
>>
>> This allows introspection to be used on VMs whose operating system uses
>> the NOFLUSH bit.
>> ---
> 
> Fair enough, I'm happy to change the commit message if nobody else
> objects / has other changes in mind.
> 
>> As an aside -- are you sure clearing the NOFLUSH from reported CR3
>> values during introspection is the right thing to do?  You don't think
>> your introspection engine will ever want to know if the guest OS is
>> setting this bit?
> 
> We can't be sure this will never be useful to know, but at least for now
> I've not seen any requests to be able to, and our introspection engine
> is not interested in the information (in fact, one of the reasons why
> we've even missed the problem until it's been reported is that we
> haven't even been subscribing to CR3 write events for a while now).
> 
> So as far as we're concerned, losing the information about the NOFLUSH
> bit is no problem at all (and it's definitely preferable to a domain
> crash). Since Tamas has acked the patch, it's safe to assume that they
> have no problem with it either, and Bitweasil seemed happy with the
> solution as well.
> 
> I suppose we can always write a patch later if it turns out that this is
> valuable information.

Well if you want to maintain backwards compatibility, you'd need to
either have the bit opt-in, or pass the noflush bit back somewhere else
(either with a flag or with a different part of the struct).

If everyone is happy with it I don't mind.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.