[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V5] x86/hvm: fix domain crash when CR3 has the noflush bit set

On 02/27/2018 05:53 PM, George Dunlap wrote:
> On 02/23/2018 07:29 AM, Razvan Cojocaru wrote:
>> On 02/23/2018 06:53 AM, Tian, Kevin wrote:
>>>> From: Razvan Cojocaru [mailto:rcojocaru@xxxxxxxxxxxxxxx]
>>>> Sent: Friday, February 16, 2018 6:22 PM
>>>> The emulation layers of Xen lack PCID support, and as we only offer
>>>> PCID to HAP guests, all writes to CR3 are handled by hardware,
>>>> except when introspection is involved. Consequently, trying to set
>>>> CR3 when the noflush bit is set in hvm_set_cr3() leads to domain
>>>> crashes. The workaround is to clear the noflush bit in
>>>> hvm_set_cr3(). CR3 values in hvm_monitor_cr() are also sanitized.
>>>> Additionally, a bool parameter now propagates to
>>>> {svm,vmx}_update_guest_cr(), so that no flushes occur when
>>>> the bit was set.
>>> Above message is not very clear for people who didn't follow
>>> previous discussions, e.g. why lacking PCID support in emulation 
>>> layer would lead to domain crash? and why noflush trick can 
>>> avoid the situation? Can you help elaborate it?
>> Lacking PCID support in the emulation layer creates two different way of
>> handling the NOFLUSH being set: one is in hardware, and this happens for
>> everything except the introspection case, and one in the emulation layer
>> (this happens when an introspection agent asks Xen to emulate an
>> instruction when it replies to an EPT fault vm_event).
>> The checks in place expected the guest state to be correct with regard
>> to handling the bit being set "the hardware way", but the emulation
>> layer was, previous to this patch, completely ignoring the NOFLUSH bit.
>> Hence, there was a difference between the expected domain state and the
>> actual domain state, which translated into a domain crash.
> Just getting up to speed on this -- is it the case that the hardware
> will automatically clear this bit; so because it wasn't being cleared in
> hvm_set_cr3() during emulation, one of the checks in Xen (on exiting the
> emulator?) was failing due to the bit being set, and then crashing the
> domain?

Yes, I believe that that is what happens. The check is done on VMENTRY,
this is the original email reporting the issue showing it:


> In which case you're not so much fixing a domain crash by providing a
> workaround, as actually implementing a bit of functionality.
> If so I think the commit message could use expanding.  What about
> something like the following (assuming I haven't misunderstood what's
> going on):
> ---
> In hardware, when PCID support is enabled and the X86_CR3_NOFLUSH bit is
> set when writing a CR3 value, the hardware will clear that that bit and
> change the CR3 without flushing the TLB.  hvm_set_cr3(), however, was
> ignoring this bit; the result was that post-emulation checks detected an
> invalid CR3 value and crashed the domain.
> Handle X86_CR3_NOFLUSH in hvm_set_cr3() by:
> 1. Clearing the bit
> 2. Passing a "noflush" flag to lower-level cr3 setting functions to
> indicate that a flush should not be performed.
> Also clear X86_CR3_NOFLUSH when reporting CR3 monitored CR3 writes.
> This allows introspection to be used on VMs whose operating system uses
> the NOFLUSH bit.
> ---

Fair enough, I'm happy to change the commit message if nobody else
objects / has other changes in mind.

> As an aside -- are you sure clearing the NOFLUSH from reported CR3
> values during introspection is the right thing to do?  You don't think
> your introspection engine will ever want to know if the guest OS is
> setting this bit?

We can't be sure this will never be useful to know, but at least for now
I've not seen any requests to be able to, and our introspection engine
is not interested in the information (in fact, one of the reasons why
we've even missed the problem until it's been reported is that we
haven't even been subscribing to CR3 write events for a while now).

So as far as we're concerned, losing the information about the NOFLUSH
bit is no problem at all (and it's definitely preferable to a domain
crash). Since Tamas has acked the patch, it's safe to assume that they
have no problem with it either, and Bitweasil seemed happy with the
solution as well.

I suppose we can always write a patch later if it turns out that this is
valuable information.

> In any case, with the updated commit message:
> Acked-by: George Dunlap <george.dunlap@xxxxxxxxxx>

Thank you very much for the review!


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.