Re: [Xen-devel] [PATCH 9/9] x86/vmx: Don't leak EFER.NXE into guest context



>>> On 25.05.18 at 10:36, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 25/05/2018 08:49, Jan Beulich wrote:
>>>>> On 22.05.18 at 13:20, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> @@ -1650,22 +1641,81 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr,
>>>  
>>>  static void vmx_update_guest_efer(struct vcpu *v)
>>>  {
>>> -    unsigned long vm_entry_value;
>>> +    unsigned long entry_ctls, guest_efer = v->arch.hvm_vcpu.guest_efer,
>>> +        xen_efer = read_efer();
>>> +
>>> +    if ( paging_mode_shadow(v->domain) )
>>> +    {
>>> +        /*
>>> +         * When using shadow pagetables, EFER.NX is a Xen-owned bit and is not
>>> +         * under guest control.
>>> +         */
>>> +        guest_efer &= ~EFER_NX;
>>> +        guest_efer |= xen_efer & EFER_NX;
>>> +
>>> +        /*
>>> +         * At the time of writing (May 2018), the Intel SDM "VM Entry: Checks
>>> +         * on Guest Control Registers, Debug Registers and MSRs" section says:
>>> +         *
>>> +         *  If the "Load IA32_EFER" VM-entry control is 1, the following
>>> +         *  checks are performed on the field for the IA32_MSR:
>>> +         *   - Bits reserved in the IA32_EFER MSR must be 0.
>>> +         *   - Bit 10 (corresponding to IA32_EFER.LMA) must equal the value of
>>> +         *     the "IA-32e mode guest" VM-entry control.  It must also be
>>> +         *     identical to bit 8 (LME) if bit 31 in the CR0 field
>>> +         *     (corresponding to CR0.PG) is 1.
>>> +         *
>>> +         * Experimentally what actually happens is:
>>> +         *   - Checks for EFER.{LME,LMA} apply uniformly whether using the
>>> +         *     GUEST_EFER VMCS controls, or MSR load/save lists.
>>> +         *   - Without EPT, LME being different to LMA isn't tolerated by
>>> +         *     hardware.  As writes to CR0 are intercepted, it is safe to
>>> +         *     leave LME clear at this point, and fix up both LME and LMA when
>>> +         *     CR0.PG is set.
>>> +         */
>>> +        if ( !(guest_efer & EFER_LMA) )
>>> +            guest_efer &= ~EFER_LME;
>>> +    }
>> Why is this latter adjustment done only for shadow mode?
> 
> How should I go about making the comment clearer?
> 
> When EPT is active, hardware is happy with LMA != LME.  When EPT is
> disabled, hardware strictly requires LME == LMA.

Part of my problem may be that "Without EPT" can have two meanings:
Hardware without EPT, or EPT disabled on otherwise capable hardware.

> This particular condition occurs architecturally on the transition into
> long mode, between setting LME and setting CR0.PG, and updating EFER
> controls in the naive way results in a vmentry failure.
> 
> Having spoken to Intel, they agree with my assessment that the docs
> appear to be correct for Gen1 hardware, and stale for Gen2 hardware,
> where fixing this was one of many parts of making Unrestricted Guest work.

This suggests you mean the former, in which case the check really
doesn't belong inside a paging_mode_shadow() conditional.
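
To make the structural point concrete, here is a minimal standalone sketch
(not Xen code) in which the LME/LMA fixup is keyed on its own condition
rather than nested under the shadow-paging one. The function name and both
boolean parameters are hypothetical, and which condition should really drive
strict_lme_lma (hardware capability vs. EPT being in use) is precisely what
is being discussed here. The bit positions are the architectural EFER ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFER bit positions. */
#define EFER_LME (UINT64_C(1) << 8)   /* Long Mode Enable */
#define EFER_LMA (UINT64_C(1) << 10)  /* Long Mode Active */
#define EFER_NX  (UINT64_C(1) << 11)  /* No-Execute Enable */

/*
 * Hypothetical helper: compute the EFER value to hand to hardware.
 *  - shadow_paging:  NX is owned by the hypervisor, so take it from xen_efer.
 *  - strict_lme_lma: hardware rejects LME != LMA, so hide LME until LMA is
 *                    set (i.e. until the guest enables CR0.PG).
 */
static uint64_t sketch_efer_for_vmentry(uint64_t guest_efer, uint64_t xen_efer,
                                        bool shadow_paging, bool strict_lme_lma)
{
    if ( shadow_paging )
    {
        guest_efer &= ~EFER_NX;
        guest_efer |= xen_efer & EFER_NX;
    }

    /* Deliberately outside the shadow_paging conditional. */
    if ( strict_lme_lma && !(guest_efer & EFER_LMA) )
        guest_efer &= ~EFER_LME;

    return guest_efer;
}
```

With this shape, a guest that has set LME but not yet CR0.PG gets LME hidden
whenever strict_lme_lma is true, independently of who owns NX.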

>> After the above adjustments, when guest_efer still matches
>> v->arch.hvm_vcpu.guest_efer, couldn't we disable the MSR read
>> intercept?
> 
> In principle, yes.  We could use load/save lists, as long as we remember to
> recalculate EFER every time CR0 gets modified in the shadow path.
> 
> However, that would be a net performance penalty rather than benefit
> (which is why I've gone to the effort of creating load-only lists).
> 
> In practice, EFER is written at boot and not touched again.  Having
> load/save logic might avoid these vmexits, but at the cost of almost
> every other vmexit needing to keep the guest_efer in sync with the
> load/save list or VMCS field.

I can't seem to connect this to my question about MSR _read_ intercept.

Jan
