[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3] x86: do not enable global pages when virtualized on AMD hardware

On 09.12.2019 15:46, Roger Pau Monné wrote:
> On Mon, Dec 09, 2019 at 03:21:28PM +0100, Jan Beulich wrote:
>> On 09.12.2019 11:20, Roger Pau Monné wrote:
>>> On Wed, Dec 04, 2019 at 06:05:11PM +0100, Jan Beulich wrote:
>>>> On 04.12.2019 17:18, Roger Pau Monné wrote:
>>>>> On Wed, Dec 04, 2019 at 05:11:42PM +0100, Jan Beulich wrote:
>>>>>> On 04.12.2019 16:12, Roger Pau Monne wrote:
>>>>>>> @@ -130,7 +143,7 @@ unsigned long pv_make_cr4(const struct vcpu *v)
>>>>>>>       */
>>>>>>>      if ( d->arch.pv.pcid )
>>>>>>>          cr4 |= X86_CR4_PCIDE;
>>>>>>> -    else if ( !d->arch.pv.xpti )
>>>>>>> +    else if ( !d->arch.pv.xpti && opt_global_pages )
>>>>>>>          cr4 |= X86_CR4_PGE;
>>>>>> I'm sorry for noticing this only now, but what about XEN_MINIMAL_CR4,
>>>>>> which includes X86_CR4_PGE?
>>>>> I've tried removing PGE from XEN_MINIMAL_CR4 but it made no noticeable
>>>>> performance difference, so I left it as-is.
>>>> My concern isn't about performance, but correctness. I admit I
>>>> forgot for a moment that we now always write CR4 (unless the
>>>> cached value matches the intended new one). Yet
>>>> mmu_cr4_features (starting out as XEN_MINIMAL_CR4) is still of
>>>> concern.
>>>> I think this at least requires extending the description to
>>>> discuss the correctness.
>>> Would you be fine with adding the following at the end of the commit
>>> message.
>>> "Note that XEN_MINIMAL_CR4 is not modified, and thus global pages are
>>> left enabled for the hypervisor. This is not an issue because the code
>>> to switch the control registers (cr3 and cr4) already takes into
>>> account such situation and performs the necessary flushes. The same
>>> already happens when using XPTI or PCIDE, as the guest cr4 doesn't
>>> have global pages enabled in that case either."
>> Yes, this is good for XEN_MINIMAL_CR4. But I think mmu_cr4_features
>> needs discussing (or at least mentioning, if the same arguments
>> apply) as well.
> The same applies to mmu_cr4_features, it's fine for the hypervisor to
> use a different set of cr4 features (especially PGE) than PV guests:
> this is already the case when using XPTI or PCIDE when Xen cr4 will
> have PGE and guests cr4 won't, and switch_cr3_cr4 DTRT.
> So I would s/XEN_MINIMAL_CR4/XEN_MINIMAL_CR4 and mmu_cr4_features/ in
> the above proposed paragraph.

I'm afraid it's more complicated, up to and including you making a
possible pre-existing bug worse: The S3 resume path loads CR4 from
mmu_cr4_features, but doesn't update the in-memory cache of the
register. Hence some subsequent CR4 writes may wrongly be skipped,
until hitting one where an actual write is necessary. Now that you
play (more) with PGE, the situation is only going to get worse. Of
course I may well simply not have managed to spot the sync-ing of
the cached value. (VMX, otoh, takes special care to keep CPU loaded
value and cache in sync - see the bottom of vmx_do_resume().)


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.