Re: [Xen-devel] Xen HVM regression on certain Intel CPUs

On 28.03.2013 17:39, Stefan Bader wrote:
> On 28.03.2013 16:02, Stefan Bader wrote:
>> On 28.03.2013 14:34, Jan Beulich wrote:
>>>>>> On 27.03.13 at 18:23, "H. Peter Anvin" <hpa@xxxxxxxxx> wrote:
>>>> On 03/27/2013 10:17 AM, Stefan Bader wrote:
>>>>>> What does x86info and /proc/cpuinfo show in HVM?
>>>>> x86info cpuid[7].ebx = 0xbbb and /proc/cpuinfo also shows smep
>>>>> set.
>>>> On all CPUs?
>>>>>> The inbound %cr4 shouldn't matter at all, we try to not rely on
>>>>>> it.
>>>>>> If the hypervisor presents SMEP to the guest then the guest is
>>>>>> pretty obviously going to try to use it.
>>>>> To me it looks like when bootstrapping the APs things are not yet
>>>>> ready to use it. If I did not miss something, the only place that
>>>>> the saved contents of cr4 are used is in startup_32 when the cpus
>>>>> are brought up. And then just stop dead. Would need to read more
>>>>> code but a bit weird why the BP is not affected.
>>>> This feels like a bug in Xen, but I don't know for sure yet.  Either
>>>> which way, it is odd.  That write to cr4 should be entirely legitimate.
>>> And I would guess one that got fixed already.
>>> Stefan, please try 4.2.2-rc1, or (separately)
>>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=485f374230d39e153d7b9786e3d0336bd52ee661
>>> (which I think requires the immediately preceding
>>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=1e6275a95d3e35a72939b588f422bb761ba82f6b
>>> too).
>> The backing explanation does make a lot of sense in reasoning what is going
>> wrong. Unfortunately the two patches above on their own do not fix the
>> problem (I will try to make another go with 4.2.2-rc1).
> The whole of 4.2.2-rc1 has the same (smep still present in
> trampoline_cr4_features) outcome.
>> For a bit more info I am running a kernel inside the HVM guest which shows
>> the contents of the cr4 shadow used in the trampoline. Out of interest I
>> compared those values to the ones used on a bare metal boot and both are
>> identical (0x1407F0).
>> That somehow gives some explanation for the patch above failing. Looking at
>> the code for cr4 updates in vmx_update_guest_cr() a few lines above the new
>> SMEP handling, there already was code which would clear the PAE flag when
>> paging_mode_hap(v->domain) was true. And that would need to be true if the
>> flag should get cleared. And the PAE flag was (and has to be) set before.
>> Will be looking into this further.
> Going back to gather more info and to find some fix.
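
For reference, the two raw values quoted above can be decoded mechanically.
The following is only a minimal sketch in C, assuming the architectural CR4
and CPUID bit positions from the Intel SDM; cr4_describe() and the flag table
are made-up helper names for illustration and are not part of Xen or Linux.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural CR4 bit positions (Intel SDM); names follow Linux/Xen. */
#define X86_CR4_PSE        (1u << 4)
#define X86_CR4_PAE        (1u << 5)
#define X86_CR4_MCE        (1u << 6)
#define X86_CR4_PGE        (1u << 7)
#define X86_CR4_PCE        (1u << 8)
#define X86_CR4_OSFXSR     (1u << 9)
#define X86_CR4_OSXMMEXCPT (1u << 10)
#define X86_CR4_OSXSAVE    (1u << 18)
#define X86_CR4_SMEP       (1u << 20)

/* CPUID.(EAX=7,ECX=0):EBX bit 7 advertises SMEP. */
#define CPUID7_EBX_SMEP    (1u << 7)

struct cr4_flag {
    uint32_t mask;
    const char *name;
};

/* Only the bits relevant to this thread; ordered from low to high. */
static const struct cr4_flag cr4_flags[] = {
    { X86_CR4_PSE, "PSE" },       { X86_CR4_PAE, "PAE" },
    { X86_CR4_MCE, "MCE" },       { X86_CR4_PGE, "PGE" },
    { X86_CR4_PCE, "PCE" },       { X86_CR4_OSFXSR, "OSFXSR" },
    { X86_CR4_OSXMMEXCPT, "OSXMMEXCPT" },
    { X86_CR4_OSXSAVE, "OSXSAVE" }, { X86_CR4_SMEP, "SMEP" },
};

/*
 * Hypothetical helper: write the names of the known bits set in cr4 into
 * buf, separated by spaces. Returns the bits that are not in the table.
 */
static uint32_t cr4_describe(uint32_t cr4, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    if (len > 0)
        buf[0] = '\0';

    for (i = 0; i < sizeof(cr4_flags) / sizeof(cr4_flags[0]); i++) {
        if (!(cr4 & cr4_flags[i].mask))
            continue;
        if (used < len) {
            int n = snprintf(buf + used, len - used, "%s%s",
                             used ? " " : "", cr4_flags[i].name);
            if (n > 0)
                used += (size_t)n;
        }
        cr4 &= ~cr4_flags[i].mask;   /* mark this bit as accounted for */
    }
    return cr4;
}
```

Called with 0x1407F0, cr4_describe() lists PSE PAE MCE PGE PCE OSFXSR
OSXMMEXCPT OSXSAVE SMEP and returns 0, so SMEP (bit 20) is part of the value
the trampoline loads. Likewise 0xbbb & CPUID7_EBX_SMEP is non-zero, which
matches smep showing up in /proc/cpuinfo.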

I added some more debugging output to the hypervisor to verify the state of HAP.
This showed that while HAP is available on the system, it is not used for the
HVM guests. It looks like using it would require some flags to be set when the
guest domains are created, and I assume this is not happening because I have to
stay with the xm stack for the libvirt setup for now (requires some repackaging
which hasn't been done yet).

So the guest isn't using HAP, but Xen still seems to run it with some form of
paging enabled even while the guest VCPU itself is not using paging. I changed
vmx_update_guest_cr() to clear SMEP whenever the guest VCPU is in non-paging
mode, regardless of HAP, and that seems to prevent the hangs. Does this look
like a reasonable upstream Xen change?

From eccbc4cf0916c6d4388f658965c79770bd0ba10f Mon Sep 17 00:00:00 2001
From: Stefan Bader <stefan.bader@xxxxxxxxxxxxx>
Date: Wed, 3 Apr 2013 12:06:24 +0200
Subject: [PATCH] VMX: Always disable SMEP when guest is in non-paging mode

commit e7dda8ec9fc9020e4f53345cdbb18a2e82e54a65
  VMX: disable SMEP feature when guest is in non-paging mode

disabled the SMEP bit if a guest VCPU was using HAP and was not
in paging mode. However I could observe VCPUs getting stuck in
the trampoline after the following patch in the Linux kernel
changed the way CR4 gets set up:
  x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline

The change will set CR4 from already set flags, which include the
SMEP bit. On bare metal this does not matter as the CPU is in
non-paging mode at that time. But Xen seems to use the emulated
non-paging mode regardless of HAP (I verified that HAP was not used
on the guests where I was seeing the issue).

Therefore it seems right to unset the SMEP bit for a VCPU that is
not in paging mode, regardless of its HAP usage.

Signed-off-by: Stefan Bader <stefan.bader@xxxxxxxxxxxxx>
---
 xen/arch/x86/hvm/vmx/vmx.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 04dbefb..a869ed4 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1161,13 +1161,16 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
         if ( paging_mode_hap(v->domain) && !hvm_paging_enabled(v) )
         {
             v->arch.hvm_vcpu.hw_cr[4] |= X86_CR4_PSE;
             v->arch.hvm_vcpu.hw_cr[4] &= ~X86_CR4_PAE;
+        }
+        if ( !hvm_paging_enabled(v) )
+        {
             /*
              * SMEP is disabled if CPU is in non-paging mode in hardware.
              * However Xen always uses paging mode to emulate guest non-paging
-             * mode with HAP. To emulate this behavior, SMEP needs to be
-             * manually disabled when guest switches to non-paging mode.
+             * mode. To emulate this behavior, SMEP needs to be manually
+             * disabled when guest VCPU is in non-paging mode.
              */
             v->arch.hvm_vcpu.hw_cr[4] &= ~X86_CR4_SMEP;
         }
         __vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
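
To illustrate the intended behavior change outside of the hypervisor, here is
a small stand-alone model in C. It is only a sketch: struct vcpu_model and both
functions are hypothetical stand-ins for paging_mode_hap(),
hvm_paging_enabled() and the CR4 fixups in vmx_update_guest_cr(), and
everything else the real function does is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR4 bit positions (Intel SDM). */
#define X86_CR4_PSE  (1u << 4)
#define X86_CR4_PAE  (1u << 5)
#define X86_CR4_SMEP (1u << 20)

/* Hypothetical, simplified stand-in for the state the real code consults. */
struct vcpu_model {
    bool hap;            /* models paging_mode_hap(v->domain) */
    bool guest_paging;   /* models hvm_paging_enabled(v) */
};

/* Before the patch: SMEP is only cleared when HAP is in use. */
static uint32_t hw_cr4_before(const struct vcpu_model *v, uint32_t cr4)
{
    if (v->hap && !v->guest_paging) {
        cr4 |= X86_CR4_PSE;
        cr4 &= ~X86_CR4_PAE;
        cr4 &= ~X86_CR4_SMEP;
    }
    return cr4;
}

/* After the patch: SMEP is cleared for any non-paging guest VCPU. */
static uint32_t hw_cr4_after(const struct vcpu_model *v, uint32_t cr4)
{
    if (v->hap && !v->guest_paging) {
        cr4 |= X86_CR4_PSE;
        cr4 &= ~X86_CR4_PAE;
    }
    if (!v->guest_paging)
        cr4 &= ~X86_CR4_SMEP;
    return cr4;
}
```

For a guest without HAP and with paging disabled, hw_cr4_before() leaves
0x1407F0 untouched, while hw_cr4_after() yields 0x0407F0 (SMEP cleared). With
HAP both variants give 0x0407D0, and once the guest has paging enabled neither
variant modifies the value.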

Attachment: 0001-VMX-Always-disable-SMEP-when-guest-is-in-non-paging-.patch
Description: Text Data
