[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH v2 for-4.15] x86/msr: introduce an option for HVM relaxed rdmsr behavior


  • To: <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Date: Thu, 4 Mar 2021 15:47:55 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=L2pdlnYkK6NZGcE+3Ae/ki7kRac06cOdBxcW+JVkJzA=; b=MLCSzK8wBu0y9pADBQwLIDPWBjuGAYlGEYBKZlc7zEx4GseRhqwbFcD7vUwO6YF1zJe1ksWLSWLKxDmh84OFc1c5oNvy8gq3AtUrvTKCBBp9On1x6YrpiGjJ9CT68JRpMTWi6b/egYzDVTPh2dNl6fgImceMPkueBhmZTnB7ZLKZFe8a4Nz5YwO+5ew1GKw2i0Prc/tJ4bGyvcaEAM/V7N2AW+3t6zPcVm1RatBt2R093desa00dkyFwLLTPDqELZnRxRdein1wPuTibIcug8CRn/U3VoQYXLvxIwkpJ1/GemK7ngpTTksLTgAQcxKJ4Ks8ugV1rm5yu0RrZ6od8sA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=S2WPJsKNvje7T5XikABuU8YAZpOYC4VbzqShiUdY8pUT1wjRbKInFhwPR6Sk9NXmTRUXVDXzpYgnTUGXoU7u9KJlxY9mpLb8YX+htEB0ebbf5OotHZ5a3SRle1cNiOadLYoMMIfhEqnoeB4N2P0Vu0+qd4KGqZ67vXD3D638UPd61iNACx5b6aAue2BlODQv5amN0A9ngGjNSsIV0WsUR9xwzHfGSLDLXzCgysl111JqGxCMd8A30sUMbLSOKTycWFjbA9hRM2m/XSZRD6ibH75xpoCRzfCwVDdiVpE7lB7Ek4RTY90pMzN2/WCH0vrAYTI8sIdU9uEbZfEPYGYsGA==
  • Authentication-results: esa5.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
  • Delivery-date: Thu, 04 Mar 2021 14:53:46 +0000
  • Ironport-sdr: M4CVk+yExLYCaAbS8v9iEsAcaxFgPQrqcoCJF9NjApPMFMp/IpLluTNTxljrSO00F621HEVi0/ KmjU99aE/7mWKr0pyOAfJdCUeIEUO6rxfxMV1YlP9U8eHN2NcLXS7AKt22dw2C1K6gf4N0Cr2w I6j3DHvQgCP0x6Xzb2R/KW2GhMBvSEkiwFHuWdoN8nq5E5jI8+dCLsytUfi1k05Lc1VerdHSl2 lu5WSaC7Uay9WuT9NwQgO4hiou5/L2Y7M3xxjI3nprpl+8BEvxSJx4xI5XamfDNEkXXjlKKk0v cMw=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Introduce an option to allow selecting a less strict behaviour for
rdmsr accesses targeting a MSR not explicitly handled by Xen. Since
commit 84e848fd7a162f669 accesses to MSRs not explicitly handled by
Xen result in the injection of a #GP to the guest. This is a behavior
change since previously a #GP was only injected if accessing the MSR
on the real hardware will also trigger a #GP.

This commit attempts to offer a fallback option similar to the
previous behavior. Note however that the value of the underlying MSR
is never leaked to the guest, as the newly introduced option only
changes whether a #GP is injected or not.

Long term the plan is to properly handle all the MSRs, so the option
introduced here should be considered a temporary resort for OSes that
don't work properly with the new MSR policy. Any OS that requires this
option to be enabled should be reported to
xen-devel@xxxxxxxxxxxxxxxxxxxx.

Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
---
Changes since v1:
 - Only apply the option to HVM guests.
 - Only apply the special handling to MSR reads.
 - Sanitize the newly introduced flags field.
 - Print a warning message when the option is used.
---
Cc: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
---
Boris, could you please test with Solaris to see if this fixes the
issue?

I wonder whether we need to to enable this option by default for
guests being migrated from previous Xen versions? Maybe that's not
required as the option is helpful mostly for early boot I would
assume, afterwards an OS should already have the #GP handler setup
when accessing MSRs.

>From a release PoV the biggest risk would be breaking some of the
existing MSR functionality. I think that's a necessary risk in order
to offer such fallback option, or else we might discover after the
release that guests that worked on Xen 4.14 don't work anymore in Xen
4.15.
---
 docs/man/xl.cfg.5.pod.in          | 17 +++++++++++++++++
 tools/include/libxl.h             |  8 ++++++++
 tools/libs/light/libxl_types.idl  |  2 ++
 tools/libs/light/libxl_x86.c      |  4 ++++
 tools/xl/xl_parse.c               |  7 +++++++
 xen/arch/x86/domain.c             | 10 ++++++++++
 xen/arch/x86/hvm/svm/svm.c        |  6 ++++++
 xen/arch/x86/hvm/vmx/vmx.c        |  7 +++++++
 xen/include/asm-x86/hvm/domain.h  |  3 +++
 xen/include/public/arch-x86/xen.h |  8 ++++++++
 10 files changed, 72 insertions(+)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 040374dcd6..62178b9829 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -2861,6 +2861,23 @@ No MCA capabilities in above list are enabled.
 
 =back
 
+=item B<rdmsr_relaxed=BOOLEAN>
+
+Select whether to use a relaxed behavior for read accesses to MSRs not
+explicitly handled by Xen instead of injecting a #GP to the guest.  Such access
+mode will force Xen to replicate the behaviour from the hardware it's currently
+running on in order to decide whether a #GP is injected to the guest.  Note
+that the guest is never allowed to read the value of unhandled MSRs, this
+option only changes whether a #GP might be injected or not.
+
+This option is only relevant for HVM guests, and will be removed in future
+releases once we are certain the default MSR access policy has been properly
+tested by a wide variety of guests.  If you need to use this option please send
+a bug report to xen-devel@xxxxxxxxxxxxxxxxxxxx with the details of the guests
+you are running that require it.
+
+=back
+
 =back
 
 =head1 SEE ALSO
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index a7b673e89d..1cc40a2d67 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -495,6 +495,14 @@
  */
 #define LIBXL_HAVE_VMTRACE_BUF_KB 1
 
+/*
+ * LIBXL_HAVE_RDMSR_RELAXED indicates the toolstack has support for switching
+ * the rdmsr handling in the hypervisor to relaxed mode, where #GP is only
+ * injected to guests for unhandled MSRs if accessing the MSR on the physical
+ * hardware also triggers a #GP.
+ */
+#define LIBXL_HAVE_RDMSR_RELAXED 1
+
 /*
  * libxl ABI compatibility
  *
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 5b85a7419f..03b0c80146 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -644,6 +644,8 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("arch_arm", Struct(None, [("gic_version", libxl_gic_version),
                                ("vuart", libxl_vuart_type),
                               ])),
+    ("arch_x86", Struct(None, [("rdmsr_relaxed", libxl_defbool),
+                              ])),
     # Alternate p2m is not bound to any architecture or guest type, as it is
     # supported by x86 HVM and ARM support is planned.
     ("altp2m", libxl_altp2m_mode),
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index 58187ed760..c9cff44088 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -5,9 +5,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
                                       libxl_domain_config *d_config,
                                       struct xen_domctl_createdomain *config)
 {
+    config->arch.domain_flags = 0;
     switch(d_config->c_info.type) {
     case LIBXL_DOMAIN_TYPE_HVM:
         config->arch.emulation_flags = (XEN_X86_EMU_ALL & ~XEN_X86_EMU_VPCI);
+        if (libxl_defbool_val(d_config->b_info.arch_x86.rdmsr_relaxed))
+            config->arch.domain_flags |= XEN_X86_RDMSR_RELAXED;
         break;
     case LIBXL_DOMAIN_TYPE_PVH:
         config->arch.emulation_flags = XEN_X86_EMU_LAPIC;
@@ -809,6 +812,7 @@ void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
                                               libxl_domain_build_info *b_info)
 {
     libxl_defbool_setdefault(&b_info->acpi, true);
+    libxl_defbool_setdefault(&b_info->arch_x86.rdmsr_relaxed, false);
 }
 
 int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 1893cfc086..9f52c7e914 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -2636,6 +2636,13 @@ skip_usbdev:
         xlu_cfg_replace_string (config, "spice_streaming_video",
                                 &b_info->u.hvm.spice.streaming_video, 0);
         xlu_cfg_get_defbool(config, "nographic", &b_info->u.hvm.nographic, 0);
+        if (!xlu_cfg_get_defbool(config, "rdmsr_relaxed",
+                                 &b_info->arch_x86.rdmsr_relaxed, 0))
+            fprintf(stderr,
+                    "WARNING: rdmsr_relaxed will be removed in future 
versions.\n"
+                    "If it fixes an issue you are having please report to "
+                    "xen-devel@xxxxxxxxxxxxxxxxxxxx.\n");
+
         if (!xlu_cfg_get_long(config, "gfx_passthru", &l, 1)) {
             libxl_defbool_set(&b_info->u.hvm.gfx_passthru, l);
         } else if (!xlu_cfg_get_string(config, "gfx_passthru", &buf, 0)) {
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 5e3c94d3fa..c06b17d338 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -760,6 +760,13 @@ int arch_domain_create(struct domain *d,
                d->domain_id);
     }
 
+    if ( config->arch.domain_flags & ~XEN_X86_RDMSR_RELAXED )
+    {
+        printk(XENLOG_G_ERR "d%d: Invalid arch domain flags: %#x\n",
+               d->domain_id, config->arch.domain_flags);
+        return -EINVAL;
+    }
+
     emflags = config->arch.emulation_flags;
 
     if ( is_hardware_domain(d) && is_pv_domain(d) )
@@ -824,6 +831,9 @@ int arch_domain_create(struct domain *d,
     {
         if ( (rc = hvm_domain_initialise(d)) != 0 )
             goto fail;
+
+        d->arch.hvm.rdmsr_relaxed = config->arch.domain_flags &
+                                    XEN_X86_RDMSR_RELAXED;
     }
     else if ( is_pv_domain(d) )
     {
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index b819897a4a..d036809bd3 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1795,6 +1795,7 @@ static int svm_msr_read_intercept(unsigned int msr, 
uint64_t *msr_content)
     const struct domain *d = v->domain;
     struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
     const struct nestedsvm *nsvm = &vcpu_nestedsvm(v);
+    uint64_t tmp;
 
     switch ( msr )
     {
@@ -1965,6 +1966,11 @@ static int svm_msr_read_intercept(unsigned int msr, 
uint64_t *msr_content)
         break;
 
     default:
+        if ( d->arch.hvm.rdmsr_relaxed && !rdmsr_safe(msr, tmp) )
+        {
+            *msr_content = 0;
+            break;
+        }
         gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
         goto gpf;
     }
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index bfea1b0f8a..883e43a0bb 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3123,6 +3123,7 @@ static int is_last_branch_msr(u32 ecx)
 static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 {
     struct vcpu *curr = current;
+    uint64_t tmp;
 
     HVM_DBG_LOG(DBG_LEVEL_MSR, "ecx=%#x", msr);
 
@@ -3204,6 +3205,12 @@ static int vmx_msr_read_intercept(unsigned int msr, 
uint64_t *msr_content)
             break;
         }
 
+        if ( curr->domain->arch.hvm.rdmsr_relaxed && !rdmsr_safe(msr, tmp) )
+        {
+            *msr_content = 0;
+            break;
+        }
+
         gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
         goto gp_fault;
     }
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 7b60e9125f..fdc1b36513 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -122,6 +122,9 @@ struct hvm_domain {
 
     bool_t                 is_s3_suspended;
 
+    /* Don't unconditionally inject #GP for unhandled MSRs reads. */
+    bool rdmsr_relaxed;
+
     /*
      * TSC value that VCPUs use to calculate their tsc_offset value.
      * Used during initialization and save/restore.
diff --git a/xen/include/public/arch-x86/xen.h 
b/xen/include/public/arch-x86/xen.h
index 629cb2ba40..fbf91bf3b9 100644
--- a/xen/include/public/arch-x86/xen.h
+++ b/xen/include/public/arch-x86/xen.h
@@ -304,6 +304,14 @@ struct xen_arch_domainconfig {
                                      XEN_X86_EMU_PIT | XEN_X86_EMU_USE_PIRQ |\
                                      XEN_X86_EMU_VPCI)
     uint32_t emulation_flags;
+
+/*
+ * HVM only: select whether to use a relaxed behavior for read accesses to MSRs
+ * not explicitly handled by Xen instead of injecting a #GP to the guest. Note
+ * this option doesn't allow the guest to read the hardware value.
+ */
+#define XEN_X86_RDMSR_RELAXED       (1u << 0)
+    uint32_t domain_flags;
 };
 
 /* Location of online VCPU bitmap. */
-- 
2.30.1




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.