[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: XSA-351 causing Solaris-11 systems to panic during boot.



On 11/17/20 3:12 AM, Jan Beulich wrote:
> On 16.11.2020 22:57, Cheyenne Wills wrote:
>> Running Xen with XSA-351 is causing Solaris 11 systems to panic during
>> boot.  The panic screen is showing the failure to be coming from
>> "unix:rdmsr".  The panic occurs with existing guests (booting off a disk)
>> and the  booting from an install ISO image.
>>
>> I discussed the problem with "andyhhp__" in the "#xen" IRC channel and he
>> requested that I report it here.
> Thanks. What we need though is information on the specific MSR(s) that
> will need to have workarounds added: We surely would want to avoid
> blindly doing this for all that the XSA change disallowed access to.
> Reproducing the panic screen here might already help; proper full logs
> would be even better.


We hit this issue today so I poked a bit around Solaris code.


It definitely reads MSR_RAPL_POWER_UNIT unguarded during boot.


In addition, it may read MSR_*_ENERGY_STATUS when running kstat. I haven't been 
able to trigger those reads (I didn't have access to the system myself and with 
neither me nor the tester remembering much about Solaris we only tried some 
basic commands).


The patch below lets Solaris guest boot on OVM. Our codebase is somewhat 
different from stable branches but if this is an acceptable workaround I will 
send proper patch for stable. I won't be able to test it though.


Author: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
Date:   Wed Dec 16 17:19:07 2020 -0500

    x86/msr: Allow read access to some RAPL MSRs
   
    XSA-351 limited access to RAPL-related MSRs to avoid creating a
    side-channel that might allow information leakage. Guests trying
    to access those MSRs now receive #GP.
   
    RAPL is not indicated by CPUID but the assumption is that guests
    should not deal with power-related features and therefore should
    not touch those MSRs. (Linux, in fact, does read MSR_RAPL_POWER_UNIT
    but it does so in safe manner and can ignore the fault).
   
    Unfortunately, Solaris reads some of those registers without
    safeguards. So for those MSRs let's return 0.
   
    Signed-off-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 0dbe810e4b27..6b4a5dc77b7f 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -131,6 +131,18 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
uint64_t *val)
         *val &= ~(ARCH_CAPS_TSX_CTRL);
         break;
 
+        /* Solaris reads these MSRs unguarded so let's return 0 */
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_ENERGY_STATUS:
+    case MSR_DRAM_ENERGY_STATUS:
+    case MSR_PP0_ENERGY_STATUS:
+    case MSR_PP1_ENERGY_STATUS:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
+            goto gp_fault;
+
+        *val = 0;
+        break;
+
         /*
          * These MSRs are not enumerated in CPUID.  They have been around
          * since the Pentium 4, and implemented by other vendors.
@@ -151,11 +163,16 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
uint64_t *val)
             break;
 
         /*fallthrough*/
-    case MSR_RAPL_POWER_UNIT:
-    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
-    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
-    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
-    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PKG_POWER_LIMIT:
+    case MSR_PKG_PERF_STATUS:
+    case MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT:
+    case MSR_DRAM_PERF_STATUS:
+    case MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT:
+    case MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT:
+    case MSR_PP1_POLICY:
     case MSR_PLATFORM_ENERGY_COUNTER:
     case MSR_PLATFORM_POWER_LIMIT:
     case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.