[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 1/2][4.15] x86/PV: conditionally avoid raising #GP for early guest MSR reads


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 8 Mar 2021 09:56:30 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=XmoMTmHAaEu7wpfC+D9vdqtW0NCM9/czu/DoPKCaoNY=; b=ZxGCLBTbA85i6l2SKTmj/jxQe2CdUwgIxzTCbhbxsF2MeeGeaGRd3asaJ5okeEnoR60IPsk9xmpBRWlUA/OaAiHZDM+DQ6aLIkVQgZUAvXifWtOhVOUv4md3jcM3by3qJa5H2YJxzUIO8l15J2EWEBij1meT0oUW35Z9bnFGC4oM5cidvYeeggdncRc+JfcaK+4xJ4JxrH+RqcDkgKzx5OrAk+8F2zaKha3420vh0BVgM1As6cg3D+dHgazcdkLD3+tEpBVTw9xS/lBaEdbjQlawZcNXz7Xh05kPwb/U7SiXX1XTWYCKtonUWBDSwkoTnR01sLASB0HyXzOFf24fGg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=XPIsrOcXwOsqqXnVHY0o6kqQo8UPQzqx/5OYDXo/7ODXJtif2d9W14SGVaOyggJkc5W/S85kUQ6fIZIWBwj75l1nuEfTCaseNjdsdNk9waBcmbWEVbuG9GeyiQ/vAgJv18girlHNrtTVCCVP1TmIYrkxxZrfnKutb9O3K6qAGpfZ4YfrocpjQoEaDZwU1BlCoZvryhHLzmegh787qbQ4/EtUCEEPO5OX8WJLLcLYSSGRXa4zvu9nJzl2xLV6C1HPuXhWE0/D6jt9n/OHdwpGwzx9CF7u3Asf/Bh88wuJGOAPJSN/R2e1FDR7FsW+xuLdNd77Rq7UB1XAaq/qhiKnAg==
  • Authentication-results: esa4.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>
  • Delivery-date: Mon, 08 Mar 2021 08:57:10 +0000
  • Ironport-sdr: eU7p9Krc6qFS5UKyFQwMy56T2slhwMHzr2smBbJkqcwu5jFx7c0yovBo/7uQyY4NvcfpWD1xm7 H/gXv9Q3FBL0V0fyEePm6v/a24pZkzw1SojXWcfoQsZW5qLPY39f+VEz6tCD0LZBwKyDmLN2ES OFNFL4GZzo5HAt4UuCoa2QQjNIgE5YJSA3h14ssdFTXc/UD/flQ8GklL8NtvY1fqO4FzR26vWO ZWV/aHuw7LpkHrdOr6ZL27sGqc9EJpL2kJxbtcAgaLwqQ74V1gL5+6fhpdf+pF9ldUUyPmJa8b ZZA=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Mar 05, 2021 at 10:50:34AM +0100, Jan Beulich wrote:
> Prior to 4.15 Linux, when running in PV mode, did not install a #GP
> handler early enough to cover for example the rdmsrl_safe() of
> MSR_K8_TSEG_ADDR in bsp_init_amd() (not to speak of the unguarded read
> of MSR_K7_HWCR later in the same function). The respective change
> (42b3a4cb5609 "x86/xen: Support early interrupts in xen pv guests") was
> backported to 4.14, but no further - presumably since it wasn't really
> easy because of other dependencies.
> 
> Therefore, to prevent our change in the handling of guest MSR accesses
> to render PV Linux 4.13 and older unusable on at least AMD systems, make
> the raising of #GP on this paths conditional upon the guest having
> installed a handler, provided of course the MSR can be read in the first
> place (we would have raised #GP in that case even before). Producing
> zero for reads isn't necessarily correct and may trip code trying to
> detect presence of MSRs early, but since such detection logic won't work
> without a #GP handler anyway, this ought to be a fair workaround.
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> v2: Probe MSR read. Exclude hypervisor range. Avoid issuing two log
>     messages (in debug builds). Don't alter WRMSR behavior.
> ---
> I'm not convinced we can get away without also making the WRMSR path
> somewhat more permissive again, e.g. tolerating attempts to set bits
> which are already set. But of course this would require keeping in sync
> for which MSRs we "fake" reads, as then a kernel attempt to set a bit
> may also appear as an attempt to clear others (because of the zero value
> that we gave it for the read).

The above approach seems dangerous, as it could allow a guest to
figure out the value of the underlying MSR by probing whether values
trigger a #GP?

I think we want to do something similar to what we do on HVM in 4.14
and previous versions: ignore writes as long as the rdmsr to the
target MSR succeeds, regardless of the value.

> 
> --- a/xen/arch/x86/pv/emul-priv-op.c
> +++ b/xen/arch/x86/pv/emul-priv-op.c
> @@ -874,7 +874,7 @@ static int read_msr(unsigned int reg, ui
>      struct vcpu *curr = current;
>      const struct domain *currd = curr->domain;
>      const struct cpuid_policy *cp = currd->arch.cpuid;
> -    bool vpmu_msr = false;
> +    bool vpmu_msr = false, warn = false;
>      int ret;
>  
>      if ( (ret = guest_rdmsr(curr, reg, val)) != X86EMUL_UNHANDLEABLE )
> @@ -882,7 +882,7 @@ static int read_msr(unsigned int reg, ui
>          if ( ret == X86EMUL_EXCEPTION )
>              x86_emul_hw_exception(TRAP_gp_fault, 0, ctxt);
>  
> -        return ret;
> +        goto done;
>      }
>  
>      switch ( reg )
> @@ -986,7 +986,7 @@ static int read_msr(unsigned int reg, ui
>          }
>          /* fall through */
>      default:
> -        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", reg);
> +        warn = true;
>          break;
>  
>      normal:
> @@ -995,7 +995,19 @@ static int read_msr(unsigned int reg, ui
>          return X86EMUL_OKAY;
>      }
>  
> -    return X86EMUL_UNHANDLEABLE;
> + done:

Won't this handling be better placed in the 'default' switch case
above?

> +    if ( ret != X86EMUL_OKAY && !curr->arch.pv.trap_ctxt[X86_EXC_GP].address 
> &&
> +         (reg >> 16) != 0x4000 && !rdmsr_safe(reg, *val) )

We didn't used to care about explicitly blocking the reserved MSR
range, do we really wan to do it now?

I'm not sure I see an issue with that, but given that we are trying to
bring back something similar to the previous behavior, I would avoid
adding new conditions.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.