[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/2][4.15?] x86/shadow: encode full GFN in magic MMIO entries


  • To: Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 5 Mar 2021 16:32:31 +0000
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=DJlnwKq9Bt5dEneSDOUnS+rOw4mZ/hf1iYSBu0DysHE=; b=Klg0NpWWyNRjCZnNqi2lQstUzYhqbTxop4f7D7ALI6awwPKjeWCSx8ve2lPdEhnsPUDeD5m4E4KecvgR/wD8jtFiR2NPCBsOWSRY8SSMBKSA5fEjD2W4qMRXDgPvjHsh4Ll2o1eRDcNAE0wds2Gxg3jlVu9mubiCakk1wuv1rimVI84ulDpVr6ElvRhjYOm08vN6mTGYQYEH8jpYalHkXB/hUgwPEVWoZpZkiA6tUZbHBnyeS7csDQLKxwVhOhf2lVBN1ZHtJzD7ege9xgVQhy4pPWfBlj5aB6TDQT+NDHBuVfjGIVHlnOR8s+Jkp0mVVTJaMuzJR9Zvby5GooAG+g==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=MhfvXk8uikwcHhASUbkg6ZuL1B5YHhb7xeUFGosMX49Sh16cLbtxRpfRCWFAYP8v49M/eELKvUQdJwDQCvA3gXeDPWGqQQoU5GZlNA5q9+gb7yuDHNcuPNG1gUvHExFxXjLf1p0E9/uKq0IWOMZKCPkn9lC5ocOFviByFBrxWbkmR1eYMkVbf1OgonIP2wSmvOmmFqUQdpnUVUhcYWby5JFCRZGETqBq2gpaUvHkTiq9EHUX4U1C4otS0cltqYOTSEROXXwiqFtpyg9xxCtEJeDjJf7sztNk1A8F5XbXk7YtnoG+BtqH8LM30jHn86K7Zrdke7yWo+jfyoYMZ/0ufw==
  • Authentication-results: esa5.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: Tim Deegan <tim@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, "Wei Liu" <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>
  • Delivery-date: Fri, 05 Mar 2021 16:33:24 +0000
  • Ironport-sdr: V7UPt24OrqGRsTTWsI4AM9znFChnlbvllxDaOEwwFwBule7ChCKZuhhEvnDiok5cMwnjm4PRyu nHXBhEESnHOO4hS6IYoIfhGzdLWbd0RQVaAFgsfow8ZmlApO0z/6TAlYMncSDTcJpUVBcLJi7S b8qNglAHMDq1DaRPAHG1mYXWMGLDjcbu/YZz+Am3/BL7CHAtkQmsd1Y91sAOqIysytut5FzNHz lPHrJxqQF9xSSF18XUVixwvdeC2CDP7AdFJTP+j+1L/tEuB6KnRpZwWLNDKfb5odYnLqwv4uud zpU=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 05/03/2021 15:37, Jan Beulich wrote:
> Since we don't need to encode all of the PTE flags, we have enough bits
> in the shadow entry to store the full GFN. Don't use literal numbers -
> instead derive the involved values. Or, where derivation would become
> too ugly, sanity-check the result (invoking #error to identify failure).
>
> This then allows dropping from sh_l1e_mmio() again the guarding against
> too large GFNs.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> I wonder if the respective check in sh_audit_l1_table() is actually
> useful to retain with these changes.
>
> --- a/xen/arch/x86/mm/shadow/types.h
> +++ b/xen/arch/x86/mm/shadow/types.h
> @@ -283,9 +283,17 @@ shadow_put_page_from_l1e(shadow_l1e_t sl
>   * This is only feasible for PAE and 64bit Xen: 32-bit non-PAE PTEs don't
>   * have reserved bits that we can use for this.  And even there it can only
>   * be used if we can be certain the processor doesn't use all 52 address 
> bits.
> + *
> + * For the MMIO encoding (see below) we need the bottom 4 bits for
> + * identifying the kind of entry and a full GFN's worth of bits to encode
> + * the originating frame number.  Set all remaining bits to trigger
> + * reserved bit faults, if (see above) the hardware permits triggering such.
>   */
>  
> -#define SH_L1E_MAGIC 0xffffffff00000001ULL
> +#define SH_L1E_MAGIC_NR_META_BITS 4
> +#define SH_L1E_MAGIC_MASK ((~0ULL << (PADDR_BITS - PAGE_SHIFT + \
> +                                      SH_L1E_MAGIC_NR_META_BITS)) | \
> +                           _PAGE_PRESENT)
>  
>  static inline bool sh_have_pte_rsvd_bits(void)
>  {
> @@ -294,7 +302,8 @@ static inline bool sh_have_pte_rsvd_bits
>  
>  static inline bool sh_l1e_is_magic(shadow_l1e_t sl1e)
>  {
> -    return (sl1e.l1 & SH_L1E_MAGIC) == SH_L1E_MAGIC;
> +    BUILD_BUG_ON(!(PADDR_MASK & SH_L1E_MAGIC_MASK));
> +    return (sl1e.l1 & SH_L1E_MAGIC_MASK) == SH_L1E_MAGIC_MASK;
>  }
>  
>  /* Guest not present: a single magic value */
> @@ -320,20 +329,26 @@ static inline bool sh_l1e_is_gnp(shadow_
>  
>  /*
>   * MMIO: an invalid PTE that contains the GFN of the equivalent guest l1e.
> - * We store 28 bits of GFN in bits 4:32 of the entry.
> + * We store the GFN in bits 4:43 of the entry.
>   * The present bit is set, and the U/S and R/W bits are taken from the guest.
>   * Bit 3 is always 0, to differentiate from gnp above.
>   */
> -#define SH_L1E_MMIO_MAGIC       0xffffffff00000001ULL
> -#define SH_L1E_MMIO_MAGIC_MASK  0xffffffff00000009ULL
> -#define SH_L1E_MMIO_GFN_MASK    0x00000000fffffff0ULL
> +#define SH_L1E_MMIO_MAGIC       SH_L1E_MAGIC_MASK
> +#define SH_L1E_MMIO_MAGIC_BIT   ((_PAGE_PRESENT | _PAGE_RW | _PAGE_USER) + 1)
> +#if SH_L1E_MMIO_MAGIC_BIT & (SH_L1E_MMIO_MAGIC_BIT - 1)
> +# error SH_L1E_MMIO_MAGIC_BIT needs to be a power of 2
> +#endif
> +#if SH_L1E_MMIO_MAGIC_BIT >> SH_L1E_MAGIC_NR_META_BITS
> +# error SH_L1E_MMIO_MAGIC_BIT and SH_L1E_MAGIC_NR_META_BITS are out of sync
> +#endif
> +#define SH_L1E_MMIO_MAGIC_MASK  (SH_L1E_MAGIC_MASK | SH_L1E_MMIO_MAGIC_BIT)
> +#define SH_L1E_MMIO_GFN_MASK    ~(SH_L1E_MMIO_MAGIC_MASK | _PAGE_RW | 
> _PAGE_USER)

In practice, it is 4:36, because we have to limit shadow guests to 32
bits of gfn for XSA-173 (width of the superpage backpointer IIRC).

Also, this property is important for L1TF.  The more guest-controllable
bits we permit in here, the greater the chance of being vulnerable to
L1TF on massive machines.

(I'm a little concerned that I can't spot an L1TF comment which has been
made stale by these changes...  Probably the fault of whichever numpty
prepared the L1TF patches, because I'm certain we discussed this at the
time)

Going from 32 to 36 bits moves the upper safety barrier from TOP-4G to
TOP-64G but I recall us agreed that that was ok, especially as the main
safety guestimate is "no RAM in the top quarter of the address space".

However, I don't think we want to accidentally creep beyond bit 36, so
I'd suggest that the easy fix here is just adjusting a nibble in the
MMIO masks.

~Andrew




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.