
[PATCH v2] x86/flushtlb: remove flush_area check on system state


  • To: xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Date: Tue, 24 May 2022 12:50:52 +0200
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 24 May 2022 10:51:37 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Booting with Shadow Stacks leads to the following assert on a debug
hypervisor:

Assertion 'local_irq_is_enabled()' failed at arch/x86/smp.c:265
----[ Xen-4.17.0-10.24-d  x86_64  debug=y  Not tainted ]----
CPU:    0
RIP:    e008:[<ffff82d040345300>] flush_area_mask+0x40/0x13e
[...]
Xen call trace:
   [<ffff82d040345300>] R flush_area_mask+0x40/0x13e
   [<ffff82d040338a40>] F modify_xen_mappings+0xc5/0x958
   [<ffff82d0404474f9>] F arch/x86/alternative.c#_alternative_instructions+0xb7/0xb9
   [<ffff82d0404476cc>] F alternative_branches+0xf/0x12
   [<ffff82d04044e37d>] F __start_xen+0x1ef4/0x2776
   [<ffff82d040203344>] F __high_start+0x94/0xa0


This is due to SYS_STATE_smp_boot being set before calling
alternative_branches(), and the flush in modify_xen_mappings() then
using flush_area_all() with interrupts disabled.  Note that
alternative_branches() is called before APs are started, so the flush
must be a local one (and indeed the cpumask passed to
flush_area_mask() just contains one CPU).
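
For reference, a condensed sketch of the ordering that trips the assert
(not verbatim Xen code; the flush_area() macro is quoted from the hunk
removed below):

    /* __start_xen(), heavily condensed: */
    system_state = SYS_STATE_smp_boot;  /* set before APs are brought up */
    alternative_branches();             /* patching runs with IRQs disabled */
      /* ... -> _alternative_instructions() -> modify_xen_mappings() ... */

      /* pre-patch macro in mm.c: */
      #define flush_area(v,f) (system_state < SYS_STATE_smp_boot ?    \
                               flush_area_local((const void *)v, f) : \
                               flush_area_all((const void *)v, f))

      /* system_state is already SYS_STATE_smp_boot, so this resolves to
       * flush_area_all() -> flush_area_mask(&cpu_online_map, ...), whose
       * ASSERT(local_irq_is_enabled()) then fires, even though
       * cpu_online_map still only contains the BSP at this point. */
      flush_area(NULL, FLUSH_TLB_GLOBAL);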

Take the opportunity to simplify the logic a bit and introduce
flush_area_all() as an alias for flush_area_mask(&cpu_online_map...),
taking into account that cpu_online_map contains just the BSP before
APs are started.  This requires widening the assert in
flush_area_mask() to allow it to be called with interrupts disabled as
long as the flush is strictly local.

The overall result is that a conditional can be removed from
flush_area().
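
To illustrate why the widened assert still catches genuine misuse, here
is the resulting check (taken from the smp.c hunk below) together with
the boot-time values seen in the trace above; the CPU number is only an
example:

    /* flush_area_mask(), with the widened assert from this patch: */
    unsigned int cpu = smp_processor_id();      /* BSP, e.g. CPU 0 */

    /* Local flushes can be performed with interrupts disabled. */
    ASSERT(local_irq_is_enabled() ||
           cpumask_subset(mask, cpumask_of(cpu)));

    /* During the early flush above: mask == &cpu_online_map == { 0 }
     * and cpu == 0, so cpumask_subset() is true and the assert passes
     * even though IRQs are off -- the flush is strictly local.  Once
     * APs are online, a multi-CPU mask with interrupts disabled still
     * triggers the assert as before. */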

While there, also introduce an ASSERT to check that a vCPU state flush
is not issued for the local CPU.
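
The new assert (also in the smp.c hunk below) guards against a request
that could never be serviced; a minimal illustration, assuming the
existing local-flush condition in flush_area_mask() stays as shown in
the context lines of that hunk:

    /* Exclude use of FLUSH_VCPU_STATE for the local CPU. */
    ASSERT(!cpumask_test_cpu(cpu, mask) || !(flags & FLUSH_VCPU_STATE));

    /* The local path only runs for flags other than FLUSH_VCPU_STATE
     * and FLUSH_ORDER_MASK:
     *   if ( (flags & ~(FLUSH_VCPU_STATE | FLUSH_ORDER_MASK)) &&
     *        cpumask_test_cpu(cpu, mask) )
     * so the local CPU's part of a FLUSH_VCPU_STATE request would never
     * be handled; the assert turns that into a loud failure on debug
     * builds. */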

Fixes: 78e072bc37 ('x86/mm: avoid inadvertently degrading a TLB flush to local only')
Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
---
Changes since v1:
 - Add an extra assert.
 - Rename flush_area() to flush_area_all().
---
 xen/arch/x86/include/asm/flushtlb.h |  3 ++-
 xen/arch/x86/mm.c                   | 32 +++++++++++------------------
 xen/arch/x86/smp.c                  |  5 ++++-
 3 files changed, 18 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/include/asm/flushtlb.h b/xen/arch/x86/include/asm/flushtlb.h
index 18777f1d4c..f0094bf747 100644
--- a/xen/arch/x86/include/asm/flushtlb.h
+++ b/xen/arch/x86/include/asm/flushtlb.h
@@ -146,7 +146,8 @@ void flush_area_mask(const cpumask_t *, const void *va, unsigned int flags);
 #define flush_mask(mask, flags) flush_area_mask(mask, NULL, flags)
 
 /* Flush all CPUs' TLBs/caches */
-#define flush_area_all(va, flags) flush_area_mask(&cpu_online_map, va, flags)
+#define flush_area_all(va, flags) \
+    flush_area_mask(&cpu_online_map, (const void *)(va), flags)
 #define flush_all(flags) flush_mask(&cpu_online_map, flags)
 
 /* Flush local TLBs */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 72dbce43b1..96d95a07cd 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5070,14 +5070,6 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
 #define l1f_to_lNf(f) (((f) & _PAGE_PRESENT) ? ((f) |  _PAGE_PSE) : (f))
 #define lNf_to_l1f(f) (((f) & _PAGE_PRESENT) ? ((f) & ~_PAGE_PSE) : (f))
 
-/*
- * map_pages_to_xen() can be called early in boot before any other
- * CPUs are online. Use flush_area_local() in this case.
- */
-#define flush_area(v,f) (system_state < SYS_STATE_smp_boot ?    \
-                         flush_area_local((const void *)v, f) : \
-                         flush_area_all((const void *)v, f))
-
 #define L3T_INIT(page) (page) = ZERO_BLOCK_PTR
 
 #define L3T_LOCK(page)        \
@@ -5213,7 +5205,7 @@ int map_pages_to_xen(
                 if ( l3e_get_flags(ol3e) & _PAGE_PSE )
                 {
                     flush_flags(lNf_to_l1f(l3e_get_flags(ol3e)));
-                    flush_area(virt, flush_flags);
+                    flush_area_all(virt, flush_flags);
                 }
                 else
                 {
@@ -5236,7 +5228,7 @@ int map_pages_to_xen(
                             unmap_domain_page(l1t);
                         }
                     }
-                    flush_area(virt, flush_flags);
+                    flush_area_all(virt, flush_flags);
                     for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                     {
                         ol2e = l2t[i];
@@ -5310,7 +5302,7 @@ int map_pages_to_xen(
             }
             if ( locking )
                 spin_unlock(&map_pgdir_lock);
-            flush_area(virt, flush_flags);
+            flush_area_all(virt, flush_flags);
 
             free_xen_pagetable(l2mfn);
         }
@@ -5336,7 +5328,7 @@ int map_pages_to_xen(
                 if ( l2e_get_flags(ol2e) & _PAGE_PSE )
                 {
                     flush_flags(lNf_to_l1f(l2e_get_flags(ol2e)));
-                    flush_area(virt, flush_flags);
+                    flush_area_all(virt, flush_flags);
                 }
                 else
                 {
@@ -5344,7 +5336,7 @@ int map_pages_to_xen(
 
                     for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                         flush_flags(l1e_get_flags(l1t[i]));
-                    flush_area(virt, flush_flags);
+                    flush_area_all(virt, flush_flags);
                     unmap_domain_page(l1t);
                     free_xen_pagetable(l2e_get_mfn(ol2e));
                 }
@@ -5415,7 +5407,7 @@ int map_pages_to_xen(
                 }
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                flush_area(virt, flush_flags);
+                flush_area_all(virt, flush_flags);
 
                 free_xen_pagetable(l1mfn);
             }
@@ -5430,7 +5422,7 @@ int map_pages_to_xen(
                 unsigned int flush_flags = FLUSH_TLB | FLUSH_ORDER(0);
 
                 flush_flags(l1e_get_flags(ol1e));
-                flush_area(virt, flush_flags);
+                flush_area_all(virt, flush_flags);
             }
 
             virt    += 1UL << L1_PAGETABLE_SHIFT;
@@ -5481,7 +5473,7 @@ int map_pages_to_xen(
                                                         l1f_to_lNf(flags)));
                     if ( locking )
                         spin_unlock(&map_pgdir_lock);
-                    flush_area(virt - PAGE_SIZE,
+                    flush_area_all(virt - PAGE_SIZE,
                                FLUSH_TLB_GLOBAL |
                                FLUSH_ORDER(PAGETABLE_ORDER));
                     free_xen_pagetable(l2e_get_mfn(ol2e));
@@ -5532,7 +5524,7 @@ int map_pages_to_xen(
                                                     l1f_to_lNf(flags)));
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                flush_area(virt - PAGE_SIZE,
+                flush_area_all(virt - PAGE_SIZE,
                            FLUSH_TLB_GLOBAL |
                            FLUSH_ORDER(2*PAGETABLE_ORDER));
                 free_xen_pagetable(l3e_get_mfn(ol3e));
@@ -5784,7 +5776,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 l2e_write_atomic(pl2e, l2e_empty());
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
+                flush_area_all(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
                 free_xen_pagetable(l1mfn);
             }
             else if ( locking )
@@ -5829,7 +5821,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 l3e_write_atomic(pl3e, l3e_empty());
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
+                flush_area_all(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
                 free_xen_pagetable(l2mfn);
             }
             else if ( locking )
@@ -5837,7 +5829,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
         }
     }
 
-    flush_area(NULL, FLUSH_TLB_GLOBAL);
+    flush_area_all(NULL, FLUSH_TLB_GLOBAL);
 
 #undef FLAGS_MASK
     rc = 0;
diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index 0a02086966..b42603c351 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -262,7 +262,10 @@ void flush_area_mask(const cpumask_t *mask, const void *va, unsigned int flags)
 {
     unsigned int cpu = smp_processor_id();
 
-    ASSERT(local_irq_is_enabled());
+    /* Local flushes can be performed with interrupts disabled. */
+    ASSERT(local_irq_is_enabled() || cpumask_subset(mask, cpumask_of(cpu)));
+    /* Exclude use of FLUSH_VCPU_STATE for the local CPU. */
+    ASSERT(!cpumask_test_cpu(cpu, mask) || !(flags & FLUSH_VCPU_STATE));
 
     if ( (flags & ~(FLUSH_VCPU_STATE | FLUSH_ORDER_MASK)) &&
          cpumask_test_cpu(cpu, mask) )
-- 
2.36.0