[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [PATCH] x86/mmcfg: add "force" option for MCFG



If MCFG area is not reserved in E820 Xen by default will defer its usage
until Dom0 registers it explicitly after ACPI parser recognizes it as
a reserved resource in DSDT. Having it reserved in E820 is not
mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2)
and firmware is free to keep a hole E820 in that place.

Unfortunately, keeping it disabled at this point makes Xen fail to
recognize many of PCIe extended capabilities early enough for newly added
devices. Namely, (1) some of VT-d quirks are not applied during Dom0
device handoff, (2) currently MCFG reservation report comes
too late from Dom0 after some of PCI devices being registered in Xen
by Dom0 kernel that break initialization of a number of PCIe capabilities
(e.g. SR-IOV VF BAR sizing).

Since introduction of ACPI parser in Xen is not possible add a "force"
option that will allow Xen to use MCFG area even it's not properly
reserved in E820.

Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
---

I think we need to have this option to at least have a way to workaround
problem (1). Problem (2) could be solved in Dom0 kernel by invoking
xen_mcfg_late() earlier but before the first PCI bus enumertaion which
currently happens somwhere during ACPI scan I think.

The question is what the defult value for this option should be?

---
 docs/misc/xen-command-line.pandoc     |  8 +++++---
 xen/arch/x86/e820.c                   | 20 ++++++++++++++++++++
 xen/arch/x86/x86_64/mmconfig-shared.c | 34 ++++++++++++++++++++--------------
 xen/include/asm-x86/e820.h            |  1 +
 4 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc 
b/docs/misc/xen-command-line.pandoc
index 7c72e31..f13b10c 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1424,11 +1424,13 @@ ordinary DomU, control domain, hardware domain, and - 
when supported
 by the platform - DomU with pass-through device assigned).
 
 ### mmcfg (x86)
-> `= <boolean>[,amd-fam10]`
+> `= List of [ <boolean>, force, amd-fam10 ]`
 
-> Default: `1`
+> Default: `true,force`
 
-Specify if the MMConfig space should be enabled.
+Specify if the MMConfig space should be enabled. If force option is specified
+(default) MMConfig space will be used by Xen early in boot even if it's
+not reserved by firmware in the E820 table.
 
 ### mmio-relax (x86)
 > `= <boolean> | all`
diff --git a/xen/arch/x86/e820.c b/xen/arch/x86/e820.c
index 8e8a2c4..edef72a 100644
--- a/xen/arch/x86/e820.c
+++ b/xen/arch/x86/e820.c
@@ -37,6 +37,26 @@ struct e820map e820;
 struct e820map __initdata e820_raw;
 
 /*
+ * This function checks if any part of the range <start,end> is mapped
+ * with type.
+ */
+int __init e820_any_mapped(u64 start, u64 end, unsigned type)
+{
+       int i;
+
+       for (i = 0; i < e820.nr_map; i++) {
+               struct e820entry *ei = &e820.map[i];
+
+               if (type && ei->type != type)
+                       continue;
+               if (ei->addr >= end || ei->addr + ei->size <= start)
+                       continue;
+               return 1;
+       }
+       return 0;
+}
+
+/*
  * This function checks if the entire range <start,end> is mapped with type.
  *
  * Note: this function only works correct if the e820 table is sorted and
diff --git a/xen/arch/x86/x86_64/mmconfig-shared.c 
b/xen/arch/x86/x86_64/mmconfig-shared.c
index cc08b52..9fc0865 100644
--- a/xen/arch/x86/x86_64/mmconfig-shared.c
+++ b/xen/arch/x86/x86_64/mmconfig-shared.c
@@ -26,33 +26,34 @@
 
 #include "mmconfig.h"
 
+static bool_t __read_mostly force_mmcfg = true;
 unsigned int pci_probe = PCI_PROBE_CONF1 | PCI_PROBE_MMCONF;
 
 static int __init parse_mmcfg(const char *s)
 {
     const char *ss;
-    int rc = 0;
+    int val, rc = 0;
 
     do {
         ss = strchr(s, ',');
         if ( !ss )
             ss = strchr(s, '\0');
 
-        switch ( parse_bool(s, ss) )
-        {
-        case 0:
-            pci_probe &= ~PCI_PROBE_MMCONF;
-            break;
-        case 1:
-            break;
-        default:
-            if ( !cmdline_strcmp(s, "amd_fam10") ||
-                 !cmdline_strcmp(s, "amd-fam10") )
+        if ( (val = parse_bool(s, ss)) >= 0 ) {
+            if ( val )
+               pci_probe |= PCI_PROBE_MMCONF;
+            else
+               pci_probe &= ~PCI_PROBE_MMCONF;
+        } else if ( (val = parse_boolean("amd_fam10", s, ss)) >= 0 ||
+                    (val = parse_boolean("amd-fam10", s, ss)) >= 0 ) {
+            if ( val )
                 pci_probe |= PCI_CHECK_ENABLE_AMD_MMCONF;
             else
-                rc = -EINVAL;
-            break;
-        }
+                pci_probe &= ~PCI_CHECK_ENABLE_AMD_MMCONF;
+        } else if ( (val = parse_boolean("force", s, ss)) >= 0)
+            force_mmcfg = val;
+        else
+            rc = -EINVAL;
 
         s = ss + 1;
     } while ( *ss );
@@ -355,6 +356,11 @@ static int __init is_mmconf_reserved(
                    (unsigned int)cfg->start_bus_number,
                    (unsigned int)cfg->end_bus_number);
         }
+    } else if (!e820_any_mapped(addr, addr + old_size - 1, 0)) {
+        if (!force_mmcfg)
+            printk(KERN_WARNING "PCI: MCFG area at %lx is not reserved in 
E820, "
+                   "use mmcfg=force option\n", addr);
+        valid = force_mmcfg;
     }
 
     return valid;
diff --git a/xen/include/asm-x86/e820.h b/xen/include/asm-x86/e820.h
index 52916fb..8babb4b 100644
--- a/xen/include/asm-x86/e820.h
+++ b/xen/include/asm-x86/e820.h
@@ -24,6 +24,7 @@ struct e820map {
 };
 
 extern int sanitize_e820_map(struct e820entry *biosmap, unsigned int *pnr_map);
+extern int e820_any_mapped(u64 start, u64 end, unsigned type);
 extern int e820_all_mapped(u64 start, u64 end, unsigned type);
 extern int reserve_e820_ram(struct e820map *e820, uint64_t s, uint64_t e);
 extern int e820_change_range_type(
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.