[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/arm: vpci: Remove PCI I/O ranges property value



Hi Rahul,

I have a few comments on top of what Stefano already wrote.

On 14/12/2021 10:45, Rahul Singh wrote:
IO ports on ARM don't exist so all IO ports related hypercalls are going
to fail on ARM when we passthrough a PCI device.

Well. Arm doesn't have specific instructions to access I/O port. But they still exists because they are mapped in the memory address space.

It is quite likely we would need the xc_domain_ioport_permission() & co to work on Arm once we decide to expose the I/O region to the guest.

Failure of xc_domain_ioport_permission(..) would turn into a critical
failure at domain creation. We need to avoid this outcome, instead we
want to continue with domain creation as normal even if
xc_domain_ioport_permission(..) fails. XEN_DOMCTL_ioport_permission
is not implemented on ARM so it would return -ENOSYS.

To solve above issue remove PCI I/O ranges property value from dom0
device tree node so that dom0 linux will not allocate I/O space for PCI
devices if pci-passthrough is enabled.

Another valid reason to remove I/O ranges is that PCI I/O space are not
mapped to dom0 when PCI passthrough is enabled, also there is no vpci
trap handler register for IO bar.

TBH, this should be the main reason of this change. We should not expose the PCI I/O space because the vPCI is not supporting it.

The rest is just an implementation details to avoid major refactoring that may need some revert in the future.


Signed-off-by: Rahul Singh <rahul.singh@xxxxxxx>
---
  xen/arch/arm/domain_build.c   | 14 +++++++
  xen/common/device_tree.c      | 72 +++++++++++++++++++++++++++++++++++
  xen/include/xen/device_tree.h | 10 +++++
  3 files changed, 96 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d02bacbcd1..60f6b2c73b 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -749,6 +749,11 @@ static int __init write_properties(struct domain *d, 
struct kernel_info *kinfo,
                  continue;
          }
+ if ( is_pci_passthrough_enabled() &&
+                dt_device_type_is_equal(node, "pci") )

This check is not going to change for a given node. So I think this wants to be moved outside of the loop to avoid expensive check.

In addition to that, this may also cover PCI devices. I think we want to use the same heuristic as in handle_linux_pci_domain(). So I would move the logic in a separate helper.

+            if ( dt_property_name_is_equal(prop, "ranges") )
+                continue;
+
          res = fdt_property(kinfo->fdt, prop->name, prop_data, prop_len);
if ( res )
@@ -769,6 +774,15 @@ static int __init write_properties(struct domain *d, 
struct kernel_info *kinfo,
              if ( res )
                  return res;
          }
+
+        /*
+         * PCI IO bar are not mapped to dom0 when PCI passthrough is enabled,
+         * also there is no trap handler registered for IO bar therefor remove

Typo: s/therefor/therefore/

+         * the IO range property from the device tree node for dom0.
+         */
+        res = dt_pci_remove_io_ranges(kinfo->fdt, node);

This is called unconditionally. Couldn't this potentially misinterpret some node?

+        if ( res )
+            return res;
      }
/*
diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 4aae281e89..9fa25f6723 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -2195,6 +2195,78 @@ int dt_get_pci_domain_nr(struct dt_device_node *node)
      return (u16)domain;
  }
+int dt_pci_remove_io_ranges(void *fdt, const struct dt_device_node *dev)
+{
+    const struct dt_device_node *parent = NULL;
+    const struct dt_bus *bus, *pbus;
+    unsigned int rlen;
+    int na, ns, pna, pns, rone, ret;
+    const __be32 *ranges;
+    __be32 regs[((GUEST_ROOT_ADDRESS_CELLS * 2) + GUEST_ROOT_SIZE_CELLS + 1)
+               * 2];
+    __be32 *addr = &regs[0];
+
+    bus = dt_match_bus(dev);
+    if ( !bus )

NIT: I don't particularly care whether we use !bus or bus == 0 but it would be nice to stay consistent at least within a function (below you use rlen == 0).

+        return 0; /* device is not a bus */
+
+    parent = dt_get_parent(dev);
+    if ( parent == NULL )
+        return -EINVAL;
+
+    ranges = dt_get_property(dev, "ranges", &rlen);
+    if ( ranges == NULL )
+    {
+        printk(XENLOG_ERR "DT: no ranges; cannot enumerate %s\n",
+               dev->full_name);
+        return -EINVAL;
+    }
+    if ( rlen == 0 ) /* Nothing to do */
+        return 0;
+
+    bus->count_cells(dev, &na, &ns);
+    if ( !DT_CHECK_COUNTS(na, ns) )
+    {
+        printk(XENLOG_ERR "dt_parse: Bad cell count for device %s\n",
+                  dev->full_name);
+        return -EINVAL;
+    }
+    pbus = dt_match_bus(parent);
+    if ( pbus == NULL )
+    {
+        printk("DT: %s is not a valid bus\n", parent->full_name);
+        return -EINVAL;
+    }
+
+    pbus->count_cells(dev, &pna, &pns);
+    if ( !DT_CHECK_COUNTS(pna, pns) )
+    {
+        printk(XENLOG_ERR "dt_parse: Bad cell count for parent %s\n",
+               dev->full_name);
+        return -EINVAL;
+    }
+    /* Now walk through the ranges */
+    rlen /= 4;
+    rone = na + pna + ns;
+
+    for ( ; rlen >= rone; rlen -= rone, ranges += rone )
+    {
+        unsigned int flags = bus->get_flags(ranges);
+        if ( flags & IORESOURCE_IO )
+            continue;
+
+        memcpy(addr, ranges, 4 * rone);
+
+        addr += rone;
+    }
+
+    ret = fdt_property(fdt, "ranges", regs, sizeof(regs));
+    if ( ret )
+        return ret;
+
+    return 0;
+}
+
  /*
   * Local variables:
   * mode: C
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index fd6cd00b43..ad2e905595 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -849,6 +849,16 @@ int dt_count_phandle_with_args(const struct dt_device_node 
*np,
   */
  int dt_get_pci_domain_nr(struct dt_device_node *node);
+/**
+ * dt_get_remove_io_range - Remove the PCI I/O range property value.
+ * @fdt: Pointer to the file descriptor tree.
+ * @node: Device tree node.
+ *
+ * This function will remove the PCI IO range property from the PCI device tree
+ * node.
+ */
+int dt_pci_remove_io_ranges(void *fdt, const struct dt_device_node *node);
+
  struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle);
#ifdef CONFIG_DEVICE_TREE_DEBUG

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.