
Re: [Xen-devel] [RFC] ARM PCI Passthrough design document

To: Julien Grall <julien.grall@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>
From: Manish Jaggi <mjaggi@xxxxxxxxxxxxxxxxxx>
Date: Tue, 30 May 2017 11:23:38 +0530
Cc: edgar.iglesias@xxxxxxxxxx, okaya@xxxxxxxxxxxxxxxx, Wei Chen <Wei.Chen@xxxxxxx>, Steve Capper <Steve.Capper@xxxxxxx>, Andre Przywara <andre.przywara@xxxxxxx>, manish.jaggi@xxxxxxxxxxxxxxxxxx, punit.agrawal@xxxxxxx, vikrams@xxxxxxxxxxxxxxxx, "Goel, Sameer" <sgoel@xxxxxxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Dave P Martin <Dave.Martin@xxxxxxx>, Vijaya Kumar K <Vijaya.Kumar@xxxxxxxxxxxxxxxxxx>, roger.pau@xxxxxxxxxx
Delivery-date: Tue, 30 May 2017 05:54:09 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

Hi Julien,

On 5/29/2017 11:44 PM, Julien Grall wrote:

On 05/29/2017 03:30 AM, Manish Jaggi wrote:
Hi Julien,
Hello Manish,
On 5/26/2017 10:44 PM, Julien Grall wrote:
PCI pass-through allows the guest to receive full control of physical PCI
devices. This means the guest will have full and direct access to the PCI
device.

ARM supports a kind of guest that exploits hardware virtualization support as
much as possible. The guest relies on PV drivers only for I/O (e.g. block,
network), and interrupts come through the virtualized interrupt controller, so
no big changes are required within the kernel.

As a consequence, it would be possible to replace PV drivers by assigning real
devices to the guest for I/O access. Xen on ARM would therefore be able to run
unmodified operating systems.
To achieve this goal, it looks more sensible to go towards emulating the
host bridge (there will be more details later).
IIUC this means that domU would have an emulated host bridge and dom0
will see the actual host bridge?
You don't want the hardware domain and Xen to access the configuration space at the same time. So if Xen is in charge of the host bridge, then an emulated host bridge should be exposed to the hardware domain.

I believe in the x86 case dom0 and Xen both access the config space, in the context of the PCI device add hypercall.

That's when the pci_config_XXX functions in Xen are called.

Although, this depends on who is in charge of the host bridge. As you may have noticed, this design document proposes two ways to handle configuration space access. At the moment any generic host bridge (see the definition in the design document) will be handled in Xen and the hardware domain will have an emulated host bridge.

So in the case of a generic hb, Xen will manage the config space and provide an emulated interface to dom0, and accesses would be trapped by Xen. Essentially the goal is to scan all PCI devices and register them with Xen (which in turn will configure the SMMU). For a generic hb, this can be done in either dom0 or Xen. The only doubt here is what extra benefit the emulated hb gives in the case of dom0.

If your host bridge is not a generic one, then the hardware domain will be in charge of the host bridge, and any configuration access from Xen will be forwarded to the hardware domain.
At the moment, as part of the first implementation, we are only looking to implement a generic host bridge in Xen. We will decide on a case-by-case basis for all the other host bridges whether we want to have the driver in Xen.

agreed.

[...]
## IOMMU

The IOMMU will be used to isolate the PCI device when accessing the memory
(e.g. DMA and MSI doorbells). Often the IOMMU will be configured using a
MasterID (aka StreamID for ARM SMMU) that can be deduced from the SBDF with the
help of the firmware tables (see below).
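
As an illustration of how a MasterID could be deduced from the SBDF, here is a minimal sketch. It assumes the firmware table provides linear ID ranges in the spirit of the IORT ID mappings; `struct id_mapping`, `pci_rid` and `rid_to_streamid` are hypothetical names, and `id_count` is a plain count for simplicity (the real IORT field encoding differs). The segment is assumed to have already selected the right set of mappings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping entry, loosely modelled on an IORT ID mapping. */
struct id_mapping {
    uint32_t input_base;   /* first RequesterID covered by this entry */
    uint32_t id_count;     /* number of IDs covered (plain count here) */
    uint32_t output_base;  /* StreamID corresponding to input_base */
};

/* RequesterID = bus[15:8] | device[7:3] | function[2:0]. */
static uint16_t pci_rid(uint8_t bus, uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);
    return (uint16_t)((bus << 8) | (dev << 3) | fn);
}

/* Translate a RequesterID into a StreamID; false when no entry covers it. */
static bool rid_to_streamid(const struct id_mapping *map, size_t n,
                            uint16_t rid, uint32_t *sid)
{
    for (size_t i = 0; i < n; i++) {
        if (rid >= map[i].input_base &&
            rid - map[i].input_base < map[i].id_count) {
            *sid = map[i].output_base + (rid - map[i].input_base);
            return true;
        }
    }
    return false;
}
```

A device for which no mapping is found would have no known StreamID, so it could not be isolated by the IOMMU under this scheme.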

Whilst in theory all the memory transactions issued by a PCI device should go
through the IOMMU, on certain platforms some of the memory transactions may not
reach the IOMMU because they are interpreted by the host bridge. For instance,
this could happen if the MSI doorbell is built into the PCI host bridge or for
P2P traffic. See [6] for more details.

XXX: I think this could be solved by using direct mapping (e.g. GFN == MFN);
this would mean the guest memory layout would be similar to the host one when
PCI devices are passed through => Detail it.
In the example given in the IORT spec, for PCI devices not behind an SMMU,
how would the writes from the device be protected?
I realize the XXX paragraph is quite confusing. I am not trying to solve the problem where PCI devices are not protected behind an SMMU, but platforms where some transactions (e.g. P2P or MSI doorbell access) are by-passing the SMMU.
You may still want to allow PCI passthrough in that case, because you know that P2P cannot be done (or can potentially be disabled) and MSI doorbell access is protected (for instance a write to the ITS doorbell will be tagged with the device by the hardware). In order to support such platforms you need to direct map the doorbell (e.g. GFN == MFN) and carve out the P2P region from the guest memory map. Hence the suggestion to re-use the host memory layout for the guest.
Note that it does not mean the RAM region will be direct mapped. It is only there to ease carving out the memory regions by-passed by the SMMU.
[...]
## ACPI

### Host bridges

The static table MCFG (see 4.2 in [1]) will describe the host bridges available
at boot and supporting ECAM. Unfortunately, there are platforms out there
(see [2]) that re-use MCFG to describe host bridges that are not fully ECAM
compatible.
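
For reference, a fully ECAM-compatible host bridge exposes every configuration register at a fixed offset from the base address. The sketch below shows that computation; `ecam_cfg_addr` is a hypothetical helper, and `base` is assumed to be the address corresponding to bus 0 of the segment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Address of a configuration register under ECAM:
 *   base + (bus << 20 | device << 15 | function << 12 | register)
 * Each function gets a 4 KiB window, each bus a 1 MiB window.
 */
static uint64_t ecam_cfg_addr(uint64_t base, uint8_t bus, uint8_t dev,
                              uint8_t fn, uint16_t reg)
{
    assert(dev < 32 && fn < 8 && reg < 4096);
    return base + (((uint64_t)bus << 20) | ((uint64_t)dev << 15) |
                   ((uint64_t)fn << 12) | reg);
}
```

A host bridge that is "not fully ECAM compatible" deviates from this in some way, for example by restricting which access sizes work or by needing a different access method altogether.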

This means that Xen needs to account for possible quirks in the host bridge.
The Linux community are working on a patch series for this, see [2] and [3],
where quirks will be detected with:
     * OEM ID
     * OEM Table ID
     * OEM Revision
     * PCI Segment
     * PCI bus number range (wildcard allowed)

Based on what Linux is currently doing, there are two kinds of quirks:
     * Accesses to the configuration space of certain sizes are not allowed
     * A specific driver is necessary for driving the host bridge

The former is straightforward to solve but the latter will require more thought.
Instantiation of a specific driver for the host controller can be easily done
if Xen has the information to detect it.
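
The detection keys listed above could be matched against a static table, roughly as follows. This is only a sketch: `struct hb_quirk`, `find_hb_driver` and the driver name string are hypothetical, and it is not a description of how Linux or Xen implement it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical quirk entry keyed on the MCFG header fields and bus range. */
struct hb_quirk {
    const char *oem_id;
    const char *oem_table_id;
    uint32_t    oem_revision;
    uint16_t    segment;
    bool        any_bus;            /* wildcard for the bus number range */
    uint8_t     bus_start, bus_end; /* used when any_bus is false */
    const char *driver;             /* name of the specific driver to use */
};

/* Return the driver name for a matching quirk, or NULL if none applies. */
static const char *find_hb_driver(const struct hb_quirk *tbl, size_t n,
                                  const char *oem_id, const char *oem_table_id,
                                  uint32_t oem_revision, uint16_t segment,
                                  uint8_t bus_start, uint8_t bus_end)
{
    for (size_t i = 0; i < n; i++) {
        const struct hb_quirk *q = &tbl[i];

        if (strcmp(q->oem_id, oem_id) != 0 ||
            strcmp(q->oem_table_id, oem_table_id) != 0 ||
            q->oem_revision != oem_revision || q->segment != segment)
            continue;
        if (q->any_bus ||
            (q->bus_start == bus_start && q->bus_end == bus_end))
            return q->driver;
    }
    return NULL; /* no quirk found: treat the bridge as generic ECAM */
}
```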
So Xen would parse the MCFG to find a hb, then map the config space in
dom0 stage 2, and then provide the same MCFG to dom0?
These are implementation details. I have been really careful so far to leave the implementation open, as it does not matter at this stage how we are going to implement it in Xen.

This matters in the case of stage 2 MMIO mappings, see below.

[...]

## Discovering and registering host bridge

The approach taken in the document will require communication between Xen and
the hardware domain. In this case, they would need to agree on the segment
number associated with a host bridge. However, this number is not available in
the Device Tree case.

The hardware domain will register new host bridges using the existing hypercall
PHYSDEVOP_pci_mmcfg_reserved:

#define XEN_PCI_MMCFG_RESERVED 1

struct physdev_pci_mmcfg_reserved {
     /* IN */
     uint64_t    address;
     uint16_t    segment;
     /* Range of bus supported by the host bridge */
     uint8_t     start_bus;
     uint8_t     end_bus;

     uint32_t    flags;
};
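
To make the fields concrete, here is a small sketch of how a hardware domain might fill this structure for one host bridge. The helper names `ecam_window_size` and `fill_mmcfg_reserved` are hypothetical, and the segment number is assumed to have been agreed with Xen by some other means, which is exactly the open question in the Device Tree case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XEN_PCI_MMCFG_RESERVED 1

struct physdev_pci_mmcfg_reserved {
    uint64_t address;
    uint16_t segment;
    uint8_t  start_bus;
    uint8_t  end_bus;
    uint32_t flags;
};

/* Size of the ECAM window for an inclusive bus range: 1 MiB per bus. */
static uint64_t ecam_window_size(uint8_t start_bus, uint8_t end_bus)
{
    assert(start_bus <= end_bus);
    return ((uint64_t)(end_bus - start_bus) + 1) << 20;
}

/* Fill the hypercall argument for one host bridge (values are examples). */
static void fill_mmcfg_reserved(struct physdev_pci_mmcfg_reserved *r,
                                uint64_t address, uint16_t segment,
                                uint8_t start_bus, uint8_t end_bus)
{
    assert(start_bus <= end_bus);
    memset(r, 0, sizeof(*r));
    r->address   = address;
    r->segment   = segment;
    r->start_bus = start_bus;
    r->end_bus   = end_bus;
    r->flags     = XEN_PCI_MMCFG_RESERVED;
}
```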

So this hypercall is not required for ACPI?

This is not DT specific, as even on ACPI there are platforms that are not fully ECAM compliant. As I said above, we will need to decide whether we want to support non-ECAM compliant host bridges (e.g. all host bridges that have a specific driver) in Xen. Likely this will be on a case-by-case basis.


[...]

## Discovering and registering PCI devices

The hardware domain will scan the host bridge to find the list of PCI devices
available and then report it to Xen using the existing hypercall
PHYSDEVOP_pci_device_add:

#define XEN_PCI_DEV_EXTFN   0x1
#define XEN_PCI_DEV_VIRTFN  0x2
#define XEN_PCI_DEV_PXM     0x4

struct physdev_pci_device_add {
     /* IN */
     uint16_t    seg;
     uint8_t     bus;
     uint8_t     devfn;
     uint32_t    flags;
     struct {
         uint8_t bus;
         uint8_t devfn;
     } physfn;
     /*
      * Optional parameters array.
      * First element ([0]) is PXM domain associated with the device (if
      * XEN_PCI_DEV_PXM is set)
      */
     uint32_t optarr[0];
};
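
As an example of how the fields fit together, the sketch below fills the structure for an SR-IOV virtual function, where `physfn` identifies the parent physical function. The helpers `pci_devfn` and `fill_vf_add` are hypothetical; the flag values are the ones from Xen's public physdev.h, and the optional PXM element is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XEN_PCI_DEV_EXTFN   0x1
#define XEN_PCI_DEV_VIRTFN  0x2
#define XEN_PCI_DEV_PXM     0x4

struct physdev_pci_device_add {
    uint16_t seg;
    uint8_t  bus;
    uint8_t  devfn;
    uint32_t flags;
    struct {
        uint8_t bus;
        uint8_t devfn;
    } physfn;
    uint32_t optarr[];  /* optional parameters, unused in this sketch */
};

/* devfn packs the device number in bits [7:3] and the function in [2:0]. */
static uint8_t pci_devfn(uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);
    return (uint8_t)((dev << 3) | fn);
}

/* Describe a virtual function and the physical function it belongs to. */
static void fill_vf_add(struct physdev_pci_device_add *add, uint16_t seg,
                        uint8_t bus, uint8_t devfn,
                        uint8_t pf_bus, uint8_t pf_devfn)
{
    memset(add, 0, sizeof(*add));
    add->seg          = seg;
    add->bus          = bus;
    add->devfn        = devfn;
    add->flags        = XEN_PCI_DEV_VIRTFN;
    add->physfn.bus   = pf_bus;
    add->physfn.devfn = pf_devfn;
}
```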

For mapping the MMIO space of the device in Stage2, we need to add
support in Xen / via a map hypercall in linux/drivers/xen/pci.c

Mapping MMIO space in stage-2 is not PCI specific and is already addressed in Xen 4.9 (see commit 80f9c31 "xen/arm: acpi: Map MMIO on fault in stage-2 page table for the hardware domain"). So I don't understand why we should care about that here...

This approach is ok.
But we could have a more granular approach than trapping, IMHO.
For ACPI:

 - Xen parses the MCFG and can map the PCI hb (emulated / original) in stage 2 for dom0
 - device MMIO can be mapped in stage 2 alongside the pci_device_add call.

What do you think?

Regards,


