
Re: [Xen-devel] [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc



On 2017-08-22 23:55, Roger Pau Monné wrote:
> On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
>> This patch is to add Xen virtual IOMMU doc to introduce motivation,
>> framework, vIOMMU hypercall and xl configuration.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@xxxxxxxxx>
>> ---
>>  docs/misc/viommu.txt | 139 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 139 insertions(+)
>>  create mode 100644 docs/misc/viommu.txt
>>
>> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
>> new file mode 100644
>> index 0000000..39455bb
>> --- /dev/null
>> +++ b/docs/misc/viommu.txt
> 
> IMHO, this should be the first patch in the series.

OK. Will update.

> 
>> @@ -0,0 +1,139 @@
>> +Xen virtual IOMMU
>> +
>> +Motivation
>> +==========
>> +*) Enable more than 255 vcpu support
> 
> Seems like the "*)" is some kind of leftover?
> 
>> +HPC cloud service requires VM provides high performance parallel
>> +computing and we hope to create a huge VM with >255 vcpu on one machine
>> +to meet such requirement. Pin each vcpu to separate pcpus.
> 
> I would re-write this as:
> 
> The current requirements of HPC cloud service requires VM with a high
> number of CPUs in order to achieve high performance in parallel
> computing.
> 
> Also, this is needed in order to create VMs with > 128 vCPUs, not 255
> vCPUs. That's because the APIC ID used by Xen is CPU ID * 2 (ie: CPU
> 127 has APIC ID 254, which is the last one available in xAPIC mode).
> You should reword the paragraphs below in order to fix the mention of
> 255 vCPUs.

Thanks for your rewrite.
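
To spell out the arithmetic (a standalone sketch, not code from this
series):

    #include <stdio.h>

    /* Xen assigns APIC ID = vCPU ID * 2, and xAPIC's 8-bit destination
     * field tops out at ID 254 (255 is the broadcast ID), so vCPU 127
     * is the last one addressable without x2APIC. */
    int main(void)
    {
        for ( unsigned int vcpu = 126; vcpu <= 128; vcpu++ )
        {
            unsigned int apic_id = vcpu * 2;
            printf("vCPU %u -> APIC ID %u: %s\n", vcpu, apic_id,
                   apic_id <= 254 ? "addressable in xAPIC mode"
                                  : "needs x2APIC");
        }
        return 0;
    }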

> 
>> +
>> +To support >255 vcpus, X2APIC mode in guest is necessary because legacy
>> +APIC(XAPIC) just supports 8-bit APIC ID and it only can support 255
>> +vcpus at most. X2APIC mode supports 32-bit APIC ID and it requires
>> +interrupt mapping function of vIOMMU.
> 
> Correct me if I'm wrong, but I don't think x2APIC requires vIOMMU. The
> IOMMU is required so that you can route interrupts to all the possible
> CPUs. One could image a setup where only CPUs with APIC IDs < 255 are
> used as targets of external interrupts, and that doesn't require a
> IOMMU.

This is OS behavior. IIRC, Windows strictly requires an IOMMU when
enabling x2APIC mode, while the Linux kernel only has such a
requirement when the CPU count is > 255.


> 
>> +The reason for this is that there is no modification to existing PCI MSI
>> +and IOAPIC with the introduction of X2APIC. PCI MSI/IOAPIC can only send
>> +interrupt message containing 8-bit APIC ID, which cannot address >255
>> +cpus. Interrupt remapping supports 32-bit APIC IDs, so it is required
>> +in order to enable >255 cpus with x2apic mode.
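
To make the 8-bit limit concrete, here is a minimal sketch (the macro
names are illustrative, not from this series):

    #include <stdint.h>

    /* Legacy (non-remappable) MSI: the destination APIC ID occupies
     * bits 19:12 of the 0xFEExxxxx message address, hence only 8 bits. */
    #define MSI_ADDR_BASE          0xfee00000UL
    #define MSI_ADDR_DEST_ID_SHIFT 12

    static inline uint64_t msi_addr_for(uint8_t dest_apic_id)
    {
        return MSI_ADDR_BASE |
               ((uint64_t)dest_apic_id << MSI_ADDR_DEST_ID_SHIFT);
    }

The uint8_t parameter makes the limitation explicit: without interrupt
remapping there is simply no room for a 32-bit x2APIC destination ID.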
>> +
>> +
>> +vIOMMU Architecture
>> +===================
>> +The vIOMMU device model is inside the Xen hypervisor for the following reasons:
>> +    1) Avoid round trips between Qemu and Xen hypervisor
>> +    2) Ease of integration with the rest of hypervisor
>> +    3) HVMlite/PVH doesn't use Qemu
>> +
>> +* Interrupt remapping overview.
>> +Interrupts from both virtual and physical devices are delivered
>> +to the vLAPIC via vIOAPIC and vMSI. The vIOMMU needs to remap interrupts
>> +during this procedure.
>> +
>> ++---------------------------------------------------+
>> +|Qemu                       |VM                     |
>> +|                           | +----------------+    |
>> +|                           | |  Device driver |    |
>> +|                           | +--------+-------+    |
>> +|                           |          ^            |
>> +|       +----------------+  | +--------+-------+    |
>> +|       | Virtual device |  | |  IRQ subsystem |    |
>> +|       +-------+--------+  | +--------+-------+    |
>> +|               |           |          ^            |
>> +|               |           |          |            |
>> ++---------------------------+-----------------------+
>> +|hypervisor     |                      | VIRQ       |
>> +|               |            +---------+--------+   |
>> +|               |            |      vLAPIC      |   |
>> +|               |VIRQ        +---------+--------+   |
>> +|               |                      ^            |
>> +|               |                      |            |
>> +|               |            +---------+--------+   |
>> +|               |            |      vIOMMU      |   |
>> +|               |            +---------+--------+   |
>> +|               |                      ^            |
>> +|               |                      |            |
>> +|               |            +---------+--------+   |
>> +|               |            |   vIOAPIC/vMSI   |   |
>> +|               |            +----+----+--------+   |
>> +|               |                 ^    ^            |
>> +|               +-----------------+    |            |
>> +|                                      |            |
>> ++---------------------------------------------------+
>> +HW                                     |IRQ
>> +                                +-------------------+
>> +                                |   PCI Device      |
>> +                                +-------------------+
>> +
>> +
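
As a toy illustration of the remap step in the diagram (the structure
and names below are made up for clarity, not the series' implementation):

    #include <stdbool.h>
    #include <stdint.h>

    /* vIOAPIC/vMSI hand the vIOMMU an index into its interrupt
     * remapping table; the vIOMMU resolves it to a vector and a
     * (possibly 32-bit) destination APIC ID for vLAPIC delivery. */
    struct virte {
        bool     present;
        uint8_t  vector;
        uint32_t dest_apic_id;    /* 32 bits, so >255 vcpus are reachable */
    };

    static int viommu_remap(const struct virte *table, unsigned int idx,
                            uint8_t *vector, uint32_t *dest)
    {
        if ( !table[idx].present )
            return -1;            /* not mapped: block the interrupt */
        *vector = table[idx].vector;
        *dest = table[idx].dest_apic_id;
        return 0;
    }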
>> +vIOMMU hypercall
>> +================
>> +Introduce new domctl hypercall "xen_domctl_viommu_op" to create/destroy
>             ^ a
>> +vIOMMU and query vIOMMU capabilities that device model can support.
>          ^ s                                ^ the
>> +
>> +* vIOMMU hypercall parameter structure
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
>> +
>> +struct xen_domctl_viommu_op {
>> +    uint32_t cmd;
>> +#define XEN_DOMCTL_create_viommu          0
>> +#define XEN_DOMCTL_destroy_viommu         1
>> +#define XEN_DOMCTL_query_viommu_caps      2
>> +    union {
>> +        struct {
>> +            /* IN - vIOMMU type  */
>> +            uint64_t viommu_type;
>> +            /* IN - MMIO base address of vIOMMU. */
>> +            uint64_t base_address;
>> +            /* IN - Length of MMIO region */
>> +            uint64_t length;
>> +            /* IN - Capabilities with which we want to create */
>> +            uint64_t capabilities;
>> +            /* OUT - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } create_viommu;
>> +
>> +        struct {
>> +            /* IN - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } destroy_viommu;
>> +
>> +        struct {
>> +            /* IN - vIOMMU type */
>> +            uint64_t viommu_type;
>> +            /* OUT - vIOMMU Capabilities */
>> +            uint64_t capabilities;
>> +        } query_caps;
>> +    } u;
>> +};
>> +
>> +- XEN_DOMCTL_query_viommu_caps
>> +    Query the capabilities of the vIOMMU device model. viommu_type specifies
>> +which vendor vIOMMU device model (e.g. Intel VT-d) is targeted, and the
>> +hypervisor returns the capability bits (e.g. the interrupt remapping bit).
>> +
>> +- XEN_DOMCTL_create_viommu
>> +    Create a vIOMMU device with the given viommu_type, capabilities, and MMIO
>> +base address and length. The hypervisor returns viommu_id. The requested
>> +capabilities must be a subset of those returned by the query_viommu_caps hypercall.
>> +
>> +- XEN_DOMCTL_destroy_viommu
>> +    Destroy the vIOMMU in the Xen hypervisor identified by viommu_id.
>> +
>> +Currently only a single vIOMMU per VM is supported, but the introduced
>> +domctls are compatible with multi-vIOMMU support.
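
As a caller-side sketch of the create path (a hypothetical snippet built
only from the fields above; the real invocation wraps this in a struct
xen_domctl and issues it via the domctl hypercall):

    struct xen_domctl_viommu_op op = {
        .cmd = XEN_DOMCTL_create_viommu,
        .u.create_viommu = {
            .viommu_type  = VIOMMU_TYPE_INTEL_VTD,
            .base_address = 0xfed90000UL,      /* example MMIO base */
            .length       = 0x1000,            /* example region size */
            .capabilities = VIOMMU_CAP_IRQ_REMAPPING,
        },
    };
    /* On success, op.u.create_viommu.viommu_id holds the new vIOMMU's
     * identity for a later XEN_DOMCTL_destroy_viommu. The requested
     * capabilities should first be checked against the result of
     * XEN_DOMCTL_query_viommu_caps. */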
>> +
>> +xl vIOMMU configuration
> 
> This should be "xl x86 vIOMMU configuration", since it's clearly x86
> specific.

OK. Will update.

> 
>> +=======================
>> +viommu="type=intel_vtd,intremap=1,x2apic=1"
> 
> Shouldn't this have some kind of array form? From the code I saw it
> seems like you are adding support for domains having multiple IOMMUs,
> in which case this should at least look like:

No, we don't support multi-vIOMMU yet, but some vIOMMU data structures
are defined with multi-vIOMMU support in mind.

> 
> viommu = [
>     'type=intel_vtd,intremap=1,x2apic=1',
>     'type=intel_vtd,intremap=1,x2apic=1'
> ]
> 

Wei also suggested this. Will update.

> But then it's missing to which PCI bus each IOMMU is attached.

This will be added if we really need to support multi-vIOMMU.

> 
> Also, why do you need the x2apic parameter? Is there any value in
> providing a vIOMMU if it doesn't support x2APIC mode?

The user can configure whether the vIOMMU supports x2APIC mode, and the
toolstack uses this configuration to prepare the ACPI DMAR table. There
is an X2APIC_OPT_OUT bit in the DMAR table to tell the OS not to enable
x2APIC mode for the IOMMU.

> 
> Roger.
> 


-- 
Best regards
Tianyu Lan
