> I reckon that the availability of device specifications in the ATSR data
> structure must be there for a purpose.
> If that's not correct, then I'll certainly remove that code again, but I'd
> like to understand what that data is meant
> to be for in that case.
The atsr leverages the same PCI device scope is used for drhd and rmrr so
device and function comes along with bus number. As far as I can tell, we only
need to check the bus number for atsr.
> Since enabling ATS on an already enabled device doesn't do any harm according
> to how enable_ats_device() is
> implemented I can't see any bad in doing so. If there is, then we're back to
> square one where I was asking you how
> to properly do ATS enabling given the described MMCFG restriction.
Yes, there should be no harm enabling ACS or ATS multiple times. It would be
good tested out to make sure though.
> Either we don't need to call it at all during discovery (which I doubt, since
> when the device is in use by Dom0, I
> suppose having ATS enabled is still desirable or even required), or we have
> to potentially do it twice (remember
> that older Dom0 kernels may fail to report all PCI devices to the hypervisor).
I see, calling enable_ats_device() in pci_add_device() will also solve the case
where MMCFG might not work until after dom0 is initialized.
As I mentioned before, our QA team doesn't test ATS and ACS regularly. It
would good if you can coordinate with our QA team to test out these changes to
make sure they don't break any ACS and ATS functionality.
Allen
-----Original Message-----
From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
Sent: Tuesday, October 18, 2011 11:47 PM
To: Kay, Allen M
Cc: Dugger, Donald D; Shan, Haitao; Tian, Kevin; xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: RE: Resend: RE: enable_ats_device() call site
>>> On 19.10.11 at 00:46, "Kay, Allen M" <allen.m.kay@xxxxxxxxx> wrote:
> Sorry for the late reply, I was trying to close something on another
> project. I have the following questions on the patches after
> reviewing the
> paches:
>
> 1) In acpi_find_matched_atsr_unit(), you added following code. The
> original code only tries to match the bus number. What is the purpose
> of this new additional code? Does it fix a problem on one of your systems?
>
> + for ( i = 0; i < atsr->scope.devices_cnt; ++i )
> + if ( atsr->scope.devices[i] == bdf )
> + return atsr;
I reckon that the availability of device specifications in the ATSR data
structure must be there for a purpose. If that's not correct, then I'll
certainly remove that code again, but I'd like to understand what that data is
meant to be for in that case.
> 2) In pci_add_device() function, the original code calls
> pci_enable_acs() only if pdev->domain is not set. The new code calls
> pci_enable_acs() unconditionally, potentially more than once? What is
> the reason for the change?
That's the whole purpose of the change, so just to repeat: MMCFG accesses may
not be possible at scan_pci_devices() time for some or all segments/busses.
Hence enabling ATS may simply be impossible at that point, and must be
attempted a second time after Dom0 reported whether using MMCFG is safe.
Since enabling ATS on an already enabled device doesn't do any harm according
to how enable_ats_device() is implemented I can't see any bad in doing so. If
there is, then we're back to square one where I was asking you how to properly
do ATS enabling given the described MMCFG restriction.
> 3) In the same pci_add_device() function, the new code now also calls
> iommu_enable_device() which currently calls enable_ats_device(). This
> means the new code will enable ATS as it is being discovered by the platform.
> However, I did not see any code that removing enable_ats_device() call
> in domain_context_mapping(). Is this the intention? If so, what is the
> reason?
You were telling me that this needs to be re-done after FLR, and hence has to
remain there.
> I see the reason the original code is still needed but I don't see
> why we need to call enable_ats_device() during platform device
> discovery since the enabling bit will get cleared by FLR.
Either we don't need to call it at all during discovery (which I doubt, since
when the device is in use by Dom0, I suppose having ATS enabled is still
desirable or even required), or we have to potentially do it twice (remember
that older Dom0 kernels may fail to report all PCI devices to the hypervisor).
Jan
> Allen
>
>
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Tuesday, October 11, 2011 5:54 AM
> To: Kay, Allen M
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: Resend: RE: enable_ats_device() call site
>
>>>> On 08.10.11 at 04:09, "Kay, Allen M" <allen.m.kay@xxxxxxxxx> wrote:
>>> For which I'd like to understand why this is being done in the
>>> places it is
>> now
>>>(not the least why this is done in VT-d specific code in the first place).
>>
>> The reason it is call by reassign_device_ownership() is because FLR
>> clears ATS enabling bit on the device - I forgot about it when I
>> wrote the last email so we still need to re-enable ATS on the device
>> for each
> device assignment.
>> To summarize:
>>
>> 1) Reason for difference in ATS and ACS handling
>> a. ATS capability is in the PCIe endpoint - enabling bit is
>> cleared by device FLR on the passthrough device.
>> b. ACS capability is in the PCIe switch - not affected by FLR on
>> the passthrough device.
>>
>> 2) ATS enabling requirement
>> a. VT-d engine serving the device has to be ATS capable.
>> b. device has to be ATS capable
>
> Okay, so how about the below then (with an attached prerequisite
> cleanup patch)?
>
> Jan
>
> --- 2011-09-20.orig/xen/drivers/passthrough/iommu.c
> +++ 2011-09-20/xen/drivers/passthrough/iommu.c
> @@ -150,6 +150,23 @@ int iommu_add_device(struct pci_dev *pde
> return hd->platform_ops->add_device(pdev);
> }
>
> +int iommu_enable_device(struct pci_dev *pdev) {
> + struct hvm_iommu *hd;
> +
> + if ( !pdev->domain )
> + return -EINVAL;
> +
> + ASSERT(spin_is_locked(&pcidevs_lock));
> +
> + hd = domain_hvm_iommu(pdev->domain);
> + if ( !iommu_enabled || !hd->platform_ops ||
> + !hd->platform_ops->enable_device )
> + return 0;
> +
> + return hd->platform_ops->enable_device(pdev);
> +}
> +
> int iommu_remove_device(struct pci_dev *pdev) {
> struct hvm_iommu *hd;
> --- 2011-09-20.orig/xen/drivers/passthrough/pci.c
> +++ 2011-09-20/xen/drivers/passthrough/pci.c
> @@ -258,7 +258,7 @@ struct pci_dev *pci_get_pdev_by_domain(
> * pci_enable_acs - enable ACS if hardware support it
> * @dev: the PCI device
> */
> -void pci_enable_acs(struct pci_dev *pdev)
> +static void pci_enable_acs(struct pci_dev *pdev)
> {
> int pos;
> u16 cap, ctrl, seg = pdev->seg;
> @@ -409,8 +409,11 @@ int pci_add_device(u16 seg, u8 bus, u8 d
> }
>
> list_add(&pdev->domain_list, &dom0->arch.pdev_list);
> - pci_enable_acs(pdev);
> }
> + else
> + iommu_enable_device(pdev);
> +
> + pci_enable_acs(pdev);
>
> out:
> spin_unlock(&pcidevs_lock);
> --- 2011-09-20.orig/xen/drivers/passthrough/vtd/iommu.c
> +++ 2011-09-20/xen/drivers/passthrough/vtd/iommu.c
> @@ -1901,6 +1901,19 @@ static int intel_iommu_add_device(struct
> return ret;
> }
>
> +static int intel_iommu_enable_device(struct pci_dev *pdev) {
> + struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
> + int ret = drhd ? ats_device(pdev, drhd) : -ENODEV;
> +
> + if ( ret <= 0 )
> + return ret;
> +
> + ret = enable_ats_device(pdev->seg, pdev->bus, pdev->devfn);
> +
> + return ret >= 0 ? 0 : ret;
> +}
> +
> static int intel_iommu_remove_device(struct pci_dev *pdev) {
> struct acpi_rmrr_unit *rmrr;
> @@ -1931,7 +1944,6 @@ static int intel_iommu_remove_device(str static
> void __init setup_dom0_device(struct pci_dev *pdev) {
> domain_context_mapping(pdev->domain, pdev->seg, pdev->bus, pdev->devfn);
> - pci_enable_acs(pdev);
> pci_vtd_quirk(pdev);
> }
>
> @@ -2302,6 +2314,7 @@ const struct iommu_ops intel_iommu_ops =
> .init = intel_iommu_domain_init,
> .dom0_init = intel_iommu_dom0_init,
> .add_device = intel_iommu_add_device,
> + .enable_device = intel_iommu_enable_device,
> .remove_device = intel_iommu_remove_device,
> .assign_device = intel_iommu_assign_device,
> .teardown = iommu_domain_teardown,
> --- 2011-09-20.orig/xen/include/xen/iommu.h
> +++ 2011-09-20/xen/include/xen/iommu.h
> @@ -70,6 +70,7 @@ int iommu_enable_x2apic_IR(void); void
> iommu_disable_x2apic_IR(void);
>
> int iommu_add_device(struct pci_dev *pdev);
> +int iommu_enable_device(struct pci_dev *pdev);
> int iommu_remove_device(struct pci_dev *pdev); int
> iommu_domain_init(struct domain *d); void iommu_dom0_init(struct
> domain *d); @@ -120,6 +121,7 @@ struct iommu_ops {
> int (*init)(struct domain *d);
> void (*dom0_init)(struct domain *d);
> int (*add_device)(struct pci_dev *pdev);
> + int (*enable_device)(struct pci_dev *pdev);
> int (*remove_device)(struct pci_dev *pdev);
> int (*assign_device)(struct domain *d, u16 seg, u8 bus, u8 devfn);
> void (*teardown)(struct domain *d);
> --- 2011-09-20.orig/xen/include/xen/pci.h
> +++ 2011-09-20/xen/include/xen/pci.h
> @@ -134,6 +134,5 @@ struct pirq;
> int msixtbl_pt_register(struct domain *, struct pirq *, uint64_t
> gtable); void msixtbl_pt_unregister(struct domain *, struct pirq *);
> void msixtbl_pt_cleanup(struct domain *d); -void pci_enable_acs(struct
> pci_dev *pdev);
>
> #endif /* __XEN_PCI_H__ */
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|