WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] RE: Resend: RE: enable_ats_device() call site

To: Jan Beulich <JBeulich@xxxxxxxx>
Subject: [Xen-devel] RE: Resend: RE: enable_ats_device() call site
From: "Kay, Allen M" <allen.m.kay@xxxxxxxxx>
Date: Wed, 19 Oct 2011 15:20:59 -0700
Accept-language: en-US
Acceptlanguage: en-US
Cc: "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Shan, Haitao" <haitao.shan@xxxxxxxxx>, "Dugger, Donald D" <donald.d.dugger@xxxxxxxxx>
Delivery-date: Wed, 19 Oct 2011 15:33:20 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4E9E8EA1020000780005C155@xxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4E4E48360200007800051F65@xxxxxxxxxxxxxxxxxxxx> <987664A83D2D224EAE907B061CE93D5301EE709B5E@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E8EC8A80200007800059E3B@xxxxxxxxxxxxxxxxxxxx> <987664A83D2D224EAE907B061CE93D5301EE70A446@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E9458A5020000780005AB99@xxxxxxxxxxxxxxxxxxxx> <987664A83D2D224EAE907B061CE93D5301F19DB855@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E9E8EA1020000780005C155@xxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcyOKwBn9n9P5TqVRA2amnajlAnpiQAfFTgg
Thread-topic: Resend: RE: enable_ats_device() call site
> I reckon that the availability of device specifications in the ATSR data 
> structure must be there for a purpose.
> If that's not correct, then I'll certainly remove that code again, but I'd 
> like to understand what that data is meant
> to be for in that case.

The atsr leverages the same PCI device scope is used for drhd and rmrr so 
device and function comes along with bus number.  As far as I can tell, we only 
 need to check the bus number for atsr.

> Since enabling ATS on an already enabled device doesn't do any harm according 
> to how enable_ats_device() is
> implemented I can't see any bad in doing so. If there is, then we're back to 
> square one where I was asking you how
> to properly do ATS enabling given the described MMCFG restriction.

Yes, there should be no harm enabling ACS or ATS multiple times.  It would be 
good tested out to make sure though.

> Either we don't need to call it at all during discovery (which I doubt, since 
> when the device is in use by Dom0, I
> suppose having ATS enabled is still desirable or even required), or we have 
> to potentially do it twice (remember
> that older Dom0 kernels may fail to report all PCI devices to the hypervisor).

I see, calling enable_ats_device() in pci_add_device() will also solve the case 
where MMCFG might not work until after dom0 is initialized.

As I mentioned before, our QA team doesn't test ATS and ACS regularly.  It 
would good if you can coordinate with our QA team to test out these changes to 
make sure they don't break any ACS and ATS functionality.

Allen


-----Original Message-----
From: Jan Beulich [mailto:JBeulich@xxxxxxxx] 
Sent: Tuesday, October 18, 2011 11:47 PM
To: Kay, Allen M
Cc: Dugger, Donald D; Shan, Haitao; Tian, Kevin; xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: RE: Resend: RE: enable_ats_device() call site

>>> On 19.10.11 at 00:46, "Kay, Allen M" <allen.m.kay@xxxxxxxxx> wrote:
> Sorry for the late reply, I was trying to close something on another 
> project.  I have the following questions on the patches after 
> reviewing the
> paches:
> 
> 1) In acpi_find_matched_atsr_unit(), you added following code.  The 
> original code only tries to match the bus number.  What is the purpose 
> of this new additional code? Does it fix a problem on one of your systems?
> 
> +        for ( i = 0; i < atsr->scope.devices_cnt; ++i )
> +            if ( atsr->scope.devices[i] == bdf )
> +                return atsr;

I reckon that the availability of device specifications in the ATSR data 
structure must be there for a purpose. If that's not correct, then I'll 
certainly remove that code again, but I'd like to understand what that data is 
meant to be for in that case.

> 2)  In pci_add_device() function, the original code calls 
> pci_enable_acs() only if pdev->domain is not set.  The new code calls 
> pci_enable_acs() unconditionally, potentially more than once?  What is 
> the reason for the change?

That's the whole purpose of the change, so just to repeat: MMCFG accesses may 
not be possible at scan_pci_devices() time for some or all segments/busses. 
Hence enabling ATS may simply be impossible at that point, and must be 
attempted a second time after Dom0 reported whether using MMCFG is safe.

Since enabling ATS on an already enabled device doesn't do any harm according 
to how enable_ats_device() is implemented I can't see any bad in doing so. If 
there is, then we're back to square one where I was asking you how to properly 
do ATS enabling given the described MMCFG restriction.

> 3) In the same pci_add_device() function, the new code now also calls
> iommu_enable_device() which currently calls enable_ats_device().  This 
> means the new code will enable ATS as it is being discovered by the platform.
> However, I did not see any code that removing enable_ats_device() call 
> in domain_context_mapping().  Is this the intention?  If so, what is the 
> reason?

You were telling me that this needs to be re-done after FLR, and hence has to 
remain there.

>  I see the reason the original code is still needed but I don't see 
> why we need to call enable_ats_device() during platform device 
> discovery since the enabling bit will get cleared by FLR.

Either we don't need to call it at all during discovery (which I doubt, since 
when the device is in use by Dom0, I suppose having ATS enabled is still 
desirable or even required), or we have to potentially do it twice (remember 
that older Dom0 kernels may fail to report all PCI devices to the hypervisor).

Jan

> Allen
> 
> 
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Tuesday, October 11, 2011 5:54 AM
> To: Kay, Allen M
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: Resend: RE: enable_ats_device() call site
> 
>>>> On 08.10.11 at 04:09, "Kay, Allen M" <allen.m.kay@xxxxxxxxx> wrote:
>>> For which I'd like to understand why this is being done in the 
>>> places it is
>> now
>>>(not the least why this is done in VT-d specific code in the first place).
>> 
>> The reason it is call by reassign_device_ownership() is because FLR 
>> clears ATS enabling bit on the device - I forgot about it when I 
>> wrote the last email so we still need to re-enable ATS on the device 
>> for each
> device assignment.
>> To summarize:
>> 
>> 1) Reason for difference in ATS and ACS handling
>>     a. ATS capability is in the PCIe endpoint - enabling bit is 
>> cleared by device FLR on the passthrough device.
>>     b. ACS capability is in the PCIe switch - not affected by FLR on 
>> the passthrough device.
>> 
>> 2) ATS enabling requirement
>>     a. VT-d engine serving the device has to be ATS capable.
>>     b. device has to be ATS capable
> 
> Okay, so how about the below then (with an attached prerequisite 
> cleanup patch)?
> 
> Jan
> 
> --- 2011-09-20.orig/xen/drivers/passthrough/iommu.c
> +++ 2011-09-20/xen/drivers/passthrough/iommu.c
> @@ -150,6 +150,23 @@ int iommu_add_device(struct pci_dev *pde
>      return hd->platform_ops->add_device(pdev);
>  }
>  
> +int iommu_enable_device(struct pci_dev *pdev) {
> +    struct hvm_iommu *hd;
> +
> +    if ( !pdev->domain )
> +        return -EINVAL;
> +
> +    ASSERT(spin_is_locked(&pcidevs_lock));
> +
> +    hd = domain_hvm_iommu(pdev->domain);
> +    if ( !iommu_enabled || !hd->platform_ops ||
> +         !hd->platform_ops->enable_device )
> +        return 0;
> +
> +    return hd->platform_ops->enable_device(pdev);
> +}
> +
>  int iommu_remove_device(struct pci_dev *pdev)  {
>      struct hvm_iommu *hd;
> --- 2011-09-20.orig/xen/drivers/passthrough/pci.c
> +++ 2011-09-20/xen/drivers/passthrough/pci.c
> @@ -258,7 +258,7 @@ struct pci_dev *pci_get_pdev_by_domain(
>   * pci_enable_acs - enable ACS if hardware support it
>   * @dev: the PCI device
>   */
> -void pci_enable_acs(struct pci_dev *pdev)
> +static void pci_enable_acs(struct pci_dev *pdev)
>  {
>      int pos;
>      u16 cap, ctrl, seg = pdev->seg;
> @@ -409,8 +409,11 @@ int pci_add_device(u16 seg, u8 bus, u8 d
>          }
>  
>          list_add(&pdev->domain_list, &dom0->arch.pdev_list);
> -        pci_enable_acs(pdev);
>      }
> +    else
> +        iommu_enable_device(pdev);
> +
> +    pci_enable_acs(pdev);
>  
>  out:
>      spin_unlock(&pcidevs_lock);
> --- 2011-09-20.orig/xen/drivers/passthrough/vtd/iommu.c
> +++ 2011-09-20/xen/drivers/passthrough/vtd/iommu.c
> @@ -1901,6 +1901,19 @@ static int intel_iommu_add_device(struct
>      return ret;
>  }
>  
> +static int intel_iommu_enable_device(struct pci_dev *pdev) {
> +    struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
> +    int ret = drhd ? ats_device(pdev, drhd) : -ENODEV;
> +
> +    if ( ret <= 0 )
> +        return ret;
> +
> +    ret = enable_ats_device(pdev->seg, pdev->bus, pdev->devfn);
> +
> +    return ret >= 0 ? 0 : ret;
> +}
> +
>  static int intel_iommu_remove_device(struct pci_dev *pdev)  {
>      struct acpi_rmrr_unit *rmrr;
> @@ -1931,7 +1944,6 @@ static int intel_iommu_remove_device(str  static 
> void __init setup_dom0_device(struct pci_dev *pdev)  {
>      domain_context_mapping(pdev->domain, pdev->seg, pdev->bus, pdev->devfn);
> -    pci_enable_acs(pdev);
>      pci_vtd_quirk(pdev);
>  }
>  
> @@ -2302,6 +2314,7 @@ const struct iommu_ops intel_iommu_ops =
>      .init = intel_iommu_domain_init,
>      .dom0_init = intel_iommu_dom0_init,
>      .add_device = intel_iommu_add_device,
> +    .enable_device = intel_iommu_enable_device,
>      .remove_device = intel_iommu_remove_device,
>      .assign_device  = intel_iommu_assign_device,
>      .teardown = iommu_domain_teardown,
> --- 2011-09-20.orig/xen/include/xen/iommu.h
> +++ 2011-09-20/xen/include/xen/iommu.h
> @@ -70,6 +70,7 @@ int iommu_enable_x2apic_IR(void);  void 
> iommu_disable_x2apic_IR(void);
>  
>  int iommu_add_device(struct pci_dev *pdev);
> +int iommu_enable_device(struct pci_dev *pdev);
>  int iommu_remove_device(struct pci_dev *pdev);  int 
> iommu_domain_init(struct domain *d);  void iommu_dom0_init(struct 
> domain *d); @@ -120,6 +121,7 @@ struct iommu_ops {
>      int (*init)(struct domain *d);
>      void (*dom0_init)(struct domain *d);
>      int (*add_device)(struct pci_dev *pdev);
> +    int (*enable_device)(struct pci_dev *pdev);
>      int (*remove_device)(struct pci_dev *pdev);
>      int (*assign_device)(struct domain *d, u16 seg, u8 bus, u8 devfn);
>      void (*teardown)(struct domain *d);
> --- 2011-09-20.orig/xen/include/xen/pci.h
> +++ 2011-09-20/xen/include/xen/pci.h
> @@ -134,6 +134,5 @@ struct pirq;
>  int msixtbl_pt_register(struct domain *, struct pirq *, uint64_t 
> gtable); void msixtbl_pt_unregister(struct domain *, struct pirq *);  
> void msixtbl_pt_cleanup(struct domain *d); -void pci_enable_acs(struct 
> pci_dev *pdev);
>  
>  #endif /* __XEN_PCI_H__ */



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel