|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 2/2] VT-d: Fix vt-d flush timeout issue.
>On 14.12.2015 at 5:28pm, <JBeulich@xxxxxxxx> wrote:
> >>> On 12.12.15 at 14:21, <quan.xu@xxxxxxxxx> wrote:
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -1890,6 +1890,9 @@ static int intel_iommu_add_device(u8 devfn, struct
> pci_dev *pdev)
> > if ( !pdev->domain )
> > return -EINVAL;
> >
> > + if ( is_pdev_unassignable(pdev) )
> > + return -EACCES;
>
> Is this case possible at all (i.e. a newly added device being unassignable)?
>
IMO, it is possible, and I have checked and printed the invoke flow when Xen
init.
iommu_setup() / iommu_hwdom_init() are called before pci_add_device() when Xen
init.
If it flushes error in iommu_setup() / iommu_hwdom_init(), it will mark device
as unassignable.
So it is possible. right?
> > --- a/xen/drivers/passthrough/vtd/qinval.c
> > +++ b/xen/drivers/passthrough/vtd/qinval.c
> > @@ -27,12 +27,58 @@
> > #include "dmar.h"
> > #include "vtd.h"
> > #include "extern.h"
> > +#include "../ats.h"
> >
> > static int __read_mostly iommu_qi_timeout_ms = 1;
> > integer_param("iommu_qi_timeout_ms", iommu_qi_timeout_ms);
> >
> > #define IOMMU_QI_TIMEOUT (iommu_qi_timeout_ms * MILLISECS(1))
> >
> > +void invalidate_timeout(struct iommu *iommu) {
> > + struct domain *d;
> > + unsigned long nr_dom, i;
> > + struct pci_dev *pdev;
> > +
> > + nr_dom = cap_ndoms(iommu->cap);
> > + i = find_first_bit(iommu->domid_bitmap, nr_dom);
> > + while ( i < nr_dom ) {
> > + d = rcu_lock_domain_by_id(iommu->domid_map[i]);
> > + ASSERT(d);
> > +
> > + /* Mark the devices as unassignable. */
> > + for_each_pdev(d, pdev)
> > + mark_pdev_unassignable(pdev);
> > + if ( !is_hardware_domain(d) )
> > + domain_kill(d);
>
> DYM domain_crash() here?
>
Agreed.
> > +void device_tlb_invalidate_timeout(struct iommu *iommu, u16 did,
> > + u16 seg, u8 bus, u8 devfn) {
> > + struct domain *d;
> > + struct pci_dev *pdev;
> > +
> > + d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> > + ASSERT(d);
> > + for_each_pdev(d, pdev)
> > + if ( (pdev->seg == seg) &&
> > + (pdev->bus == bus) &&
> > + (pdev->devfn == devfn) )
> > + {
> > + mark_pdev_unassignable(pdev);
> > + break;
> > + }
> > +
> > + if ( !is_hardware_domain(d) )
> > + domain_kill(d);
> > + rcu_unlock_domain(d);
> > +}
>
> Except for the variable declarations, indentation is broken for the entire
> function.
>
I will modify it in v4.
> > @@ -262,6 +308,14 @@ static int __iommu_flush_iec(struct iommu *iommu,
> > u8 granu, u8 im, u16 iidx)
> >
> > queue_invalidate_iec(iommu, granu, im, iidx);
> > ret = invalidate_sync(iommu);
> > +
> > + if ( ret == -ETIMEDOUT )
> > + {
> > + invalidate_timeout(iommu);
> > + dprintk(XENLOG_WARNING VTDPREFIX,
> > + "IEC flush timeout.\n");
> > + return ret;
> > + }
> > /*
>
> Considering the recurring pattern, wouldn't it be better for
> invalidate_sync() to invoke invalidate_timeout() (at once making sure no
> current
> or future caller misses the need to do so)?
>
Now invalidate_sync() is a common function for iec/iotlb/context/device-tlb
invalidation.
It is different to deal with the flush error.
For device-tlb, it needs some parameters of device, such device seg/bus/devfn.
But iec/iotlb/context don't need these parameters.
Could I introduce device_tlb_invalidate_sync(struct iommu *iommu, u16 did, u16
seg, u8 bus, u8 devfn) for device-tlb flush?
invalidate_sync(struct iommu *iommu) is for iec/iotlb/context.
> Also please insert the blank line at the end of your additions, and no
> trailing full
> stops in log messages please.
>
I will modify it in v4.
> > @@ -88,6 +89,16 @@ struct pci_dev {
> > #define for_each_pdev(domain, pdev) \
> > list_for_each_entry(pdev, &(domain->arch.pdev_list), domain_list)
> >
> > +static inline void mark_pdev_unassignable(struct pci_dev *pdev) {
> > + pdev->info.is_unassignable = 1;
> > +}
> > +
> > +static inline bool_t is_pdev_unassignable(const struct pci_dev *pdev)
> > +{
> > + return pdev->info.is_unassignable; }
>
> Are you aware that we already have a mechanism to prevent assignment (via
> pci_{ro,hide}_device())? I think at the very least this check should consider
> both
> variants. Whether fully using the existing mechanism for your purpose is
> feasible I can't immediately tell (since the ownership change may be
> problematic at the points where you want the flagging to happen).
pci_{ro,hide}_device() is dangerous, and it makes xen crash when I tried it.
for this case, IMO I think a flag is a better solution.
-Quan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |