[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v9 2/3] VT-d: wrap a _sync version for all VT-d flush interfaces



>>> On 01.04.16 at 16:47, <quan.xu@xxxxxxxxx> wrote:
> The dev_invalidate_iotlb() scans ats_devices list to flush ATS devices,
> and the invalidate_sync() is put after dev_invalidate_iotlb() to
> synchronize with hardware for flush status. If we assign multiple
> ATS devices to a domain, the flush status is about all these multiple
> ATS devices. Once flush timeout expires, we couldn't find out which
> one is the buggy ATS device.

Is that true? Or is that just a limitation of our implementation?

> Then, The invalidate_sync() variant (We need to pass down the device's
> SBDF to hide the ATS device) is put within dev_invalidate_iotlb() to
> synchronize for the flush status one by one.

I don't think this is stating current state of things. So ...

> If flush timeout expires,
> we could find out the buggy ATS device and hide it. However, for other
> VT-d flush interfaces, the invalidate_sync() is still put after at present.
> This is inconsistent.

... taken together, what is inconsistent here needs to be
described better, as well as what it is you do to eliminate the
inconsistency. Note that you should not refer back (or imply
knowledge of) the previous discussion on the earlier version.
In any of that discussion is useful here, you need to summarize
it instead.

> --- a/xen/drivers/passthrough/vtd/extern.h
> +++ b/xen/drivers/passthrough/vtd/extern.h
> @@ -61,6 +61,8 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
>  
>  int qinval_device_iotlb(struct iommu *iommu,
>                          u32 max_invs_pend, u16 sid, u16 size, u64 addr);
> +int qinval_device_iotlb_sync(struct iommu *iommu, u32 max_invs_pend,
> +                             u16 sid, u16 size, u64 addr);

So are then both functions needed to be externally accessible?
That would seem contrary to the last paragraph of the patch
description.

> --- a/xen/drivers/passthrough/vtd/qinval.c
> +++ b/xen/drivers/passthrough/vtd/qinval.c
> @@ -33,6 +33,10 @@ integer_param("vtd_qi_timeout", vtd_qi_timeout);
>  
>  #define IOMMU_QI_TIMEOUT (vtd_qi_timeout * MILLISECS(1))
>  
> +static int __must_check queue_invalidate_wait(struct iommu *iommu,
> +    u8 iflag, u8 sw, u8 fn);

Indentation.

> @@ -102,6 +106,15 @@ static void queue_invalidate_context(struct iommu *iommu,
>      unmap_vtd_domain_page(qinval_entries);
>  }
>  
> +static int queue_invalidate_context_sync(struct iommu *iommu,

__must_check?

> +    u16 did, u16 source_id, u8 function_mask, u8 granu)

Indentation again.

> +{
> +    queue_invalidate_context(iommu, did, source_id,
> +                             function_mask, granu);
> +
> +    return invalidate_sync(iommu);
> +}

Further down you replace the only call to
queue_invalidate_context() - why keep both functions instead of
just making the existing one do the sync? (That would the likely
also apply to qinval_device_iotlb() and others below.)

> @@ -338,23 +365,24 @@ static int flush_iotlb_qi(
>  
>      if ( qi_ctrl->qinval_maddr != 0 )
>      {
> -        int rc;
> -
>          /* use queued invalidation */
>          if (cap_write_drain(iommu->cap))
>              dw = 1;
>          if (cap_read_drain(iommu->cap))
>              dr = 1;
>          /* Need to conside the ih bit later */
> -        queue_invalidate_iotlb(iommu,
> -                               type >> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
> -                               dw, did, size_order, 0, addr);
> +        ret = queue_invalidate_iotlb_sync(iommu,
> +                  type >> DMA_TLB_FLUSH_GRANU_OFFSET, dr, dw, did,
> +                  size_order, 0, addr);
> +
> +        /* TODO: Timeout error handling to be added later */

As of today queue_invalidate_wait() panics, so this comment is
not very helpful as there is not timeout that could possibly be
detected here.

> +        if ( ret )
> +            return ret;
> +
>          if ( flush_dev_iotlb )
>              ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
> -        rc = invalidate_sync(iommu);
> -        if ( !ret )
> -            ret = rc;
>      }

I think leaving the existing logic as is would be better - best effort
invalidation even when an error has occurred.

> --- a/xen/drivers/passthrough/vtd/x86/ats.c
> +++ b/xen/drivers/passthrough/vtd/x86/ats.c
> @@ -118,7 +118,6 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
>      {
>          u16 sid = PCI_BDF2(pdev->bus, pdev->devfn);
>          bool_t sbit;
> -        int rc = 0;
>  
>          /* Only invalidate devices that belong to this IOMMU */
>          if ( pdev->iommu != iommu )
> @@ -134,8 +133,8 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
>              /* invalidate all translations: sbit=1,bit_63=0,bit[62:12]=1 */
>              sbit = 1;
>              addr = (~0UL << PAGE_SHIFT_4K) & 0x7FFFFFFFFFFFFFFF;
> -            rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth,
> -                                     sid, sbit, addr);
> +            ret = qinval_device_iotlb_sync(iommu, pdev->ats_queue_depth,
> +                                           sid, sbit, addr);
>              break;
>          case DMA_TLB_PSI_FLUSH:
>              if ( !device_in_domain(iommu, pdev, did) )
> @@ -154,16 +153,13 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
>                  addr |= (((u64)1 << (size_order - 1)) - 1) << PAGE_SHIFT_4K;
>              }
>  
> -            rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth,
> -                                     sid, sbit, addr);
> +            ret = qinval_device_iotlb_sync(iommu, pdev->ats_queue_depth,
> +                                           sid, sbit, addr);
>              break;
>          default:
>              dprintk(XENLOG_WARNING VTDPREFIX, "invalid vt-d flush type\n");
>              return -EOPNOTSUPP;
>          }
> -
> -        if ( !ret )
> -            ret = rc;
>      }

The removal of "rc" (which btw is unrelated to the purpose of your
patch) means that if an earlier iteration encountering an error is
followed by later successful ones, no error would get reported to
the caller. Hence this part of the change needs to be undone.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.