[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH 1/1] x86/vtd: Mask DMAR faults for IGD devices


  • To: Brendan Kerrigan <brendank310@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
  • Date: Mon, 20 Apr 2020 03:28:17 +0000
  • Accept-language: en-US
  • Cc: Brendan Kerrigan <kerriganb@xxxxxxxxxxxx>
  • Delivery-date: Mon, 20 Apr 2020 03:28:33 +0000
  • Dlp-product: dlpe-windows
  • Dlp-reaction: no-action
  • Dlp-version: 11.2.0.6
  • Ironport-sdr: zt5TGFxK9GnJbilXOEaAUpbu6EMAmzBJVEK1BEVdvvFDuVinpjj/WEJoDf/GXG51c+n0iTsdR1 RPz+YOrGzkhg==
  • Ironport-sdr: nxhYMr9cNGfvlA95hWQwrf4g4yVzsjvb1AIUKMWwKQOq2Ec2hAnXz1UwGfnucXsZp+FKZxt11P R5Dj35xqIZBw==
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHWFL1LsFJ5VBUZwEqsY3bDkrxGQKiBW4ZQ
  • Thread-topic: [PATCH 1/1] x86/vtd: Mask DMAR faults for IGD devices

> From: Brendan Kerrigan <brendank310@xxxxxxxxx>
> Sent: Friday, April 17, 2020 9:36 PM
> 
> From: Brendan Kerrigan <kerriganb@xxxxxxxxxxxx>
> 
> The Intel graphics device records DMAR faults regardless
> of the Fault Process Disable bit being set. When this fault

Since this is an errata for specific generations, let's describe
this way instead of stating it as a generic problem.

> occurs, enable the Interrupt Mask (IM) bit in the Fault
> Event Control (FECTL) register to prevent the continued
> recording of the fault.
> 
> Signed-off-by: Brendan Kerrigan <kerriganb@xxxxxxxxxxxx>
> ---
>  xen/drivers/passthrough/vtd/iommu.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 07d40b37fe..288399d816 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -41,6 +41,8 @@
>  #include "vtd.h"
>  #include "../ats.h"
> 
> +#define IS_IGD(seg, id) (0 == seg && 0 == PCI_BUS(id) && 2 == PCI_SLOT(id)
> && 0 == PCI_FUNC(id))
> +
>  struct mapped_rmrr {
>      struct list_head list;
>      u64 base, end;
> @@ -872,6 +874,13 @@ static int iommu_page_fault_do_one(struct
> vtd_iommu *iommu, int type,
>      printk(XENLOG_G_WARNING VTDPREFIX "%s: reason %02x - %s\n",
>             kind, fault_reason, reason);
> 
> +    if ( DMA_REMAP == fault_type && type && IS_IGD(seg, source_id) ) {
> +        u32 fectl = dmar_readl(iommu->reg, DMAR_FECTL_REG);
> +        fectl |= DMA_FECTL_IM;
> +        dmar_writel(iommu->reg, DMAR_FECTL_REG, fectl);
> +        printk(XENLOG_G_WARNING VTDPREFIX "Disabling DMAR faults for
> IGD\n");
> +    }
> +

Several questions. First, I just note that FPD is not touched by any code
today. then how is this errata being caught? Second, we already have
pci_check_disable_device in place which stops DMAs from the problematic
device if it triggers excessive fault reports. why doesn't it work for your
case? Last, dma_msi_end just forces clearing the mask bit at end of handling
the fault interrupt, making above fix meaningless. Those questions just
make me wonder how you actually test this patch and whether it's necessary...

Thanks
Kevin

>      if ( iommu_verbose && fault_type == DMA_REMAP )
>          print_vtd_entries(iommu, PCI_BUS(source_id), PCI_DEVFN2(source_id),
>                            addr >> PAGE_SHIFT);
> --
> 2.17.1




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.