[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] pci: apply workaround for Intel errata HSE43 and BDF2



>>> On 29.11.18 at 18:11, <roger.pau@xxxxxxxxxx> wrote:
> This errata affect the values read from the BAR registers, and could
> render vPCI (and by extension PVH Dom0 unusable).
> 
> HSE43 is a Haswell erratum where a non-BAR register is implemented at
> the position where the first BAR of the device should be found in the

s/in the/in a/ or something like this, because out of the several
Power Control Unit devices only one kind is really affected.

> Power Control Unit device. Note that there are no BARs on this device,
> apart from the bogus CSR register positioned on top of the first BAR.
> 
> BDF2 is a Broadwell erratum where BARs in the Home Agent device will
> return bogus non-zero values.

I'm afraid there's quite a bit of confusion in Intel's docs here: The vol 2
datasheet link for this CPU from
https://www.intel.com/content/www/us/en/processors/xeon/xeon-technical-resources.html
looks to be dead, and the local copy I have of this lists PCI IDs identical
to E5v3. The E7v4 link still works, and vol 2 has the same issue. (I really
just wanted to cross check that we fully cover the issue with the three
PCI IDs used.)

In any event in the code I'd like to see BDX2 mentioned as well (the
same erratum on E7v4). However, given the situation with the
datasheets I can't see a way to associate the device IDs used with
the individual errata (I would generally suspect there being a 3rd
erratum for the 3rd device ID). Several years ago spec updates also
used to have PCI device ID tables, but this doesn't appear to be the
case anymore.

> @@ -298,6 +299,34 @@ static void check_pdev(const struct pci_dev *pdev)
>  #undef PCI_STATUS_CHECK
>  }
>  
> +static void apply_quirks(struct pci_dev *pdev)
> +{
> +    uint16_t vendor = pci_conf_read16(pdev->seg, pdev->bus,
> +                                      PCI_SLOT(pdev->devfn),
> +                                      PCI_FUNC(pdev->devfn), PCI_VENDOR_ID);
> +    uint16_t device = pci_conf_read16(pdev->seg, pdev->bus,
> +                                      PCI_SLOT(pdev->devfn),
> +                                      PCI_FUNC(pdev->devfn), PCI_DEVICE_ID);
> +
> +    if ( vendor == PCI_VENDOR_ID_INTEL && (device == 0x2fc0 ||
> +         device == 0x6f60 || device == 0x6fa0 || device == 0x6fc0) )

Instead of such an ever growing if(), could we make this table based?

> @@ -397,6 +426,7 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, 
> u8 bus, u8 devfn)
>      }
>  
>      check_pdev(pdev);
> +    apply_quirks(pdev);

At which point putting the small loop into check_pdev() might be as
good as adding a new function.

> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -480,6 +480,9 @@ static int init_bars(struct pci_dev *pdev)
>          return -EOPNOTSUPP;
>      }
>  
> +    if ( pdev->ignore_bars )
> +        num_bars = 0;

While benign for the errata currently dealt with I wonder whether the
ROM BAR wouldn't better be left alone as well with this flag set. Since
additionally enabling memory decoding on a device without BARs is
a questionable operation I wonder whether you couldn't better move
this a few lines down immediately ahead of the loop over the BARs,
and make it return instead of zeroing num_bars.

> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -115,6 +115,9 @@ struct pci_dev {
>  
>      /* Data for vPCI. */
>      struct vpci *vpci;
> +
> +    /* Device with errata, ignore the BARs. */
> +    bool ignore_bars;

Please can you put this into (one of?) the existing hole(s), instead
of at the end?

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.