
Re: RMRRs and Phantom Functions


  • To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 27 Apr 2022 08:59:26 +0200
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Edwin Torok <edvin.torok@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 27 Apr 2022 06:59:55 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 26.04.2022 19:51, Andrew Cooper wrote:
> Hello,
> 
> Edvin has found a machine with some very weird properties.  It is an HP
> ProLiant BL460c Gen8 with:
> 
>  \-[0000:00]-+-00.0  Intel Corporation Xeon E5/Core i7 DMI2
>              +-01.0-[11]--
>              +-01.1-[02]--
>              +-02.0-[04]--+-00.0  Emulex Corporation OneConnect 10Gb NIC (be3)
>              |            +-00.1  Emulex Corporation OneConnect 10Gb NIC (be3)
>              |            +-00.2  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>              |            \-00.3  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
> 
> yet all 4 other functions on the device periodically hit IOMMU faults
> (~once every 5 mins, so definitely stats).
> 
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.4] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.5] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.6] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.7] fault addr bdf80000
> 
> There are several RMRRs covering these devices, with:
> 
> (XEN) [VT-D]found ACPI_DMAR_RMRR:
> (XEN) [VT-D] endpoint: 0000:03:00.0
> (XEN) [VT-D] endpoint: 0000:01:00.0
> (XEN) [VT-D] endpoint: 0000:01:00.2
> (XEN) [VT-D] endpoint: 0000:04:00.0
> (XEN) [VT-D] endpoint: 0000:04:00.1
> (XEN) [VT-D] endpoint: 0000:04:00.2
> (XEN) [VT-D] endpoint: 0000:04:00.3
> (XEN) [VT-D]dmar.c:608:   RMRR region: base_addr bdf8f000 end_addr bdf92fff
> 
> being the one relevant to these faults.  I've not manually decoded the
> DMAR table because device paths are horrible to follow but there are at
> least the correct number of endpoints.  The functions all have SR-IOV
> (disabled) and ARI (enabled).  None have any Phantom functions described.
> 
> Specifying pci-phantom=04:00,1 does appear to work around the faults,
> but it's not right, because functions 1 thru 3 aren't actually phantom.

Indeed, and I think you really mean "pci-phantom=04:00,4". I guess we
should actually refuse "pci-phantom=04:00,1" in a case like this one.
The problem is that at the point we set pdev->phantom_stride we may
not know of the other devices, yet. But I guess we could attempt a
config space read of the supposed phantom function's device/vendor
and do <whatever> if these aren't both 0xffff.
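The check suggested above could look roughly like the following. This is only a sketch under stated assumptions, not Xen code: `phantom_stride_plausible()`, `cfg_read_id_fn` and `example_read_id()` are hypothetical names, the config space read is replaced by a callback, and the example callback simply models the machine from this thread (functions 0 to 3 present, 4 to 7 absent) with a placeholder ID value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_SLOT(devfn)      (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)      ((devfn) & 0x07)
#define PCI_DEVFN(slot, fn)  ((((slot) & 0x1f) << 3) | ((fn) & 0x07))

/*
 * Hypothetical stand-in for a 32-bit config space read at offset 0
 * (vendor ID in the low half, device ID in the high half).
 */
typedef uint32_t (*cfg_read_id_fn)(unsigned int devfn);

/*
 * Return true if every function reachable from devfn by repeatedly adding
 * stride (within the same slot) reads as all ones, i.e. vendor and device
 * are both 0xffff, so nothing real is sitting at a supposed phantom function.
 */
static bool phantom_stride_plausible(unsigned int devfn, unsigned int stride,
                                     cfg_read_id_fn read_id)
{
    unsigned int fn;

    assert(read_id != NULL);

    if ( !stride )
        return false;

    for ( fn = devfn + stride; PCI_SLOT(fn) == PCI_SLOT(devfn); fn += stride )
        if ( read_id(fn) != 0xffffffffu )
            return false; /* a real function answers here */

    return true;
}

/*
 * Example model of the device in this thread: functions 0-3 exist,
 * functions 4-7 do not. The ID value is a placeholder, not the real one.
 */
static uint32_t example_read_id(unsigned int devfn)
{
    return PCI_FUNC(devfn) < 4 ? 0x12345678u : 0xffffffffu;
}
```

With this model, a stride of 1 from function 0 is rejected because function 1 answers the read, while a stride of 4 is accepted because function 4 reads as all ones. What to do on rejection (the "<whatever>" above) is left open here as well.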

> Also, I don't see any logic which actually wires up phantom functions
> like this to share RMRRs/IVMDs in IO contexts.

See for example deassign_device():

    while ( pdev->phantom_stride )
    {
        devfn += pdev->phantom_stride;
        if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
            break;
        ret = iommu_call(hd->platform_ops, reassign_device, d, target, devfn,
                         pci_to_dev(pdev));
        if ( ret )
            goto out;
    }

The hook is invoked with a devfn different from pdev's, and the VT-d
function then looks up the RMRR based on pdev while populating the
context entry for the given devfn. Or at least that's how it's
intended to work.
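To make the stride arithmetic concrete, the loop quoted above can be reduced to a stand-alone helper that only lists the devfn values it would visit. This is an illustrative sketch, not Xen code: `phantom_devfns()` is a hypothetical name, and the IOMMU hook call is replaced by storing the devfn in an array.

```c
#include <assert.h>
#include <stddef.h>

#define PCI_SLOT(devfn)      (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)      ((devfn) & 0x07)
#define PCI_DEVFN(slot, fn)  ((((slot) & 0x1f) << 3) | ((fn) & 0x07))

/*
 * Collect the phantom devfns that the stride loop would visit for a real
 * function at devfn. Mirrors the loop shape quoted from deassign_device():
 * keep adding the stride until the slot changes. Returns the number of
 * entries written to out (at most max).
 */
static size_t phantom_devfns(unsigned int devfn, unsigned int stride,
                             unsigned int *out, size_t max)
{
    unsigned int fn = devfn;
    size_t n = 0;

    assert(out != NULL);

    while ( stride && n < max )
    {
        fn += stride;
        if ( PCI_SLOT(fn) != PCI_SLOT(devfn) )
            break;
        out[n++] = fn; /* Xen would invoke the IOMMU hook for this devfn */
    }

    return n;
}
```

For a stride of 4, function 0 yields only function 4, and likewise 1 yields 5, 2 yields 6 and 3 yields 7, which matches the faulting requester IDs 04:00.4 through 04:00.7. For a stride of 1, function 0 yields functions 1 through 7, which is why that setting also covers the real functions 1 to 3.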

Jan

>  The faults only
> disappear as a side effect of 04:00.0 and 04:00.4 being in dom0, as far
> as I can tell.
> 
> Simply giving the RMRR via rmrr= doesn't work (presumably because of no
> patching actual devices, but there's no warning), but it feels as if it
> ought to.
> 
> ~Andrew




 

