
Re: [Xen-devel] Multi-bridged PCIe devices (Was: Re: iommuu/vt-d issues with LSI MegaSAS (PERC5i))



>>> On 07.01.14 at 12:35, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
> On 2014-01-07 11:26, Wu, Feng wrote:
>>> -----Original Message-----
>>> From: xen-devel-bounces@xxxxxxxxxxxxx 
>>> [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of Gordan Bobic
>>> Sent: Tuesday, January 07, 2014 6:44 PM
>>> To: Andrew Cooper
>>> Cc: xen-devel@xxxxxxxxxxxxx 
>>> Subject: Re: [Xen-devel] Multi-bridged PCIe devices (Was: Re: 
>>> iommuu/vt-d
>>> issues with LSI MegaSAS (PERC5i))
>>> 
>>> On 2014-01-07 10:38, Andrew Cooper wrote:
>>> > On 07/01/14 10:35, Gordan Bobic wrote:
>>> >> On 2014-01-07 03:17, Zhang, Yang Z wrote:
>>> >>> Konrad Rzeszutek Wilk wrote on 2014-01-07:
>>> >>>>> Which would look like this:
>>> >>>>>
>>> >>>>> C220 ---> Tundra Bridge -----> (HB6 PCI bridge -> Brooktree BDFs) on the card
>>> >>>>>           \--------------> IEEE-1394a
>>> >>>>>
>>> >>>>> I am actually wondering if this 07:00.0 device is the one that
>>> >>>>> reports itself as 08:00.0 (which I think is what you are alluding
>>> >>>>> to, Jan).
>>> >>>>>
>>> >>>>
>>> >>>> And to double check that theory I decided to pass in the IEEE-1394a
>>> >>>> to a guest:
>>> >>>>
>>> >>>>            +-1c.5-[07-08]----00.0-[08]----03.0  Texas Instruments TSB43AB22A IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx]
>>> >>>>
>>> >>>>
>>> >>>> (XEN) [VT-D]iommu.c:885: iommu_fault_status: Fault Overflow
>>> >>>> (XEN) [VT-D]iommu.c:887: iommu_fault_status: Primary Pending Fault
>>> >>>> (XEN) [VT-D]iommu.c:865: DMAR:[DMA Read] Request device [0000:08:00.0] fault addr 370f1000, iommu reg = ffff82c3ffd53000
>>> >>>> (XEN) DMAR:[fault reason 02h] Present bit in context entry is clear
>>> >>>> (XEN) print_vtd_entries: iommu ffff83083d4939b0 dev 0000:08:00.0 gmfn 370f1
>>> >>>> (XEN)     root_entry = ffff83083d47f000
>>> >>>> (XEN)     root_entry[8] = 72569b001
>>> >>>> (XEN)     context = ffff83072569b000
>>> >>>> (XEN)     context[0] = 0_0
>>> >>>> (XEN)     ctxt_entry[0] not present
>>> >>>>
>>> >>>> So, capture card OK - Likely the Tundra bridge has an issue:
>>> >>>>
>>> >>>> 07:00.0 PCI bridge: Tundra Semiconductor Corp. Device 8113 (rev 01) (prog-if 01 [Subtractive decode])
>>> >>>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>>> >>>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
>>> >>>>         Latency: 0
>>> >>>>         Bus: primary=07, secondary=08, subordinate=08, sec-latency=32
>>> >>>>         Memory behind bridge: f0600000-f06fffff
>>> >>>>         Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium TAbort- <TAbort- <MAbort+ <SERR- <PERR-
>>> >>>>         BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
>>> >>>>                 PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>>> >>>>         Capabilities: [60] Subsystem: Super Micro Computer Inc Device 0805
>>> >>>>         Capabilities: [a0] Power Management version 3
>>> >>>>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
>>> >>>>                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>>> >>>>
>>> >>>> or there is some unknown bridge in the motherboard.
>>> >>>
>>> >>> According to your description above, upstream Linux should also have
>>> >>> the same problem. Did you see it with upstream Linux?
>>> >>
>>> >> The problem I was seeing with LSI cards (phantom device doing DMA)
>>> >> does, indeed, also occur in upstream Linux. If I enable intel-iommu on
>>> >> bare metal Linux, the same problem occurs as with Xen.
>>> >>
>>> >>> There may be some buggy devices that generate DMA requests with an
>>> >>> internal BDF that they do not expose (unlike a phantom device). For
>>> >>> those devices, I think we need to set up the VT-d page table manually.
>>> >>
>>> >> I think what is needed is a pci-phantom style override that tells the
>>> >> hypervisor to tell the IOMMU to allow DMA traffic from a specific
>>> >> invisible device ID.
>>> >>
>>> >> Gordan
>>> >
>>> > There is.  See "pci-phantom" in
>>> > http://xenbits.xen.org/docs/unstable/misc/xen-command-line.html 
>>> 
>>> I thought this was only applicable to phantom _functions_ (number
>>> after the dot) rather than whole phantom _devices_. Is that not the
>>> case?
>> 
>> I think that's right. I just went through the related code for the pci
>> phantom device and found that the information from the 'pci-phantom'
>> command line option is stored in the variable 'phantom_devs[8]', of type
>> struct phantom_dev{}. This variable is used in the function alloc_pdev()
>> as follows:
>> 
>> 
>>                 for ( i = 0; i < nr_phantom_devs; ++i )
>>                     if ( phantom_devs[i].seg == pseg->nr &&
>>                          phantom_devs[i].bus == bus &&
>>                          phantom_devs[i].slot == PCI_SLOT(devfn) &&
>>                          phantom_devs[i].stride > PCI_FUNC(devfn) )
>>                     {
>>                         pdev->phantom_stride = phantom_devs[i].stride;
>>                         break;
>>                     }
>> 
>> So from the code, we can see that this command line option only works
>> for phantom _functions_, not for whole phantom _devices_.
> 
> What would it take to make it work for a whole phantom device?

First and foremost a definition of what a phantom device is and
how one would behave. Once again - phantom functions are part
of the PCIe specification, so those don't require a definition.

Jan
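
To make the distinction above concrete, here is a minimal, self-contained sketch of the matching logic from the quoted alloc_pdev() loop. It is an illustration under stated assumptions, not Xen's actual code: the field types of struct phantom_dev are guessed, and the helpers lookup_phantom_stride() and is_phantom_function_of() are hypothetical names introduced only for this example. The point it shows is that a table entry is keyed on segment, bus and slot, so the stride can only ever vary the function number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Conventional devfn encoding: devfn = (slot << 3) | func. */
#define PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)       ((devfn) & 0x07)
#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Field names follow the quoted loop; the field types are assumptions. */
struct phantom_dev {
    uint16_t seg;
    uint8_t bus, slot, stride;
};

/*
 * Hypothetical helper wrapping the quoted loop: return the phantom stride
 * configured for the real device at seg:bus:devfn, or 0 if none matches.
 */
static uint8_t lookup_phantom_stride(const struct phantom_dev *devs, size_t nr,
                                     uint16_t seg, uint8_t bus, uint8_t devfn)
{
    size_t i;

    assert(devs != NULL || nr == 0);

    for ( i = 0; i < nr; ++i )
        if ( devs[i].seg == seg &&
             devs[i].bus == bus &&
             devs[i].slot == PCI_SLOT(devfn) &&
             devs[i].stride > PCI_FUNC(devfn) )
            return devs[i].stride;

    return 0;
}

/*
 * Hypothetical, simplified check (not taken from Xen): could a requester ID
 * be a phantom function of the real device bus:devfn with the given stride?
 * It must share the bus and slot, and differ only by a multiple of the
 * stride in the function number.
 */
static int is_phantom_function_of(uint8_t bus, uint8_t devfn, uint8_t stride,
                                  uint8_t req_bus, uint8_t req_devfn)
{
    if ( !stride || req_bus != bus ||
         PCI_SLOT(req_devfn) != PCI_SLOT(devfn) )
        return 0;

    return req_devfn > devfn && (req_devfn - devfn) % stride == 0;
}
```

With a hypothetical table entry for 0000:07:00 and stride 1, the lookup matches the bridge at 07:00.0, and 07:00.1 would count as one of its phantom functions. A request tagged 08:00.0, as in the fault log earlier in the thread, sits on a different bus and can never be covered this way, which is why a "phantom device" would need its own definition first.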
