
Re: [Xen-devel] [Xen-unstable] regression in pci passthrough to HVM guests due to commit 568da4f8c43d2e5b614964c6aefd768de3e3af14 "pt-irq fixes and improvements".



Monday, August 4, 2014, 2:57:57 PM, you wrote:

> On 04/08/14 13:29, Sander Eikelenboom wrote:
>> Hi Jan / Andrew,
>>
>> I'm experiencing a regression in pci passthrough to HVM guests due to commit 
>> 568da4f8c43d2e5b614964c6aefd768de3e3af14 "pt-irq fixes and improvements".
>>
>> Before this commit, guests could be shut down and restarted with the same PCI
>> devices passed through. After it, that no longer works: the device is passed
>> through and visible but doesn't function properly (for instance, when passing
>> through a USB card, "lsusb" fails).
>>
>> From the logs I see there is (at least) a problem with unmapping the IRQs at
>> the shutdown of the guest. After this commit it gives:
>>
>> (XEN) [2014-08-04 11:15:48.783] irq.c:2119: dom1: forcing unbind of pirq 87
>> (XEN) [2014-08-04 11:15:48.783] irq.c:2119: dom1: forcing unbind of pirq 86
>> (XEN) [2014-08-04 11:15:48.783] irq.c:2119: dom1: forcing unbind of pirq 85
>> (XEN) [2014-08-04 11:15:48.783] irq.c:2119: dom1: forcing unbind of pirq 84
>>
>> While before this commit it gives:
>> (XEN) [2014-08-04 09:00:02.361] io.c:305: d2: unbind: m_gsi=87 g_gsi=16 device=0 intx=0
>> (XEN) [2014-08-04 09:00:02.361] io.c:363: d2 unmap: m_irq=87 device=0 intx=0
>> (XEN) [2014-08-04 09:00:02.361] io.c:305: d2: unbind: m_gsi=86 g_gsi=27 device=64 intx=195
>> (XEN) [2014-08-04 09:00:02.361] io.c:363: d2 unmap: m_irq=86 device=64 intx=195
>> (XEN) [2014-08-04 09:00:02.361] io.c:305: d2: unbind: m_gsi=85 g_gsi=27 device=64 intx=195
>> (XEN) [2014-08-04 09:00:02.361] io.c:363: d2 unmap: m_irq=85 device=64 intx=195
>> (XEN) [2014-08-04 09:00:02.361] io.c:305: d2: unbind: m_gsi=84 g_gsi=27 device=64 intx=195
>> (XEN) [2014-08-04 09:00:02.361] io.c:363: d2 unmap: m_irq=84 device=64 intx=195
>> (XEN) [2014-08-04 09:00:04.497] AMD-Vi: Disable: device id = 0x400, domain = 2, paging mode = 4
>> (XEN) [2014-08-04 09:00:04.497] AMD-Vi: Setup I/O page table: device id = 0x400, type = 0x1, root table = 0x54ef79000, domain = 0, paging mode = 3
>> (XEN) [2014-08-04 09:00:04.497] AMD-Vi: Re-assign 0000:04:00.0 from dom2 to dom0
>>
>> This is for a device with MSI-X enabled, lspci from the guest:
>>
>> 00:05.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host Controller [1033:0194] (rev 03) (prog-if 30 [XHCI])
>>         Subsystem: Micro-Star International Co., Ltd. Device [1462:4257]
>>         Physical Slot: 5
>>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Latency: 0, Cache Line Size: 64 bytes
>>         Interrupt: pin A routed to IRQ 36
>>         Region 0: Memory at f3070000 (64-bit, non-prefetchable) [size=8K]
>>         Capabilities: [50] Power Management version 3
>>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>>                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>>         Capabilities: [70] MSI: Enable- Count=1/1 Maskable- 64bit+
>>                 Address: 0000000000000000  Data: 0000
>>         Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
>>                 Vector table: BAR=0 offset=00001000
>>                 PBA: BAR=0 offset=00001080
>>         Capabilities: [a0] Express (v2) Endpoint, MSI 00
>>                 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
>>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>>                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>>                         MaxPayload 128 bytes, MaxReadReq 512 bytes
>>                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
>>                 LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <4us, L1 unlimited
>>                         ClockPM+ Surprise- LLActRep- BwNot-
>>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
>>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                 LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>                 DevCap2: Completion Timeout: Not Supported, TimeoutDis+
>>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
>>                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
>>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>>                          Compliance De-emphasis: -6dB
>>                 LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>>         Capabilities: [100 v4] #1033
>>         Kernel driver in use: xhci_hcd
>>
>>
>>
>> I also tried using "pci=nomsi" for the guest, to rule out anything MSI(-X)
>> specific, and I end up with:
>>
>> Before this commit:
>> (XEN) [2014-08-04 11:57:20.346] irq.c:270: Dom1 PCI link 0 changed 5 -> 0
>> (XEN) [2014-08-04 11:57:20.352] irq.c:270: Dom1 PCI link 1 changed 10 -> 0
>> (XEN) [2014-08-04 11:57:20.357] irq.c:270: Dom1 PCI link 2 changed 11 -> 0
>> (XEN) [2014-08-04 11:57:20.363] irq.c:270: Dom1 PCI link 3 changed 5 -> 0
>> (XEN) [2014-08-04 11:58:17.382] AMD-Vi: Disable: device id = 0x400, domain = 1, paging mode = 4
>> (XEN) [2014-08-04 11:58:17.382] AMD-Vi: Setup I/O page table: device id = 0x400, type = 0x1, root table = 0x55d00e000, domain = 0, paging mode = 3
>> (XEN) [2014-08-04 11:58:17.382] AMD-Vi: Re-assign 0000:04:00.0 from dom1 to dom0
>>
>> After this commit:
>> (XEN) [2014-08-04 11:30:08.923] irq.c:270: Dom1 PCI link 0 changed 5 -> 0
>> (XEN) [2014-08-04 11:30:08.928] irq.c:270: Dom1 PCI link 1 changed 10 -> 0
>> (XEN) [2014-08-04 11:30:08.934] irq.c:270: Dom1 PCI link 2 changed 11 -> 0
>> (XEN) [2014-08-04 11:30:08.939] irq.c:270: Dom1 PCI link 3 changed 5 -> 0
>> (XEN) [2014-08-04 11:31:16.112] AMD-Vi: Disable: device id = 0x400, domain = 1, paging mode = 4
>> (XEN) [2014-08-04 11:31:16.112] AMD-Vi: Setup I/O page table: device id = 0x400, type = 0x1, root table = 0x55d00e000, domain = 0, paging mode = 3
>> (XEN) [2014-08-04 11:31:16.112] AMD-Vi: Re-assign 0000:04:00.0 from dom1 to dom0
>>
>> So that doesn't seem to have changed, so it's probably an MSI(-X) specific
>> issue.
>>
>> I also checked if the setting of "pci_msitranslate=1" in the guest config 
>> had any effect, but "pci_msitranslate=0" gave the same results.
>>
>> The complete dmesg and xl dmesg from after this commit (with MSI-X enabled),
>> covering starting and shutting down the guest, are attached.
>>
>> --
>> Sander

> Are there any qemu logs?  The identified changeset changed the tools as
> well as Xen when it came to this side of things.  It would be
> interesting to see if Qemu noticed a difference in the results of the
> hypercalls it makes.

> ~Andrew

Nope, unfortunately Qemu is still rather silent (though the build environment
has debug=y):

qemu-dm-tv.log:
char device redirected to /dev/pts/2 (label serial0)
VNC server running on `127.0.0.1:5900'
xen be: vkbd-0: initialise() failed
xen be: vkbd-0: initialise() failed
xen be: vkbd-0: initialise() failed
Issued domain 1 poweroff


xl-tv.log:
Waiting for domain tv (domid 1) to die [pid 9081]
Domain 1 has shut down, reason code 0 0x0
Action for shutdown reason code 0 is destroy
Domain 1 needs to be cleaned up: destroying the domain
libxl: error: libxl_qmp.c:443:qmp_next: Socket read error: Connection reset by peer
libxl: error: libxl_qmp.c:701:libxl__qmp_initialize: Failed to connect to QMP
libxl: error: libxl_dm.c:1480:kill_device_model: Device Model already exited
Done. Exiting now

From what I can tell, nothing differs from before the commit.
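
To compare the two teardown paths more systematically than by eye, something
like the following rough sketch could tally, per pirq, whether the hypervisor
log shows the normal unbind/unmap messages or only the forced unbind. It is
only an illustration: the function name classify_teardown is made up here, and
the patterns are simply lifted from the log excerpts quoted in this thread, not
from any Xen tooling.

```python
import re

# Patterns copied from the "xl dmesg" excerpts in this thread (assumed to be
# stable only for this particular Xen build).
PATTERNS = (
    ("forced", re.compile(r"forcing unbind of pirq (\d+)")),
    ("unbind", re.compile(r"unbind: m_gsi=(\d+)")),
    ("unmap", re.compile(r"unmap: m_irq=(\d+)")),
)


def classify_teardown(log_text):
    """Map each machine IRQ seen in log_text to the set of teardown events
    ("forced", "unbind", "unmap") logged for it."""
    events = {}
    for line in log_text.splitlines():
        for name, pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                events.setdefault(int(match.group(1)), set()).add(name)
    return events


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): xl dmesg | python classify_teardown.py
    for irq, seen in sorted(classify_teardown(sys.stdin.read()).items()):
        print(irq, ",".join(sorted(seen)))
```

An IRQ that only ever shows up as "forced" would correspond to the behaviour
seen after the commit; one showing "unbind" and "unmap" matches the behaviour
from before it.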
--
Sander




 

