
[Xen-devel] mptscsih gets SCSI I/O errors in HVM with VT-d


  • To: "Xen-devel@xxxxxxxxxxxxxxxxxxx" <Xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "Nadolski, Ed" <Ed.Nadolski@xxxxxxx>
  • Date: Fri, 19 Mar 2010 12:41:28 -0600
  • Accept-language: en-US
  • Delivery-date: Fri, 19 Mar 2010 11:42:31 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-index: AcrHk8mAafXPxuPDQyuMfFvar28zIg==
  • Thread-topic: mptscsih gets SCSI I/O errors in HVM with VT-d

Hi,

I am running Xen 4.0.0-rc6 on a Dell T7500 quad-core Xeon with Fedora 12 as 
dom0.  I have an LSI FC949E quad-port Fibre Channel HBA that works fine when I 
run it from either dom0 or baremetal. When I assign this HBA to an HVM guest 
using VT-d, however, the mptscsih driver in the guest reports repeated SCSI 
abort/reset errors whenever I run disk I/O through the HBA.  The HVM OS is 
off-the-shelf Fedora 12.

Here are the mpt driver error messages from the HVM:

> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fa900)
> sd 5:0:0:2: [sdd] CDB: Read(10): 28 00 03 7a d3 20 00 00 c0 00
> mptscsih: ioc3: WARNING - Issuing Reset from mptscsih_IssueTaskMgmt!!
> mptbase: ioc3: Initiating recovery
> mptscsih: ioc3: task abort: SUCCESS (sc=ffff88001d4fa900)
> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fa500)
> sd 5:0:0:2: [sdd] CDB: Read(10): 28 00 03 7a d3 e0 00 01 00 00
> mptscsih: ioc3: task abort: FAILED (sc=ffff88001d4fa500)
> mptscsih: ioc3: attempting target reset! (sc=ffff88001d4fa900)
> sd 5:0:0:2: [sdd] CDB: Read(10): 28 00 03 7a d3 20 00 00 c0 00
> mptscsih: ioc3: target reset: SUCCESS (sc=ffff88001d4fa900)
> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fb400)
> sd 5:0:0:1: [sdc] CDB: Read(10): 28 00 01 b9 fd 20 00 00 c0 00
> mptscsih: ioc3: WARNING - Issuing Reset from mptscsih_IssueTaskMgmt!!
> mptbase: ioc3: Initiating recovery
> mptscsih: ioc3: task abort: SUCCESS (sc=ffff88001d4fb400)
> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fa600)
> sd 5:0:0:1: [sdc] CDB: Read(10): 28 00 01 b9 fd e0 00 01 00 00
> mptscsih: ioc3: task abort: FAILED (sc=ffff88001d4fa600)
> mptscsih: ioc3: attempting target reset! (sc=ffff88001d4fb400)
> sd 5:0:0:1: [sdc] CDB: Read(10): 28 00 01 b9 fd 20 00 00 c0 00
> mptscsih: ioc3: target reset: SUCCESS (sc=ffff88001d4fb400)
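
For anyone reading the CDB bytes in the log above, here is a minimal sketch of how a Read(10) CDB breaks down into a starting LBA and a block count. The function name `decode_read10` is made up for illustration; the byte layout is the standard 10-byte READ(10) format (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8).

```python
import struct


def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Returns the starting logical block address and the transfer length
    in blocks. (Hypothetical helper, not part of any driver.)
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    # Layout: opcode, flags, LBA (4 bytes, big-endian), group number,
    # transfer length (2 bytes, big-endian), control.
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"lba": lba, "blocks": blocks}


if __name__ == "__main__":
    for cdb in ("28 00 03 7a d3 20 00 00 c0 00",
                "28 00 03 7a d3 e0 00 01 00 00"):
        info = decode_read10(cdb)
        print(f"{cdb}: lba=0x{info['lba']:08x} blocks={info['blocks']}")
```

Decoded this way, the two commands on sdd are back-to-back reads: the first covers 0xc0 blocks starting at LBA 0x037ad320, and the second starts at 0x037ad3e0, exactly where the first one ends. The sdc pair follows the same pattern.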


Any thoughts on what could cause something like this under HVM but not 
baremetal? I will try to instrument the mptscsih driver in the HVM to get a 
better idea of what kind of I/O errors are occurring.

Interestingly, this Dell T7500 also has an onboard LSI 1068 SAS controller, 
which works fine when assigned to the HVM. So I wonder if this could have 
something to do with PCI bridging?

FWIW I've also enclosed below the lspci -vvvxxx output for the HBA, both on 
baremetal and in the HVM, though I don't see anything obvious there.

Thanks,
Ed


### /etc/grub.conf entry for passthru

title Fedora-12 Xen 4.0.0-rc6 (2.6.31.12) iommu=1 xen-pciback.hide=(24:00.0)(24:00.1)(25:00.0)(25:00.1)
        root (hd0,0)
        kernel /xen-4.0.0-rc6.gz iommu=1 acpi_skip_timer_override loglvl=all guest_loglvl=all sync_console console_to_ring com1=115200,8n1 console=com1
        module /vmlinuz-2.6.31.12 ro root=UUID=edbcbc29-f3e4-4985-80c1-3c3b0ce24d17  LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us console=hvc0 earlyprintk=xen xen-pciback.hide=(24:00.0)(24:00.1)(25:00.0)(25:00.1)
        module /initramfs-2.6.31.12.img





### lspci for HBA device on baremetal Fedora 12:
# lspci
...
24:00.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
24:00.1 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
25:00.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
25:00.1 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)

# lspci -vvvxxx -s 25:00.1
25:00.1 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
        Subsystem: LSI Logic / Symbios Logic Device 1070
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin B routed to IRQ 61
        Region 0: I/O ports at dc00 [size=256]
        Region 1: Memory at dfadc000 (64-bit, non-prefetchable) [size=16K]
        Region 3: Memory at dfaf0000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at dc100000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Latency L0 <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [98] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
                Vector table: BAR=1 offset=00002000
                PBA: BAR=1 offset=00003000
        Capabilities: [100] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 14, GenCap+ CGenEn- ChkCap+ ChkEn-
        Kernel driver in use: mptfc
        Kernel modules: mptfc
00: 00 10 46 06 07 00 10 00 02 00 04 0c 10 00 80 00
10: 01 dc 00 00 04 c0 ad df 00 00 00 00 04 00 af df
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 10 70 10
30: 00 00 b0 df 50 00 00 00 00 00 00 00 0a 02 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 68 02 06 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 01 25 00 00 10 98 01 00 25 00 00 00
70: 36 28 0a 00 81 0c 00 00 40 00 81 10 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 05 b0 80 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 00 00 00 01 20 00 00 01 30 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00



### lspci for HBA device on HVM Fedora 12:
# lspci
...
00:04.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
00:05.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
00:06.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
00:07.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)

# lspci -vvvxxx -s 00:07.0
00:07.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
        Subsystem: LSI Logic / Symbios Logic Device 1070
        Physical Slot: 7
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 128
        Interrupt: pin B routed to IRQ 45
        Region 0: I/O ports at c400 [size=256]
        Region 1: Memory at f344c000 (64-bit, non-prefetchable) [size=16K]
        Region 3: Memory at f3430000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at f3300000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Latency L0 <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [98] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
                Vector table: BAR=1 offset=00002000
                PBA: BAR=1 offset=00003000
        Kernel driver in use: mptfc
        Kernel modules: mptfc
00: 00 10 46 06 07 00 10 00 02 00 04 0c 00 80 80 00
10: 01 c4 00 00 04 c0 44 f3 00 00 00 00 04 00 43 f3
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 10 70 10
30: 00 00 30 f3 50 00 00 00 00 00 00 00 05 02 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 68 02 06 08 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 01 25 00 00 10 98 01 00 25 00 00 00
70: 10 28 0a 00 81 0c 00 00 00 00 81 10 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 05 b0 80 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 00 00 00 01 20 00 00 01 30 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
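
Since the two hex dumps are hard to compare by eye, one option is to diff them mechanically. The following is only a rough sketch: `parse_config_dump` and `diff_config` are made-up helper names, and the script assumes each dump is pasted in as text in the `lspci -xxx` row format shown above ("70: 36 28 ..."). Decoded lines such as "Capabilities:" are skipped.

```python
import re

# One hex row of `lspci -xxx` output: a hex offset, a colon, then 1-16 bytes.
ROW = re.compile(r"^([0-9a-f]{2,3}):((?: [0-9a-f]{2}){1,16})\s*$")


def parse_config_dump(text):
    """Return {offset: byte} for every hex row found in an lspci dump."""
    space = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # decoded text line, not a hex row
        base = int(match.group(1), 16)
        for i, token in enumerate(match.group(2).split()):
            space[base + i] = int(token, 16)
    return space


def diff_config(a_text, b_text):
    """List (offset, byte_in_a, byte_in_b) wherever the two dumps differ."""
    a, b = parse_config_dump(a_text), parse_config_dump(b_text)
    return [(off, a.get(off), b.get(off))
            for off in sorted(set(a) | set(b))
            if a.get(off) != b.get(off)]


if __name__ == "__main__":
    # Row 0x70 from the baremetal and HVM dumps above.
    baremetal = "70: 36 28 0a 00 81 0c 00 00 40 00 81 10 00 00 00 00"
    hvm = "70: 10 28 0a 00 81 0c 00 00 00 00 81 10 00 00 00 00"
    for off, x, y in diff_config(baremetal, hvm):
        print(f"offset 0x{off:02x}: baremetal={x:02x} hvm={y:02x}")
```

For row 0x70 this reports offsets 0x70 and 0x78. With the PCI Express capability at 0x68, those are the Device Control and Link Control registers, which is consistent with the decoded differences lspci already shows (error reporting bits, MaxPayload 256 vs 128 bytes, and CommClk+ vs CommClk-). Whether any of these relates to the I/O errors is not established here.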



