Subject: Re: [Xen-devel] domU and dom0 hung with Xen console interrupt binding showing in-flight=1, (---M)
From: Bruce Edge <bruce.edge@xxxxxxxxx>
Date: Tue, 17 Aug 2010 10:28:33 -0700
On Tue, Jun 29, 2010 at 1:42 AM, Jan Beulich <JBeulich@xxxxxxxxxx> wrote:
>>> On 28.06.10 at 20:22, Dante Cinco <dantecinco@xxxxxxxxx> wrote:
> I have an HP Proliant DL380-G6 (dual Xeon E5540 @ 2.53GHz) with Xen 4.0.0
> and dom0 Linux x86_64 pvops and domU Linux kernel x86_64.
> I'm using PCI passthrough (pci-stub) to pass my 4-port 8Gb PMC-Sierra Fibre
> Channel HBA to domU. After running I/Os for several hours, both dom0 and
> domU hang and the Xen console shows the interrupt binding below where IRQ
> 66 shows in-flight=1 and mask set (---M). What's the best way to debug this
> problem?

There are potentially two problems here: One is that the guest may
fail to send the EOI notification. You would want to check whether
pirq_guest_eoi() got run after that last occurrence of the interrupt.

The more worrying part is that Xen should time out on a guest failing
to send the EOI notification, and ack the interrupt nevertheless.
Looking at the code I fail to see how the ack_APIC_irq() would get
sent in this case: non-maskable MSIs get this issued from
end_msi_irq(), but ->end doesn't get invoked from
irq_guest_eoi_timer_fn() (only ->enable does). Keir, am I missing
something?

Otoh I can't see how this can work reliably in the first place: Since
there's no other way to mask such interrupts, sending an ack to the
LAPIC could result in an interrupt storm. Disabling MSI on the
affected device isn't a good option either, as we know there are
devices that switch to legacy IRQ mode irreversibly in that case,
and hence the device becomes unusable (presumably until being
reset). But very likely this would still be better than hanging the
entire box; it probably would just need a more graceful timeout.


This is still happening. I have 2 identical boxes that were running a stress test and both hung after a few hours. They have identical hardware and software configs so I'll report the config for one and attach the xen dump for both.

dom0 info:

HP Proliant DL380-G6 (dual Xeon E5540 @ 2.53GHz) 

# cat /proc/cmdline 
root=/dev/mapper/system-dom0_0 ro earlyprintk=xen loglevel=10 debug acpi=force console=hvc0,115200n8

# uname -a
Linux dpm8800-09 #1 SMP Wed Aug 4 15:38:21 PDT 2010 x86_64 GNU/Linux

The domU is an Ubuntu 10.04 kernel, in hvm mode.

# xm info
host                   : dpm8800-09
release                :
version                : #1 SMP Wed Aug 4 15:38:21 PDT 2010
machine                : x86_64
nr_cpus                : 16
nr_nodes               : 2
cores_per_socket       : 4
threads_per_core       : 2
cpu_mhz                : 2533
hw_caps                : bfebfbff:28100800:00000000:00001b40:009ce3bd:00000000:00000001:00000000
virt_caps              : hvm hvm_directio
total_memory           : 12277
free_memory            : 11631
node_to_cpu            : node0:0,2,4,6,8,10,12,14
node_to_memory         : node0:5601
node_to_dma32_mem      : node0:3506
max_node_id            : 1
xen_major              : 4
xen_minor              : 0
xen_extra              : .1-rc4
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : dom0_mem=512M dom0_max_vcpus=1 dom0_vcpus_pin=true iommu=1,passthrough,no-intremap loglvl=all loglvl_guest=all loglevl=10 debug apic=on apic_verbosity=verbose extra_guest_irqs=80 com1=115200,8n1 console=com1 console_to_ring xen-pciback.permissive acpi=force numa=on
cc_compiler            : gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) 
cc_compile_by          : bedge
cc_compile_domain      : lsi.com
cc_compile_date        : Sun Aug  1 09:44:29 PDT 2010
xend_config_format     : 4

This device (as well as a few more of these) is passed through via pciback:

dpm8800-09:~# lspci | grep 10:
10:00.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
10:00.1 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
10:00.2 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
10:00.3 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08) <- in both cases it's this device that loses the interrupt in flight

10:00.3 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
        Flags: bus master, fast devsel, latency 0, IRQ 5
        I/O ports at a800 [size=256]
        I/O ports at ac00 [size=256]
        Memory at fbdc0000 (64-bit, non-prefetchable) [size=32K]
        Capabilities: [50] Power Management version 3
        Capabilities: [60] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable-
        Capabilities: [70] Express Endpoint, MSI 01
        Capabilities: [b0] MSI-X: Enable- Mask- TabSize=9
        Capabilities: [100] Advanced Error Reporting <?>

From host dpm8800-10:
(XEN)    IRQ: 133 affinity:00000000,00000000,00000000,00000001 vec:94 type=PCI-MSI         status=00000050 in-flight=0 domain-list=2:126(----),
(XEN)    IRQ: 134 affinity:00000000,00000000,00000000,00000001 vec:d4 type=PCI-MSI         status=00000050 in-flight=1 domain-list=2:125(---M),
(XEN)    IRQ: 135 affinity:00000000,00000000,00000000,00000004 vec:9c type=PCI-MSI         status=00000010 in-flight=0 domain-list=2:124(----),

From host dpm8800-09:
(XEN)    IRQ: 131 affinity:00000000,00000000,00000000,00002000 vec:7f type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 62(----),
(XEN)    IRQ: 132 affinity:00000000,00000000,00000000,00000001 vec:dd type=PCI-MSI         status=00000010 in-flight=1 domain-list=2:127(---M),
(XEN)    IRQ: 133 affinity:00000000,00000000,00000000,00000001 vec:3e type=PCI-MSI         status=00000010 in-flight=0 domain-list=2:126(----),
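For anyone who wants to scan a longer dump for the same pattern (in-flight != 0 with the guest binding masked), here is a minimal sketch. The line format is inferred only from the IRQ lines quoted in this mail, so other Xen versions may print it differently, and the name find_stuck_irqs is made up for illustration.

```python
import re

# Format inferred from the "(XEN)    IRQ: ..." lines in this mail (an assumption,
# not taken from the Xen source).
IRQ_LINE = re.compile(
    r"IRQ:\s*(?P<irq>\d+)\s+"
    r"affinity:(?P<affinity>[0-9a-fA-F,]+)\s+"
    r"vec:(?P<vec>[0-9a-fA-F]+)\s+"
    r"type=(?P<type>\S+)\s+"
    r"status=(?P<status>[0-9a-fA-F]+)\s+"
    r"in-flight=(?P<in_flight>\d+)\s+"
    r"domain-list=(?P<bindings>.*)"
)
# One guest binding looks like "2:127(---M)" or "1: 62(----)".
BINDING = re.compile(r"(?P<dom>\d+):\s*(?P<pirq>\d+)\((?P<flags>[^)]*)\)")


def find_stuck_irqs(dump_text):
    """Return (irq, domain, pirq) for bindings that are in flight and masked."""
    stuck = []
    for line in dump_text.splitlines():
        m = IRQ_LINE.search(line)
        if m is None or int(m.group("in_flight")) == 0:
            continue
        for b in BINDING.finditer(m.group("bindings")):
            if "M" in b.group("flags"):
                stuck.append(
                    (int(m.group("irq")), int(b.group("dom")), int(b.group("pirq")))
                )
    return stuck
```

It expects each IRQ entry on a single line, so entries that were wrapped by a mail client need to be rejoined first.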

This time both cases correspond to 10:00.3:

(XEN) 10:00.3 - dom 2   - MSIs < 132 >

(XEN)  MSI   132 vec=dc  fixed  edge   assert phys    cpu dest=00000010 mask=0/0/-1
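The IRQ-to-device lookup above can also be done mechanically. This is a rough sketch that assumes the "<bus:dev.fn> - dom N - MSIs < ... >" layout seen in these dumps; msi_owner_map is a hypothetical helper name, and treating every number between the angle brackets as an IRQ is my assumption about how multiple MSIs would be listed.

```python
import re

# Layout inferred from lines such as "(XEN) 10:00.3 - dom 2   - MSIs < 132 >".
DEV_LINE = re.compile(
    r"(?P<bdf>[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s+-\s+"
    r"dom\s+(?P<dom>\d+)\s+-\s+"
    r"MSIs\s+<(?P<irqs>[^>]*)>"
)


def msi_owner_map(dump_text):
    """Map each MSI IRQ number to the (bus:dev.fn, domain) that owns it."""
    owners = {}
    for line in dump_text.splitlines():
        m = DEV_LINE.search(line)
        if m is None:
            continue
        # Assumption: every run of digits inside "< ... >" is one IRQ number.
        for irq in re.findall(r"\d+", m.group("irqs")):
            owners[int(irq)] = (m.group("bdf"), int(m.group("dom")))
    return owners
```

Looking up a stuck IRQ number in the returned dict gives the passed-through function it belongs to.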

Let me know if there's anything else I can provide to assist in diagnosing this problem.



> (XEN)    IRQ:  66 affinity:00000000,00000000,00000000,00000001 vec:b9 type=PCI-MSI         status=00000010 in-flight=1 domain-list=1: 79(---M),
> (XEN)    IRQ:  67 affinity:00000000,00000000,00000000,00000004 vec:d9 type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 78(----),
> (XEN)    IRQ:  68 affinity:00000000,00000000,00000000,00000010 vec:22 type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 77(----),
> (XEN)    IRQ:  69 affinity:00000000,00000000,00000000,00000040 vec:2a type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 76(----),
> (XEN) 07:00.3 - dom 1   - MSIs < 69 >
> (XEN) 07:00.2 - dom 1   - MSIs < 68 >
> (XEN) 07:00.1 - dom 1   - MSIs < 67 >
> (XEN) 07:00.0 - dom 1   - MSIs < 66 >
> (XEN)  MSI    66 vec=b9  fixed  edge   assert phys    cpu dest=00000000 mask=0/0/-1
> (XEN)  MSI    67 vec=d9  fixed  edge   assert phys    cpu dest=00000004 mask=0/0/-1
> (XEN)  MSI    68 vec=22  fixed  edge   assert phys    cpu dest=00000002 mask=0/0/-1
> (XEN)  MSI    69 vec=2a  fixed  edge   assert phys    cpu dest=00000006 mask=0/0/-1
> Thanks.
> Dante

