
Re: [Xen-devel] domU and dom0 hung with Xen console interrupt binding showing in-flight=1, (---M)



>>> On 28.06.10 at 20:22, Dante Cinco <dantecinco@xxxxxxxxx> wrote:
> I have an HP Proliant DL380-G6 (dual Xeon E5540 @ 2.53GHz) with Xen 4.0.0
> and dom0 Linux 2.6.32.12 x86_64 pvops and domU Linux kernel 2.6.30.1 x86_64.
> I'm using PCI passthrough (pci-stub) to pass my 4-port 8Gb PMC-Sierra Fibre
> Channel HBA to domU. After running I/Os for several hours, both dom0 and
> domU hang and the Xen console shows the interrupt binding below where IRQ
> 66 shows in-flight=1 and mask set (---M). What's the best way to debug this
> problem?

There are potentially two problems here. One is that the guest may
fail to send the EOI notification. You would want to check whether
pirq_guest_eoi() ran after the last occurrence of the interrupt.

The more worrying part is that Xen should time out on a guest failing
to send the EOI notification, and ack the interrupt nevertheless.
Looking at the code I fail to see how the ack_APIC_irq() would get
sent in this case: non-maskable MSIs get this issued from
end_msi_irq(), but ->end doesn't get invoked from
irq_guest_eoi_timer_fn() (only ->enable does). Keir, am I missing
something?
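
To make the concern above concrete, here is a toy model in Python. It is
not Xen code: ToyHandler, eoi_timeout() and the lapic_acked field are
hypothetical names invented for this illustration, and the only behaviour
modelled is what is described above (the ack for a non-maskable MSI is
issued from the ->end hook, while the timeout path invokes only ->enable).

```python
class ToyHandler:
    """Hypothetical stand-in for an interrupt type's ->enable / ->end hooks."""

    def __init__(self, maskable):
        self.maskable = maskable
        self.masked = True          # interrupt is held off while in flight
        self.lapic_acked = False    # models whether the LAPIC ack was issued

    def enable(self):
        # Assumption: enabling can only unmask, and only if a mask bit exists.
        if self.maskable:
            self.masked = False

    def end(self):
        # Assumption taken from the description above: for non-maskable
        # MSIs the LAPIC ack is issued from the ->end hook.
        if not self.maskable:
            self.lapic_acked = True


def eoi_timeout(handler):
    """Models the timeout path as described: ->enable only, never ->end."""
    handler.enable()


stuck = ToyHandler(maskable=False)
eoi_timeout(stuck)
print("non-maskable MSI acked after timeout:", stuck.lapic_acked)
```

In this model the non-maskable case is never acked by the timeout alone,
which is the gap being asked about. Whether the real code has another path
that issues the ack is exactly the open question.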

On the other hand, I can't see how this can work reliably in the first place: since
there's no other way to mask such interrupts, sending an ack to the
LAPIC could result in an interrupt storm. Disabling MSI on the
affected device isn't a good option either, as we know there are
devices that switch to legacy IRQ mode irreversibly in that case,
and hence the device becomes unusable (presumably until being
reset). But very likely this would still be better than hanging the
entire box; it probably would just need a more graceful timeout.

Jan

> (XEN)    IRQ:  66 affinity:00000000,00000000,00000000,00000001 vec:b9 type=PCI-MSI         status=00000010 in-flight=1 domain-list=1: 79(---M),
> (XEN)    IRQ:  67 affinity:00000000,00000000,00000000,00000004 vec:d9 type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 78(----),
> (XEN)    IRQ:  68 affinity:00000000,00000000,00000000,00000010 vec:22 type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 77(----),
> (XEN)    IRQ:  69 affinity:00000000,00000000,00000000,00000040 vec:2a type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 76(----),
> 
> (XEN) 07:00.3 - dom 1   - MSIs < 69 >
> (XEN) 07:00.2 - dom 1   - MSIs < 68 >
> (XEN) 07:00.1 - dom 1   - MSIs < 67 >
> (XEN) 07:00.0 - dom 1   - MSIs < 66 >
> 
> (XEN)  MSI    66 vec=b9  fixed  edge   assert phys    cpu dest=00000000 mask=0/0/-1
> (XEN)  MSI    67 vec=d9  fixed  edge   assert phys    cpu dest=00000004 mask=0/0/-1
> (XEN)  MSI    68 vec=22  fixed  edge   assert phys    cpu dest=00000002 mask=0/0/-1
> (XEN)  MSI    69 vec=2a  fixed  edge   assert phys    cpu dest=00000006 mask=0/0/-1
> 
> Thanks.
> 
> Dante
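
For watching a long I/O run, it may help to scan captured console output
for bindings in the same state as IRQ 66 above. The following is a minimal
Python sketch, assuming the dump has exactly the textual layout quoted in
this message (possibly still ">"-quoted and wrapped by a mail client). The
function names are made up for this example, and treating "in-flight
non-zero plus an M in the flag string" as suspicious is this sketch's own
heuristic, not something Xen itself reports.

```python
import re

# Pattern for one IRQ record as it appears in the quoted console output.
IRQ_RE = re.compile(
    r"IRQ:\s*(?P<irq>\d+)\s+affinity:(?P<affinity>[0-9a-fA-F,]+)\s+"
    r"vec:(?P<vec>[0-9a-fA-F]+)\s+type=(?P<type>\S+)\s+"
    r"status=(?P<status>[0-9a-fA-F]+)\s+in-flight=(?P<in_flight>\d+)\s+"
    r"domain-list=(?P<bindings>.*)"
)
# One "domain: pirq(flags)" entry inside domain-list, e.g. "1: 79(---M)".
BINDING_RE = re.compile(r"(\d+):\s*(\d+)\(([^)]*)\)")


def join_records(text):
    """Strip mail quoting and glue wrapped continuation lines back on."""
    records = []
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()
        if not line:
            continue
        if line.startswith("(XEN)") or not records:
            records.append(line)        # a new console line starts here
        else:
            records[-1] += " " + line   # wrapped tail of the previous line
    return records


def parse_irq_dump(text):
    """Return one dict per IRQ record; other console lines are ignored."""
    entries = []
    for record in join_records(text):
        m = IRQ_RE.search(record)
        if m is None:
            continue
        bindings = [
            {"domain": int(d), "pirq": int(p), "flags": f}
            for d, p, f in BINDING_RE.findall(m.group("bindings"))
        ]
        entries.append({
            "irq": int(m.group("irq")),
            "vec": m.group("vec"),
            "type": m.group("type"),
            "in_flight": int(m.group("in_flight")),
            "bindings": bindings,
        })
    return entries


def suspicious_irqs(entries):
    """Heuristic: in-flight is non-zero and some binding shows an M flag."""
    return [
        e["irq"] for e in entries
        if e["in_flight"] > 0 and any("M" in b["flags"] for b in e["bindings"])
    ]
```

Calling suspicious_irqs(parse_irq_dump(text)) on the four IRQ records
quoted above returns [66]. Running it periodically against the console log
would at least show when the binding first enters that state.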




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

