
Re: [Xen-devel] cpuidle and un-eoid interrupts at the local apic



Andrew Cooper wrote on 2013-08-12:
> On 12/08/13 11:05, Jan Beulich wrote:
>>>>> On 12.08.13 at 11:28, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 12/08/13 09:20, Jan Beulich wrote:
>>>>>>> On 09.08.13 at 23:27, "Thimo E." <abc@xxxxxxxxxx> wrote:
>>>>> (XEN) **Pending EOI error
>>>>> (XEN)   irq 29, vector 0x24
>>>>> (XEN)   s[0] irq 29, vec 0x24, ready 0, ISR 00000001, TMR 00000000, IRR 00000000
>>>>> (XEN) All LAPIC state:
>>>>> (XEN) [vector]      ISR      TMR      IRR
>>>>> (XEN) [1f:00] 00000000 00000000 00000000
>>>>> (XEN) [3f:20] 00000010 76efa12e 00000000
>>>>> (XEN) [5f:40] 00000000 e6f0f2fc 00000000
>>>>> (XEN) [7f:60] 00000000 32d096ca 00000000
>>>>> (XEN) [9f:80] 00000000 78fcf87a 00000000
>>>>> (XEN) [bf:a0] 00000000 f9b9fe4e 00000000
>>>>> (XEN) [df:c0] 00000000 ffdfe7ab 00000000
>>>>> (XEN) [ff:e0] 00000000 00000000 00000000
>>>>> (XEN) Peoi stack trace records:
>>>> Mind providing (a link to) the patch that was used here, so that
>>>> one can make sense of the printed information (and perhaps also
>>>> suggest adjustments to that debugging code)? Nothing I was able to
>>>> find on the list fully matches the output above...
>>> Attached
>> Thanks. Actually, the second case he sent has an interesting
>> difference:
>> 
>> (XEN)   s[0] irq 29, vec 0x26, ready 0, ISR 00000001, TMR 00000000, IRR 00000001
>> 
>> i.e. we in fact have _three_ instances of the interrupt (two
>> in-service, and one request). I don't see an explanation for this
>> other than buggy hardware. Sadly we still don't know what device it
>> is that is behaving that way (including the confirmation that it's a
>> non-maskable MSI one).
>> 
>> Jan
>> 
> 
> On the XenServer hardware where we have seen this issue, the
> problematic interrupt was from:
> 
> 00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 02)
>     Subsystem: Intel Corporation Device 0000
>     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>     Latency: 0
>     Interrupt: pin A routed to IRQ 1275
>     Region 0: Memory at c2700000 (32-bit, non-prefetchable) [size=128K]
>     Region 1: Memory at c273e000 (32-bit, non-prefetchable) [size=4K]
>     Region 2: I/O ports at 7080 [size=32]
>     Capabilities: [c8] Power Management version 2
>         Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>         Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>     Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>         Address: 00000000fee00318  Data: 0000
>     Capabilities: [e0] PCI Advanced Features
>         AFCap: TP+ FLR+
>         AFCtrl: FLR-
>         AFStatus: TP-
>     Kernel driver in use: e1000e
>     Kernel modules: e1000e
> 
> I am still attempting to reproduce the issue, but we haven't seen it
> again since my email at the root of this thread.
Have you seen the issue on other HSW machines without this NIC? Also, Thimo,
have you tried pinning the vcpu and stopping irqbalance in dom0?

> 
> ~Andrew
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel


Best regards,
Yang
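
For anyone reading the "All LAPIC state" dump quoted near the top of this
message, here is a minimal Python sketch of how the rows can be decoded into
vector numbers. It assumes that each row label is "highest:lowest vector" and
that bit n of a 32-bit word corresponds to vector (lowest + n). That matches
the usual layout of the local APIC ISR/TMR/IRR registers, but the debugging
patch's exact print format is an assumption here. The names LAPIC_ROWS and
set_vectors are made up for this illustration.

```python
# Rows copied from the "All LAPIC state" dump quoted in this thread:
# (vector range label, ISR word, TMR word, IRR word), all hexadecimal.
LAPIC_ROWS = [
    ("1f:00", 0x00000000, 0x00000000, 0x00000000),
    ("3f:20", 0x00000010, 0x76efa12e, 0x00000000),
    ("5f:40", 0x00000000, 0xe6f0f2fc, 0x00000000),
    ("7f:60", 0x00000000, 0x32d096ca, 0x00000000),
    ("9f:80", 0x00000000, 0x78fcf87a, 0x00000000),
    ("bf:a0", 0x00000000, 0xf9b9fe4e, 0x00000000),
    ("df:c0", 0x00000000, 0xffdfe7ab, 0x00000000),
    ("ff:e0", 0x00000000, 0x00000000, 0x00000000),
]


def set_vectors(rows, column):
    """Return the vectors whose bit is set in one register column.

    column: 0 = ISR (in service), 1 = TMR, 2 = IRR (requested).
    Assumes bit n of a row's word maps to vector (row base + n).
    """
    vectors = []
    for label, *words in rows:
        # The label is "highest:lowest"; the lowest vector is the row base.
        base = int(label.split(":")[1], 16)
        word = words[column]
        for bit in range(32):
            if word & (1 << bit):
                vectors.append(base + bit)
    return vectors


if __name__ == "__main__":
    print("in service:", [hex(v) for v in set_vectors(LAPIC_ROWS, 0)])
    print("requested: ", [hex(v) for v in set_vectors(LAPIC_ROWS, 2)])
```

Under those assumptions, the only in-service bit in the first dump decodes to
vector 0x24, which agrees with the "irq 29, vector 0x24" line of the error
report, and no vector is shown as requested.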


