[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86: adjust handling of interrupts coming in via legacy vectors

>>> On 14.05.12 at 15:33, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
> On 14/05/12 13:55, Jan Beulich wrote:
>>>>> On 14.05.12 at 14:39, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>>> The debugging code added in c/s 24707:96987c324a4f was hit a (small)
>>> number of times (one report being
>>> http://lists.xen.org/archives/html/xen-devel/2012-05/msg00332.html),
>>> apparently always with a vector within the legacy range. Obviously,
>>> besides legacy vectors not normally expected to be in use on systems
>>> with IO-APIC(s), they should never make it to the IRQ migration logic.
>>> This wasn't being prevented so far: Since we don't have a one-to-one
>>> mapping between vectors and IRQs - legacy IRQs may have two vectors
>>> associated with them (one used in either 8259A, the other used in one
>>> of the IO-APICs) -, vector-to-IRQ translations for legacy vectors (as
>>> used in do_IRQ()) would yield a valid IRQ number despite the IRQ
>>> really being handled via an IO-APIC.
>>> This gets changed here - disable_8259A_irq() zaps the legacy vector-to-
>>> IRQ mapping, and enable_8259A_irq(), should it ever be called for a
>>> particular interrupts, restores it.
>>> Additionally, the spurious interrupt logic in do_IRQ() gets adjusted
>>> too: Interrupts coming in via legacy vectors obviously didn't get
>>> reported through the IO-APIC/LAPIC pair (as we never program these
>>> vectors into any RTE), and hence shouldn't get ack_APIC_irq() called on
>>> them. Instead, a new function (pointer) bogus_8259A_irq() gets used to
>>> have the 8259A driver take care of the bogus interrupt (as outside of
>>> automatice EOI mode it may need an EOI to be issued for it to prevent
>>> other interrupts that may legitimately go through the 8259As from
>>> getting masked out).
>> Note that this patch does not make any attempt at dealing with the
>> underlying issue that causes the bogus interrupt(s) to show up. If
>> my analysis is right, we shouldn't see crashes anymore, but instead
>> observe instances of spurious interrupts on legacy vectors. It would
>> certainly be nice to have an actual proof of this (albeit I realize that
>> this isn't readily reproducible), in order to then - if indeed behaving
>> as expected - add debugging code to identify whether such interrupts
>> in fact get raised by one of the 8259A-s (particularly printing the
>> cached and physical mask register values), or whether they get
>> introduced into the system by yet another obscure mechanism.
>> One particular thing I'm suspicious about are the numerous aliases
>> to the two (each) 8259A I/O ports that various chipsets have: What
>> if some component in Dom0 accesses one of the alias ports in order
>> to do something specific to a non-standard platform (say, probe for
>> some special hardware interface), not realizing that it actually plays
>> with PIC state? Linux under the same conditions wouldn't severely
>> suffer - as it has a 1:1 vector <-> IRQ translation, it likely would
>> merely observe an extra interrupt.
> On the whole, the patch looks sensible, but what happens if the spurious
> interrupt is coming in through the Local APIC ?  If this is the case,
> then we still need to ACK it, even if it is a bogus PIC interrupt.
> Perhaps in irq.c, the changes should check whether the observed vector
> has been raised in the LAPIC and ack it, and then decide whether it is
> bogus or not.

Should that really turn out to be the case, we're in much bigger trouble,
as then we need an explanation how an interrupt at that vector could
have got raised in the first place. I'd therefore like to keep the current
change deal only with things that we know can happen.

> Might it also be sensible to remove dom0's permissions to use the PIC
> ports, in case it is some weird issue like that?

That's already being done iirc. The problem is that it's non-trivial (and
perhaps non-reliable) to determine the aliases, and hence we can't
blindly remove more than the two real ports from Dom0's permitted


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.