
Re: [Xen-devel] Reentrant NMIs, MCEs and interrupt stack tables.



>>> On 21.11.12 at 22:17, Tim Deegan <tim@xxxxxxx> wrote:
> At 21:06 +0000 on 21 Nov (1353532004), Andrew Cooper wrote:
>> Hello,
>> 
>> While working on a fix for the rare-but-possible problem of reentrant
>> NMIs and MCEs, I have discovered that it is sadly possible to generate
>> fake NMIs and MCEs which will run the relevant handlers on the relevant
>> stacks, without invoking any of the other CPU logic for these special
>> interrupts.
>> 
>> A fake NMI can be generated by a processor in PIC mode as opposed to
>> Virtual wire mode, with a delivery of vector 2.  This setup is certainly
>> possible on a 64bit CPU, but I doubt there are many 64bit CPUs running
>> with only PIC.
>> 
>> A fake MCE is easy to generate.  A mal-programmed IO-APIC, IOMMU or
>> MSI/MSI-X entry which delivers vector 18 (0x12) is sufficient.  The LAPIC
>> will reject vectors 0 thru 0xf, but will deliver vectors 0x10 thru 0x1f,
>> despite them being architecturally reserved for exceptions.
> 
> You're not suggesting these could be caused by guest activity?
> 
>> The possibility of these fake interrupts (however unlikely) means that
>> there is necessarily a race condition between receiving a fake interrupt
>> and a genuine interrupt during which the handler cannot fix up the stack
>> sufficiently to be able to safely get back out.  If this race condition
>> were to occur, the real interrupt will corrupt the exception frame of
>> the fake interrupt, meaning that we cannot possibly resume the original
>> context.  This situation can be detected, but cannot be corrected, and
>> the only course of action is to crash gracefully.
> 
> If one of these could only be caused by a bug in Xen, then I don't think
> we need to handle it at all.

Fully agree - the nesting we need to deal with cleanly is only
what can result from proper operation. Buggy operation should
not require any extra effort, as long as it's only hypothetical
(i.e. if we knew a certain chipset/CPU could cause this, the need
for a workaround would surely arise; bugs in Xen we should treat
as such rather than trying to work around their effects).

> If it's trivial to detect it and crash cleanly, that would be nice.

That shouldn't be too difficult, as such interrupts would set ISR bits
in either the PIC or the LAPIC.
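
As a rough sketch of the LAPIC side of that check (not Xen's actual code;
the helper name and the idea of passing in a snapshot of the ISR are made
up for illustration): the LAPIC ISR is 256 bits wide, exposed as eight
32-bit registers, with vector v tracked in bit v % 32 of register v / 32.
A genuine #MC (vector 18, 0x12) is raised as an exception and sets no ISR
bit, so finding that bit set in the handler would indicate the entry was
delivered as an ordinary interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_ISR_WORDS 8   /* eight 32-bit ISR registers */
#define VECTOR_NMI   2
#define VECTOR_MCE   18  /* #MC, 0x12 */

static_assert(NR_ISR_WORDS * 32 == 256, "ISR must cover all 256 vectors");

/*
 * Hypothetical helper: test whether 'vector' is marked in-service in a
 * snapshot of the LAPIC ISR.  How the snapshot is read from the LAPIC is
 * left to the caller and is not shown here.
 */
static bool isr_vector_in_service(const uint32_t isr[NR_ISR_WORDS],
                                  unsigned int vector)
{
    if (vector >= NR_ISR_WORDS * 32)
        return false;   /* out of range: no such vector */

    return (isr[vector / 32] >> (vector % 32)) & 1u;
}
```

A handler could call isr_vector_in_service(isr, VECTOR_MCE) on entry and
crash cleanly if it returns true.  The PIC case would need a separate
check, since the PIC's ISR is indexed by IRQ line rather than by vector.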

Jan

