
Re: [Xen-devel] Woes of NMIs and MCEs, and possibly how to fix



On 03/12/12 11:24, George Dunlap wrote:
On Fri, Nov 30, 2012 at 5:34 PM, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:

    3) SMM mode executing an iret will re-enable NMIs.  There is nothing we
    can do to prevent this, and as an SMI can interrupt NMIs and MCEs, no
    way to predict if/when it may happen.  The best we can do is accept that
    it might happen, and try to deal with the after effects.


Did you actually mean IRET, or did you mean RSM? Does it make a difference?

If, for some obscure reason, the SMM code decides to run something like "int 0x21", where the int 0x21 handler ends with the rather predictable IRET to return to its caller, then you would indeed "unlock" the NMI blocking that the processor applies when it takes an NMI. An NMI will still not interrupt the SMM code, but it WILL interrupt the code that was running before the SMI was taken, which could be an NMI handler that doesn't expect another NMI.

RSM doesn't, in and of itself [unless the SMM code has been "messing" with the saved state], alter the NMI state in any way other than "restore to previous value".
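
As a rough illustration of the two cases, here is a toy model in Python. It is only a sketch of the behaviour as described in this mail, not of any particular processor's documented SMM semantics. The names ToyCpu and nested_nmi_after_smi are made up for the example, pending NMIs are not modelled, and it assumes that RSM simply leaves the NMI-blocking state as the SMM code left it.

```python
class ToyCpu:
    """Toy model of NMI blocking as described in this thread (not real hardware)."""

    def __init__(self):
        self.nmi_blocked = False  # set on NMI delivery, cleared by any IRET
        self.in_smm = False       # NMIs are never delivered while in SMM
        self.nmi_depth = 0        # number of NMI handlers currently active

    def nmi(self):
        """Raise an NMI; return True if the handler is entered immediately."""
        if self.in_smm or self.nmi_blocked:
            return False          # held off (pending state is not modelled)
        self.nmi_blocked = True
        self.nmi_depth += 1
        return True

    def iret(self):
        """IRET clears NMI blocking, even when executed inside SMM."""
        self.nmi_blocked = False
        if not self.in_smm and self.nmi_depth > 0:
            self.nmi_depth -= 1   # an NMI handler returned normally

    def smi(self):
        self.in_smm = True

    def rsm(self):
        """Assumption: RSM itself does not change the blocking state."""
        self.in_smm = False


def nested_nmi_after_smi(smm_executes_iret):
    """Return (NMI taken inside SMM?, NMI handler nesting depth after RSM)."""
    cpu = ToyCpu()
    cpu.nmi()                     # first NMI enters its handler
    cpu.smi()                     # SMI interrupts the NMI handler
    if smm_executes_iret:
        cpu.iret()                # e.g. the tail of an "int 0x21" handler
    taken_in_smm = cpu.nmi()      # NMI raised while still in SMM
    cpu.rsm()
    cpu.nmi()                     # NMI raised once back in the NMI handler
    return taken_in_smm, cpu.nmi_depth
```

With no IRET in SMM the second NMI stays blocked and the nesting depth remains 1. With an IRET in SMM, the NMI is still held off while in SMM, but after RSM it lands on top of the first handler and the depth becomes 2.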

    As for one possible solution which we can't use:

    If it were not for the sysret stupidness[1] of requiring the hypervisor
    to move to the guest stack before executing the `sysret` instruction, we
    could do away with the stack tables for NMIs and MCEs altogether, and
    the above craziness would be easy to fix.  However, the overhead of
    always using iret to return to ring3 is not likely to be acceptable,
    meaning that we cannot "fix" the problem by discarding interrupt stacks
    and doing everything properly on the main hypervisor stack.


64-bit Intel processors have SYSEXIT, right? It's worth pointing out the following alternatives, even if we never actually use them:

1. Use SYSEXIT on Intel processors and let the bugs (or some subset of them) remain on AMD
2. Use SYSEXIT on Intel processors and IRET on AMD

Given that AMD has cut back their investment in OSS development, and is talking about moving to ARM, it may only be a matter of time before Intel is the only important player in the x86 world.

Surely we would still want to support existing machines made with AMD processors. And as far as possible, we should keep the code architecture independent. We do not want a bunch of "IF processor=INTEL" in the assembler code [and even less, "#if BUILD_FOR_INTEL" and separate binaries, I would expect].

SYSCALL and SYSRET are the pair of instructions corresponding to SYSENTER and SYSEXIT, but for 64-bit OSes (don't ask me why they decided to add a new pair of instructions rather than just alter the behaviour of SYSENTER/SYSEXIT; I'm sure there was some reason for this, but it's beyond my understanding).

--
Mats
