
Re: [Xen-devel] [PATCH] x86/boot: Make alternative patching NMI-safe



On 05/02/18 16:20, Jan Beulich wrote:
>>>> On 05.02.18 at 16:16, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 05/02/18 14:09, Jan Beulich wrote:
>>>>>> On 05.02.18 at 11:24, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>> During patching, there is a very slim risk that an NMI or MCE interrupts
>>>> in the middle of altering the code in the NMI/MCE paths, in which case
>>>> bad things will happen.
>>>>
>>>> The NMI risk can be eliminated by running the patching loop in NMI
>>>> context, at which point the CPU will defer further NMIs until patching
>>>> is complete.
>>>>
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>> So you continue to think that the risk of hitting an #MC here is
>>> acceptable, despite there being a solution to the problem. To be
>>> honest, I find this a little strange. (I do agree that there's no
>>> good solution to the similar live patching problem.)
>> The risk is already sufficiently tiny that in 3 years, it hasn't been
>> observed, nor do I think it is likely to be observed in the future.  At
>> this point on boot, there is nothing interesting set up, which further
>> reduces the risk of an MCE.
>>
>> Furthermore, whether or not Xen survives the MCE (and I don't believe I've
>> ever seen Xen successfully recover from an MCE), the hardware is faulty
>> and needs replacing (modulo cosmic rays, but the chances of those really
>> are astronomical).
>>
>> Irrespective of that, there is no way I'm aware of for generating MCEs
>> on demand, and therefore, no way of testing the logic.  For that reason
>> alone, I don't think it is wise to be taking complicated invasive logic.
> Your first attempt, minus the live patching parts, wasn't all that
> invasive, and the code would have been the same for NMI and
> #MC (so the testing argument also doesn't apply, as NMIs are
> not uncommon).

But as you correctly pointed out, it was a very long way from being
complete.  We currently have no idea whether we are in NMI context, so
arranging not to execute an iret is hard.
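
As a rough illustration of why the iret matters, here is a toy model in C. It
is not Xen code, and every name in it (cpu_model, raise_nmi, do_iret,
nested_nmis_while_patching) is hypothetical. It only assumes the architectural
behaviour that delivering an NMI blocks further NMIs until the next iret, and
that any iret lifts the block, including one executed by an #MC handler that
interrupted the NMI handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the x86 NMI-blocking latch. Hypothetical, not Xen code. */
struct cpu_model {
    bool nmi_blocked;   /* set when an NMI is delivered, cleared by iret */
    bool nmi_pending;   /* one NMI can be held pending while blocked */
    int  nmi_delivered; /* number of NMIs whose handler actually started */
};

/* An NMI arrives: deliver it, or hold it pending if NMIs are blocked. */
static void raise_nmi(struct cpu_model *c)
{
    if (c->nmi_blocked) {
        c->nmi_pending = true;
        return;
    }
    c->nmi_blocked = true;
    c->nmi_delivered++;
}

/* Any iret unblocks NMIs, so a held NMI is delivered straight away. */
static void do_iret(struct cpu_model *c)
{
    c->nmi_blocked = false;
    if (c->nmi_pending) {
        c->nmi_pending = false;
        raise_nmi(c);
    }
}

/*
 * Patch in NMI context, with a second NMI and an #MC arriving mid-patch.
 * Returns how many NMIs were delivered inside the patching window, not
 * counting the one used to enter NMI context.
 */
static int nested_nmis_while_patching(bool mce_returns_with_iret)
{
    struct cpu_model c = { false, false, 0 };

    raise_nmi(&c);          /* enter NMI context to start patching */
    assert(c.nmi_blocked);  /* further NMIs are now deferred */

    raise_nmi(&c);          /* external NMI arrives: held pending */

    /* #MC is not blocked by the NMI latch; model only its exit path. */
    if (mce_returns_with_iret)
        do_iret(&c);

    return c.nmi_delivered - 1;
}
```

In this model, nested_nmis_while_patching(false) returns 0 and
nested_nmis_while_patching(true) returns 1: the #MC handler's iret lets the
held NMI run on top of half-patched code. Avoiding that iret requires the
exit path to know it was entered from NMI context.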

In some copious free time, I should try to port Linux's re-entrant NMI
logic again.  I tried once before and it all came unstuck because of
intermixed NMIs and MCEs.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

