[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/HVM: correctly deal with benign exceptions when combining two



>>> On 10.04.19 at 01:58, <andrew.cooper3@xxxxxxxxxx> wrote:
> Across the instruction writeback boundary (priority 10 around to 1),
> PENDING_DBG persists, and feeds into exception considerations at
> priority 2 and 4.  This is why they manifest as traps, but they can be
> equally thought of as faults before the fetch of the next instruction.

Ah yes, this is perhaps a helpful way of looking at it, in particular
wrt behavior of S/G insns running into other than #DB after having
successfully completed at least one element.

>> That's also the way the XSA-156 advisory describes it.
> 
> XSA-156 was written at a time when we both had far less authority to
> comment on the specific details.  Certainly as far as I am concerned,
> the past couple of years have made a massive difference, not least the
> several months spent debugging the STI/singlestep issue with Gil.
> 
> The mistake in XSA-156 is the description of "upon completion of the
> delivery of the first exception".
> 
> The infinite loop occurs because delivery of the first #DB is never
> considered complete (as #DB is still considered pending once the
> exception frame was written, because it was triggered in the process of
> delivering the exception), and therefore does not move from priority 4
> to 5, which would allow an NMI to break the cycle.
> 
> Similarly for the #AC case, priority never moves from 10 back to 1,
> because delivery of the first #AC is never seen to have completed.

To be honest, to me this continues to be a (mis-)implementation
detail. A proper architectural specification would not allow for such
pathological cases in the first place.

>>> For VT-x, we're supposed to fill in the PENDING_DBG control rather than
>>> raise #DB directly, explicitly to allow the priority of interrupts to be
>>> expressed correctly.  There is a very large quantity of untangling
>>> working to do, and very clear I'm not going to have time to fix even the
>>> STI/Singlestep issue in the 4.12 timeframe.
>> Are you saying there need to be any vendor specific adjustments
>> to this function then? I would very much hope that the code here
>> could remain vendor independent, with vendor specific adjustments
>> done in vendor specific code, and independently of the proposed
>> change here.
> 
> I think the better course of action is for {vmx,svm}_inject_event() to
> gain adjustments to their #DB handling.
> 
> The PENDING_DBG control is specifically an accumulation of various
> debugging conditions, which will cause #DB traps to be delivered at the
> appropriate point early in the next instruction cycle.

Agreed - this would also pave the road to actually support raising
data #DB from emulated insns.

>>>> Sadly neither AMD nor Intel really define what happens with two benign
>>>> exceptions - the term "sequentially" used by both is poisoned by how the
>>>> combining of benign and non-benign exceptions is described. Since NMI,
>>>> #MC, and hardware interrupts are all benign and (perhaps with the
>>>> exception of #MC) can't occur second, favor the first in order to not
>>>> lose it.
>>> #MC has the highest priority so should only be recognised immediately
>>> after an instruction boundary.
>> Are you sure? What about an issue with one of the memory
>> accesses involved in delivering a previously raised exception?
> 
> #MC is abort, and is imprecise.

Mind me correcting this to "may be imprecise": There's a flag after
all telling whether in fact it is.

>  It bears no relationship to the content
> of the iret frame it is observed with.  This is why details of the issue
> are stored in MSRs and not on the stack - the CPU can be hundreds of
> instructions further on before an L3 cache/IIO #MC propagates back
> through the uncore.
> 
> Despite being classified as an exception, #MC behaves much more like an
> NMI from the OS point of view, except that getting two of them
> back-to-back is totally fatal.
> 
>>
>>> I don't however see a way of stacking #AC, because you can't know that
>>> one has occured until later in the instruction cycle than all other
>>> sources.  What would happen is that you'd raise #AC from previous
>>> instruction, and then recognise #MC while starting to execute the #AC
>>> entry point.  (I think)
>> Well - see XSA-156 for what hardware does in that situation.
> 
> The details of XSA-156 are inaccurate.
> 
> #AC can stack, but the problem only manifests when Xen emulates an
> injection, which is restricted to SVM for the moment.
> 
> That said, I'm considering moving it back to being common to provide
> architectural behaviour despite the silicon issue which causes XSA-170

Would you mind helping me make the connection between #AC
delivery (and its emulation) and XSA-170, being about VM entry
with non-canonical %rip?

>> Irrespective of all of the above - what am I to take from your
>> response? I.e. what adjustments (within the scope of this patch)
>> do you see necessary for the change to become acceptable?
> 
> By and large, I think the actual code changes are ok.  They are a
> general improvement, but I am not comfortable with the factual
> inaccuracies in the commit message and the comments.
> 
> I will see if I can rephrase it suitably.

I appreciate this, first and foremost because despite all of your
explanations I'm afraid I still can't see the inaccuracies there.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.