
RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler



>>> On 11.10.11 at 11:51, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
> Jan Beulich wrote:
>> If the prefetch was from Xen space (only in guest context),
>> delivering a vMCE to the guest is pointless (and perhaps confusing to
>> the guest). 
>> 
> 
> Yes, exactly. How about deferring the handling, as follows:
> * at mce isr
>       if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs))
>               xen panic;
> * at mce softirq
>       if ( (srar error) && (EIPV == 0) && (broken page owned by hypervisor) )
>               xen panic;
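
For reference, the two checks proposed above could be sketched in C
roughly as follows. This is only an illustration of the decision logic,
not code from the patch: struct mce_info, enum mce_action and both
function names are hypothetical, and it assumes "EIPV" refers to the
EIPV bit of IA32_MCG_STATUS (bit 1; RIPV is bit 0).

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCG_STATUS bits. */
#define MCG_STATUS_RIPV (1ULL << 0)   /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1)   /* error IP valid */

enum mce_action { MCE_CONTINUE, MCE_PANIC };

/* Hypothetical summary of what the handler knows about one event. */
struct mce_info {
    uint64_t gstatus;          /* value read from IA32_MCG_STATUS */
    bool guest_mode;           /* interrupted context was a guest */
    bool is_srar;              /* error classified as SRAR */
    bool page_owned_by_xen;    /* broken page belongs to the hypervisor */
};

/* Stage 1, in the MCE ISR: Xen itself cannot be restarted. */
static enum mce_action mce_isr_check(const struct mce_info *m)
{
    if (!(m->gstatus & MCG_STATUS_RIPV) && !m->guest_mode)
        return MCE_PANIC;
    return MCE_CONTINUE;
}

/* Stage 2, in the MCE softirq: async SRAR on a hypervisor-owned page. */
static enum mce_action mce_softirq_check(const struct mce_info *m)
{
    if (m->is_srar && !(m->gstatus & MCG_STATUS_EIPV) &&
        m->page_owned_by_xen)
        return MCE_PANIC;
    return MCE_CONTINUE;
}
```

Note that in this sketch a prefetch from Xen space that hits while in
guest context passes the first check and is only caught by the second.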

Possible, but I'm not convinced.

>>>   * guest may kill app, kernel thread, guest itself, or whatever;
>>> 
>>> The error is still an error, with 2 possibilities in the future:
>>>   1. it may not be consumed as an SRAR error; the system keeps going,
>>>      and a h/w mechanism (i.e. memory scrub) may detect an SRAO error
>>>      at some point, which is handled then;
>>>   2. it may be consumed at some point, triggering an SRAR error
>>>      again. At that time, 1) if the SRAR occurred in hypervisor
>>>      context, Xen will panic; or 2) if the SRAR occurred in guest
>>>      context, Xen kills the guest as a malicious one (as the 2nd
>>>      patch does) and moves the page to the broken page list;
>>> 
>>> Considering how rare the above case is, I think it's acceptable
>>> to handle it in this way. Thoughts?
>> 
>> You're only discussing instruction fetches (which can be discarded),
>> but you're not covering the other example I gave (GDT access from
>> guest context - just as this is a ring-0 operation from the paging
>> unit's pov, this ought to be an out-of-context operation from MCE's
>> perspective).
> 
> That would be a data load error (EIPV=1), i.e. a synchronous error.

If indeed implemented that way in hardware, that would make the
handling ambiguous: A GDT access must not (unconditionally) be
attributed to the (pv) guest, as it is not a problem the guest can
(necessarily) deal with (considering the split page ownership of
what constitutes the GDT under Xen, the guest should only be
accountable for the non-reserved part of the GDT, the rest should
be attributed back to Xen).
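
To illustrate the kind of split attribution I mean, here is a minimal
sketch. All names are hypothetical, and the boundary (first 14 pages
guest-controlled, the remainder reserved for Xen) is an assumption made
for the example; it also presumes the handler can tell which part of
the GDT the failing access hit, which is exactly what is in question.

```c
#include <stdint.h>

#define GDT_PAGE_SIZE    4096u
/* Assumed number of guest-controlled GDT pages (illustrative only). */
#define GUEST_GDT_PAGES  14u
#define GUEST_GDT_BYTES  (GUEST_GDT_PAGES * GDT_PAGE_SIZE)

enum mce_owner { OWNER_GUEST, OWNER_XEN };

/*
 * Attribute an error on a GDT access by the byte offset of the
 * descriptor within the GDT: the non-reserved part is the guest's
 * responsibility, the reserved part is Xen's.
 */
static enum mce_owner gdt_error_owner(uint32_t gdt_offset)
{
    return gdt_offset < GUEST_GDT_BYTES ? OWNER_GUEST : OWNER_XEN;
}
```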

The same would go for (perhaps speculative) page table walks.

Furthermore, data prefetching is possible too - how would a problem
there get reported?

Jan




 

