Jan Beulich wrote:
>>>> On 11.10.11 at 11:51, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>> Jan Beulich wrote:
>>> If the prefetch was from Xen space (only in guest context),
>>> delivering a vMCE to the guest is pointless (and perhaps confusing
>>> to the guest).
>>>
>>
>> Yes, exactly. How about delaying the handling, as follows:
>> * at mce isr
>> if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs))
>> xen panic;
>> * at mce softirq
>> if ( (srar error) && (EIPV ==0) && (broken page owned by
>> hypervisor) ) xen panic;
>
> Possible, but I'm not convinced.
>
>>>> * guest may kill app, kernel thread, guest itself, or whatever;
>>>>
>>>> The error is still an error, w/ 2 possibilities in the future:
>>>> 1. it may not be consumed as an SRAR error and the system keeps
>>>> going; a h/w mechanism (i.e. memory scrub) may detect an SRAO
>>>> error at some point, and it is handled then;
>>>> 2. it may be consumed at some point, triggering an SRAR error
>>>> again. At that time, a) if the SRAR occurred in hypervisor
>>>> context, xen will panic; or b) if the SRAR occurred in guest
>>>> context, xen kills the guest as a malicious one (as the 2nd
>>>> patch does) and moves the page to the broken page list;
>>>>
>>>> Considering the rare possibility of the above case, I think it's
>>>> acceptable to handle it in this way. Thoughts?
>>>
>>> You're only discussing instruction fetches (which can be discarded),
>>> but you're not covering the other example I gave (GDT access from
>>> guest context - just like this is a ring-0 operation from the
>>> paging unit's pov, this ought to be an out-of-context operation
>>> from MCE's perspective).
>>
>> That would be a data load error (EIPV=1), a sync error.
>
> If indeed implemented that way in hardware, that would make the
> handling ambiguous: A GDT access must not (unconditionally) be
> attributed to the (pv) guest, as it is not a problem the guest can
> (necessarily) deal with (considering the split page ownership of
> what constitutes the GDT under Xen, the guest should only be
> accountable for the non-reserved part of the GDT, the rest should
> be attributed back to Xen).
>
> The same would go for (perhaps speculative) page table walks.
>
It doesn't seem ambiguous here: whoever owns the page takes the error.
If the error is caused by the hypervisor accessing a broken page, xen panics;
If the error is caused by a guest access, the guest handles it (I guess
normally by killing itself);
If the guest maliciously accesses the page again, it is killed by the hypervisor.
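
The rule above could be sketched roughly as below. This is only an illustration under my own assumptions: the type and function names (page_owner, srar_event, decide_srar_action) are hypothetical and are not existing Xen identifiers, and "already_reported" stands in for whatever bookkeeping would record that a vMCE was already delivered for the page.

```c
#include <stdbool.h>

/* Hypothetical types for illustration only; not taken from the Xen tree. */
enum page_owner { OWNER_XEN, OWNER_GUEST };
enum mce_action { ACTION_XEN_PANIC, ACTION_INJECT_VMCE, ACTION_KILL_GUEST };

struct srar_event {
    enum page_owner owner;   /* who owns the broken page */
    bool guest_mode;         /* was the CPU executing guest code at the time */
    bool already_reported;   /* was a vMCE for this page delivered before */
};

/* "Who owns, who takes": choose the action from page ownership and context. */
enum mce_action decide_srar_action(const struct srar_event *ev)
{
    /* Xen-owned page, or an access made by the hypervisor itself: fatal. */
    if (ev->owner == OWNER_XEN || !ev->guest_mode)
        return ACTION_XEN_PANIC;

    /* Guest touched a page it was already told is broken: treat as malicious. */
    if (ev->already_reported)
        return ACTION_KILL_GUEST;

    /* First guest access to a guest-owned broken page: let the guest handle it. */
    return ACTION_INJECT_VMCE;
}
```

With this shape, an error on the Xen-owned part of the GDT hit from guest context would fall into the first branch, since ownership is checked before context.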
> Furthermore, data prefetching is possible too - how would a problem
> there get reported?
>
It may be reported as an unknown error, or not at all, but not as an SRAR data
load error w/ EIPV=1.
Thanks,
Jinsong
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel