WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler

To: Jan Beulich <JBeulich@xxxxxxxx>, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler
From: "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
Date: Tue, 11 Oct 2011 19:58:43 +0800
Accept-language: en-US
Cc: "keir.xen@xxxxxxxxx" <keir.xen@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 11 Oct 2011 05:02:00 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4E9432CF020000780005AADF@xxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <BC00F5384FCFC9499AF06F92E8B78A9E263B557B77@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E84ADF70200007800058882@xxxxxxxxxxxxxxxxxxxx> <789F9655DD1B8F43B48D77C5D306597312D2366A9B@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E858AF30200007800058A64@xxxxxxxxxxxxxxxxxxxx> <789F9655DD1B8F43B48D77C5D306597312D23D3B7D@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E92C369020000780005A68B@xxxxxxxxxxxxxxxxxxxx> <BC00F5384FCFC9499AF06F92E8B78A9E2693B8F9FB@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E941DA6020000780005AA9B@xxxxxxxxxxxxxxxxxxxx> <BC00F5384FCFC9499AF06F92E8B78A9E269B10E0CA@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E9432CF020000780005AADF@xxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcyH/mDw13VJnyA7TqaEOs2Jqz/4GAAAwoKw
Thread-topic: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler

Jan Beulich wrote:
>>>> On 11.10.11 at 11:51, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>> Jan Beulich wrote:
>>> If the prefetch was from Xen space (only in guest context),
>>> delivering a vMCE to the guest is pointless (and perhaps confusing
>>> to the guest). 
>>> 
>> 
>> Yes, exactly. How about delaying the handling, as follows:
>> * in the MCE ISR:
>>      if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs) )
>>          xen panic;
>> * in the MCE softirq:
>>      if ( (srar error) && (EIPV == 0) &&
>>           (broken page owned by hypervisor) )
>>          xen panic;
> 
> Possible, but I'm not convinced.
> 
>>>>   * guest may kill app, kernel thread, guest itself, or whatever;
>>>> 
>>>> The error is still an error, with two possibilities in the future:
>>>>   1. It may never be consumed as an SRAR error: the system keeps
>>>>      going, and a hardware mechanism (e.g. memory scrubbing) may
>>>>      detect it as an SRAO error at some later point, when it is
>>>>      then handled;
>>>>   2. It may be consumed at some later point, triggering an SRAR
>>>>      error again. At that time, either (a) the SRAR occurred in
>>>>      hypervisor context, and Xen panics; or (b) the SRAR occurred
>>>>      in guest context, and Xen kills the guest as a malicious one
>>>>      (as the 2nd patch does) and moves the page to the broken
>>>>      page list;
>>>> 
>>>> Considering the rare possibility of the above case, I think it's
>>>> acceptable to handle it in this way. Thoughts?
>>> 
>>> You're only discussing instruction fetches (which can be discarded),
>>> but you're not covering the other example I gave (GDT access from
>>> guest context - just like this is a ring-0 operation from the
>>> paging unit's pov, this ought to be an out-of-context operation
>>> from MCE's perspective).
>> 
>> That would be a data load error (EIPV=1), a synchronous error.
> 
> If indeed implemented that way in hardware, that would make the
> handling ambiguous: A GDT access must not (unconditionally) be
> attributed to the (pv) guest, as it is not a problem the guest can
> (necessarily) deal with (considering the split page ownership of
> what constitutes the GDT under Xen, the guest should only be
> accountable for the non-reserved part of the GDT, the rest should
> be attributed back to Xen).
> 
> The same would go for (perhaps speculative) page table walks.
> 

This doesn't seem ambiguous to me: whoever owns the page takes the error.
If the error is caused by the hypervisor accessing a broken page, Xen panics;
if the error is caused by a guest access, the guest handles it (I guess it
would normally kill itself);
if the guest maliciously accesses the page again, the hypervisor kills it.
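
The rule above could be sketched roughly as follows. This is only an illustration of the proposed policy, not code from the patch: the enum names, `srar_decide()`, and its three inputs are hypothetical, and treating "broken page owned by Xen" the same as "access from hypervisor context" is an assumption drawn from the softirq check proposed earlier in the thread.

```c
#include <stdbool.h>

/* Hypothetical types for illustration; not Xen's actual definitions. */
enum page_owner { OWNER_XEN, OWNER_GUEST };
enum srar_action { SRAR_XEN_PANIC, SRAR_INJECT_VMCE, SRAR_KILL_GUEST };

/*
 * "Who owns, who takes":
 *  - the error is attributed to Xen if it hit in hypervisor context or
 *    the broken page belongs to Xen (assumption): Xen panics;
 *  - a first error on a guest-owned page is passed to the guest as a vMCE;
 *  - a repeated access to a page already marked broken gets the guest killed.
 */
static enum srar_action srar_decide(bool guest_mode,
                                    enum page_owner owner,
                                    bool page_already_broken)
{
    if (!guest_mode || owner == OWNER_XEN)
        return SRAR_XEN_PANIC;

    if (page_already_broken)
        return SRAR_KILL_GUEST;

    return SRAR_INJECT_VMCE;
}
```

How the GDT and page-table-walk cases map onto `owner` is exactly the point under discussion, so the sketch leaves that classification to the caller.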

> Furthermore, data prefetching is possible too - how would a problem
> there get reported?
> 

It may be reported as an unknown error, or not reported at all, but not as an
SRAR data load error with EIPV=1.
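
To make "SRAR data load error with EIPV=1" concrete, here is a minimal sketch of how such an event might be recognised from the status registers. The bit positions and the 0x134 data-load code are as I recall them from the Intel SDM machine-check chapter and should be checked against the manual; the helper names `is_srar()` and `is_srar_data_load()` are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCG_STATUS bits (per Intel SDM; verify before relying on them). */
#define MCG_STATUS_RIPV  (1ULL << 0)
#define MCG_STATUS_EIPV  (1ULL << 1)

/* IA32_MCi_STATUS bits (per Intel SDM; verify before relying on them). */
#define MCi_STATUS_VAL   (1ULL << 63)
#define MCi_STATUS_UC    (1ULL << 61)
#define MCi_STATUS_EN    (1ULL << 60)
#define MCi_STATUS_PCC   (1ULL << 57)
#define MCi_STATUS_S     (1ULL << 56)
#define MCi_STATUS_AR    (1ULL << 55)

#define MCACOD_MASK            0xffffULL
#define MCACOD_SRAR_DATA_LOAD  0x134ULL

/* SRAR: valid, uncorrected, enabled, signalled, action required, no PCC. */
static bool is_srar(uint64_t status)
{
    const uint64_t must_set = MCi_STATUS_VAL | MCi_STATUS_UC | MCi_STATUS_EN |
                              MCi_STATUS_S | MCi_STATUS_AR;

    return (status & (must_set | MCi_STATUS_PCC)) == must_set;
}

/* Synchronous data load error: SRAR, data-load MCACOD, and EIPV set. */
static bool is_srar_data_load(uint64_t gstatus, uint64_t status)
{
    return is_srar(status) &&
           (status & MCACOD_MASK) == MCACOD_SRAR_DATA_LOAD &&
           (gstatus & MCG_STATUS_EIPV) != 0;
}
```

Under this sketch, an error raised by a data prefetch would simply fail the `is_srar_data_load()` test and fall through to whatever generic path handles other errors.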

Thanks,
Jinsong
