WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler

>>> On 08.10.11 at 10:29, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx] 
>> Sent: Friday, September 30, 2011 3:25 PM
>> To: Liu, Jinsong; Jiang, Yunhong
>> Cc: keir.xen@xxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx 
>> Subject: RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler
>> 
>> >>> On 30.09.11 at 04:51, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
> wrote:
>> 
>> >
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx] 
>> >> This made me look at the current source, and there I see in
>> >> mce_urgent_action()
>> >>
>> >>     if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs))
>> >>         return -1;
>> >>
>> >> which I think should say ... _EIPV and use || instead. Thoughts?
>> >
>> > I think this code means, if the error happens in hypervisor mode (i.e.
>> > !guest_mode()), and RIPV indicate the RIP in stack can't be restarted,
> we
>> > have to panic.
>> 
>> Then the guest_mode() check still lacks an extra check of EIPV, like
>> 
>>      if ( !(gstatus & MCG_STATUS_RIPV) &&
>>           (!(gstatus & MCG_STATUS_EIPV) || !guest_mode(regs)))
>>          return -1;
>> 
> 
> The RIPV is not related to the EIPV. RIPV means the context saved in the
> stack can't be restarted anymore. According to the SDM, RIPV means
> "execution can be restarted reliably at the instruction pointed to by the
> instruction pointer pushed on the stack". It's not about error happened
> synchronously or asynchronously. The point is, if the program is running in
> hypervisor context, and RIPV tells us that the program can't be restarted,
> we can't do anything but panic, because we can't switch context while we are
> in xen. So this code have nothing to do with EIPV. 

I continue to disagree (including the statement in your other response):
RIPV only tells us whether we can resume, not in which context the error
occurred. EIPV tells us whether, by looking at the saved registers, we
can determine the context that the error occurred in. Since with !RIPV
we have to determine in what context the error occurred in order to
decide whether to panic or just kill a guest, we can't ignore EIPV (and if
it's not set we have to assume the worst case, since even if the
registers indicate guest mode the error may have occurred in
hypervisor context or accessing hypervisor structures [consider e.g. a
data load error during a GDT access]).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel