[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler



>>> On 08.10.11 at 10:29, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx] 
>> Sent: Friday, September 30, 2011 3:25 PM
>> To: Liu, Jinsong; Jiang, Yunhong
>> Cc: keir.xen@xxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx 
>> Subject: RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler
>> 
>> >>> On 30.09.11 at 04:51, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
> wrote:
>> 
>> >
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx] 
>> >> This made me look at the current source, and there I see in
>> >> mce_urgent_action()
>> >>
>> >>     if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs))
>> >>         return -1;
>> >>
>> >> which I think should say ... _EIPV and use || instead. Thoughts?
>> >
>> > I think this code means, if the error happens in hypervisor mode (i.e.
>> > !guest_mode()), and RIPV indicate the RIP in stack can't be restarted,
> we
>> > have to panic.
>> 
>> Then the guest_mode() check still lacks an extra check of EIPV, like
>> 
>>      if ( !(gstatus & MCG_STATUS_RIPV) &&
>>           (!(gstatus & MCG_STATUS_EIPV) || !guest_mode(regs)))
>>          return -1;
>> 
> 
> The RIPV is not related to the EIPV. RIPV means the context saved in the
> stack can't be restarted anymore. According to the SDM, RIPV means
> "execution can be restarted reliably at the instruction pointed to by the
> instruction pointer pushed on the stack". It's not about error happened
> synchronously or asynchronously. The point is, if the program is running in
> hypervisor context, and RIPV tells us that the program can't be restarted,
> we can't do anything but panic, because we can't switch context while we are
> in xen. So this code have nothing to do with EIPV. 

I continue to disagree (including the statement in your other response):
RIPV only tells us whether we can resume, not in which context the error
occurred. EIPV tells us whether, by looking at the saved registers, we
can determine the context that the error occurred in. Since with !RIPV
we have to determine in what context the error occurred in order to
decide whether to panic or just kill a guest, we can't ignore EIPV (and if
it's not set we have to assume the worst case, since even if the
registers indicate guest mode the error may have occurred in
hypervisor context or accessing hypervisor structures [consider e.g. a
data load error during a GDT access]).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.