Jan Beulich wrote:
>>>> On 30.09.11 at 04:51, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
>>>> wrote:
>
>>
>>> -----Original Message-----
>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>>> Sent: Thursday, September 29, 2011 11:42 PM
>>> To: Liu, Jinsong
>>> Cc: Keir Fraser; Jiang, Yunhong; xen-devel@xxxxxxxxxxxxxxxxxxx
>>> Subject: Re: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler
>>>
>>>>>> On 29.09.11 at 17:20, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
>>>>>> wrote:
>>>> @@ -782,8 +821,12 @@ static void intel_default_mce_uhandler(
>>>>
>>>> switch (type)
>>>> {
>>>> - /* Panic if no handler for SRAR error */
>>>> case intel_mce_ucr_srar:
>>>> + if ( !guest_mode(regs) )
>>>> + *result = MCER_RESET;
>>>> + else
>>>> + *result = MCER_CONTINUE;
>>>> + break;
>>>> case intel_mce_fatal:
>>>> *result = MCER_RESET;
>>>> break;
>>>
>>> Using the stack based registers for any decision in an MCE handler
>>> seems bogus to me - without knowing that the error occurred
>>> synchronously, the result is meaningless. Unfortunately I wasn't
>>
>> I think the usage of stack in MCE handler should be case by case.
>> For example, it's ok to use it at data load instruction since the
>> EIPV is valid for it.
>
> According to the table in the manual, this is only the case on the
> local thread.
>
OK, we can remove srar check at mce isr 'uhandler'.
so at mce isr, the check
if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs))
return -1;
is a total insurance for the error which triggered at hypervisor while cannot
restart from RIP.
>> For the instruction load, not that sure and I will check it
>> internally.
>>
>> But agree that we should not do this depends on the error type (like
>> SRAR), but should depends on the specific error code.
>>
>>> able to spot - throughout your patch - what SRAR actually stands
>>> for, and the manual is no help in that respect either. It does
>>> state, however, that EIPV in three of four cases would be clear for
>>> these, so using the registers on stack is likely wrong here.
>>>
>>> This made me look at the current source, and there I see in
>>> mce_urgent_action()
>>>
>>> if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs))
>>> return -1;
>>>
>>> which I think should say ... _EIPV and use || instead. Thoughts?
>>
>> I think this code means, if the error happens in hypervisor mode
>> (i.e. !guest_mode()), and RIPV indicate the RIP in stack can't be
>> restarted, we have to panic.
>
> Then the guest_mode() check still lacks an extra check of EIPV, like
>
> if ( !(gstatus & MCG_STATUS_RIPV) &&
> (!(gstatus & MCG_STATUS_EIPV) || !guest_mode(regs)))
> return -1;
>
That would be overkilled.
Considering instruction fetch error occur at guest context, hypervisor deliver
to guest to handle the error is perfer, not panic all system.
Thanks,
Jinsong
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|