
Re: [Xen-devel] Xen mce bugfix


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
  • Date: Wed, 27 Feb 2013 10:37:42 +0000
  • Accept-language: en-US
  • Cc: "Ren, Yongjie" <yongjie.ren@xxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxx>
  • Delivery-date: Wed, 27 Feb 2013 10:38:49 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: AQHOFNGLqZTiEd/FTjyR1sPJXJqZr5iNfaZQ
  • Thread-topic: Xen mce bugfix

Jan Beulich wrote:
>>>> On 27.02.13 at 10:24, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>> This works around an issue seen when testing with the xen-mceinj tool.
>> 
>> When a simulated error is injected via xen-mceinj, the status
>> ADDRV/MISCV bits are simulated as well, so there is a potential
>> risk of #GP if the h/w does not really support MCi_ADDR/MISC.
>> We temporarily work around this by not clearing them until we
>> have a clean solution.
> 
> Excuse me, but - no. Changing the behavior for real MCE-s (which
> you added) for the benefit of fixing injection is a no-go IMO. Or
> are you telling us that after all that earlier change of yours is not
> really necessary (in which case we could as well revert it).
> 
> Jan
> 

The earlier patch clears MCi_ADDR/MISC because the Intel SDM recommends it:
                LOG MCA REGISTER:
                SAVE IA32_MCi_STATUS;
                If MISCV in IA32_MCi_STATUS
                THEN
                        SAVE IA32_MCi_MISC;
                FI;
                IF ADDRV in IA32_MCi_STATUS
                THEN
                        SAVE IA32_MCi_ADDR;
                FI;
                IF CLEAR_MC_BANK = TRUE
                THEN
                        SET all 0 to IA32_MCi_STATUS;
                        If MISCV in IA32_MCi_STATUS
                        THEN
                                SET all 0 to IA32_MCi_MISC;
                        FI;
                        IF ADDRV in IA32_MCi_STATUS
                        THEN
                                SET all 0 to IA32_MCi_ADDR;
                        FI;
                FI;

For the Xen MCE code, reading MCi_ADDR/MISC is only meaningful when a real
error has occurred (as indicated by MCi_STATUS), so clearing only MCi_STATUS in
the MCE handler is an acceptable workaround. After all, reading MCi_ADDR/MISC
is pointless if MCi_STATUS is 0.
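
To make the difference between the two clearing policies concrete, here is a
minimal standalone sketch. It is a simplified model and not the Xen code:
struct mc_bank, the model_wrmsr_addr()/model_wrmsr_misc() helpers, the
"faulted" flag standing in for a #GP, and enum clear_policy are all
hypothetical names introduced for illustration. Only the MISCV/ADDRV bit
positions (59 and 58 in IA32_MCi_STATUS) are taken from the SDM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MISCV and ADDRV bit positions in IA32_MCi_STATUS, per the Intel SDM. */
#define MCi_STATUS_MISCV (1ULL << 59)
#define MCi_STATUS_ADDRV (1ULL << 58)

/* Hypothetical in-memory model of one MCA bank (not Xen's data structure). */
struct mc_bank {
    uint64_t status, addr, misc;
    bool has_addr_msr;  /* does the modelled h/w implement MCi_ADDR? */
    bool has_misc_msr;  /* does the modelled h/w implement MCi_MISC? */
    bool faulted;       /* set when a write hits a missing MSR (models #GP) */
};

/* Hypothetical policies: full SDM-style clearing vs. the workaround. */
enum clear_policy { CLEAR_SDM, CLEAR_STATUS_ONLY };

/* Model of a write to MCi_ADDR: a missing MSR is recorded as a fault. */
static void model_wrmsr_addr(struct mc_bank *b, uint64_t val)
{
    if (!b->has_addr_msr) {
        b->faulted = true;
        return;
    }
    b->addr = val;
}

/* Model of a write to MCi_MISC: a missing MSR is recorded as a fault. */
static void model_wrmsr_misc(struct mc_bank *b, uint64_t val)
{
    if (!b->has_misc_msr) {
        b->faulted = true;
        return;
    }
    b->misc = val;
}

/* Clear a bank according to the chosen policy. */
static void clear_bank(struct mc_bank *b, enum clear_policy policy)
{
    uint64_t status;

    assert(b != NULL);
    status = b->status;  /* decide from the value read before clearing */

    if (policy == CLEAR_SDM) {
        /* Trusts ADDRV/MISCV, which an injected status may have faked. */
        if (status & MCi_STATUS_ADDRV)
            model_wrmsr_addr(b, 0);
        if (status & MCi_STATUS_MISCV)
            model_wrmsr_misc(b, 0);
    }

    /* Both policies always clear MCi_STATUS itself. */
    b->status = 0;
}
```

In this model, calling clear_bank() with CLEAR_SDM on a bank whose status has
ADDRV set but whose has_addr_msr is false sets the faulted flag, which is the
injection case described in the patch. With CLEAR_STATUS_ONLY the ADDR/MISC
registers are never written, so no fault can occur, at the cost of leaving
stale ADDR/MISC contents behind.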

Thanks,
Jinsong

>> Reported-by: Ren Yongjie <yongjie.ren@xxxxxxxxx>
>> Signed-off-by: Liu Jinsong <jinsong.liu@xxxxxxxxx>
>> 
>> diff -r e84a79d11d7a xen/arch/x86/cpu/mcheck/mce.c
>> --- a/xen/arch/x86/cpu/mcheck/mce.c  Thu Nov 01 01:41:03 2012 +0800
>> +++ b/xen/arch/x86/cpu/mcheck/mce.c  Thu Feb 28 00:34:22 2013 +0800
>> @@ -144,10 +144,19 @@ 
>> 
>>      status = mca_rdmsr(MSR_IA32_MCx_STATUS(banknum));
>> 
>> +/*
>> + * TODO: when inject simulated error via xen-mceinj tools,
>> + * status ADDRV/MISCV bits are simulated hence there is
>> + * potential risk of #GP if h/w not really support MCi_ADDR/MISC.
>> + * We temporary work around by not clean them until we have
>> + * clean solution.
>> + */
>> +#if 0
>>      if (status & MCi_STATUS_ADDRV)
>>          mca_wrmsr(MSR_IA32_MCx_ADDR(banknum), 0x0ULL);
>>      if (status & MCi_STATUS_MISCV)
>>          mca_wrmsr(MSR_IA32_MCx_MISC(banknum), 0x0ULL);
>> +#endif
>> 
>>      mca_wrmsr(MSR_IA32_MCx_STATUS(banknum), 0x0ULL);
>>  }


