[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/2] Xen/MCA: bugfix for mca bank clear


  • To: Jan Beulich <JBeulich@xxxxxxxx>, "Ren, Yongjie" <yongjie.ren@xxxxxxxxx>
  • From: "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
  • Date: Tue, 12 Mar 2013 02:06:49 +0000
  • Accept-language: en-US
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxx>
  • Delivery-date: Tue, 12 Mar 2013 02:07:32 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: AQHOHkWbo3AgTDHj00CkpRCMA8skS5ihTt1Q
  • Thread-topic: [Xen-devel] [PATCH 1/2] Xen/MCA: bugfix for mca bank clear

Jan Beulich wrote:
>>>> On 11.03.13 at 11:26, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>> Jan Beulich wrote:
>>>>>> On 28.02.13 at 14:21, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
>>>>>> wrote: 
>>>> Jan Beulich wrote:
>>>>>>>> On 28.02.13 at 09:05, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>>>>>> And then I don't see why you would do physical MSR accesses in
>>>>>> the injection case at all. It should really be mca_wrmsr() to
>>>>>> take care of this, it just needs to have a way to know it got
>>>>>> called in the context of an injection. One fundamental question
>>>>>> here is 
>>>>>> why the invalidation of the interposed data is being done
>>>>>> before the MSR write rather than after. One thing we surely
>>>>>> don't care much about in the injection logic is interference with
>>>>>> a real #MC.
>>>>> 
>>>>> Like this (RFC only, untested so far), based on having gone
>>>>> through all call sites of mca_wrmsr():
>>>>> 
>>>>> --- a/xen/arch/x86/cpu/mcheck/mce.c
>>>>> +++ b/xen/arch/x86/cpu/mcheck/mce.c
>>>>> @@ -1065,13 +1065,15 @@ static void intpose_add(unsigned int cpu
>>>>>      printk("intpose_add: interpose array full - request
>>>>> dropped\n");  } 
>>>>> 
>>>>> -void intpose_inval(unsigned int cpu_nr, uint64_t msr)
>>>>> +bool_t intpose_inval(unsigned int cpu_nr, uint64_t msr)  { -   
>>>>> struct intpose_ent *ent; +    struct intpose_ent *ent =
>>>>> intpose_lookup(cpu_nr, msr, NULL); 
>>>>> 
>>>>> -    if ((ent = intpose_lookup(cpu_nr, msr, NULL)) != NULL) {
>>>>> -        ent->cpu_nr = -1;
>>>>> -    }
>>>>> +    if ( !ent )
>>>>> +        return 0;
>>>>> +
>>>>> +    ent->cpu_nr = -1;
>>>>> +    return 1;
>>>>>  }
>>>>> 
>>>>>  #define IS_MCA_BANKREG(r) \
>>>>> --- a/xen/arch/x86/cpu/mcheck/mce.h
>>>>> +++ b/xen/arch/x86/cpu/mcheck/mce.h
>>>>> @@ -77,7 +77,7 @@ extern void mce_recoverable_register(mce
>>>>>  /* Read an MSR, checking for an interposed value first */
>>>>>  extern struct intpose_ent *intpose_lookup(unsigned int, uint64_t,
>>>>> uint64_t *); -extern void intpose_inval(unsigned int, uint64_t);
>>>>> +extern bool_t intpose_inval(unsigned int, uint64_t);
>>>>> 
>>>>>  static inline uint64_t mca_rdmsr(unsigned int msr)  {
>>>>> @@ -89,9 +89,9 @@ static inline uint64_t mca_rdmsr(unsigne
>>>>> 
>>>>>  /* Write an MSR, invalidating any interposed value */
>>>>>  #define mca_wrmsr(msr, val) do { \
>>>>> -       intpose_inval(smp_processor_id(), msr); \
>>>>> -       wrmsrl(msr, val); \
>>>>> -} while (0)
>>>>> +    if ( !intpose_inval(smp_processor_id(), msr) ) \ +
>>>>> wrmsrl(msr, val); \ +} while ( 0 )
>>>>> 
>>>>> 
>>>>>  /* Utility function to "logout" all architectural MCA telemetry
>>>>> from the MCA
>>>> 
>>>> No, it doesn't work, considering mce broadcast to all cpus, while
>>>> s/w only simulate 1 cpu.
>>> 
>>> I don't see how that case would be handled any better with the
>>> invalidation happening before the MSR write (as is the case now).
>>> 
>>> Please explain, Jan
>> 
>> Sorry for late reply.
>> 
>> What I meant is, mca_wrmsr updated in this way is not quite clean,
>> since per syntax it still has the risk 'sometimes access simulated
>> value, sometimes access physical bank'. 
>> 
>> But of course your patch works for current xen-mceinj tools, so if
>> you think it's OK we will have a test at the specific machine and
>> ACK later. 
> 
> As said before, I don't think we need to consider interference
> between injection and real events (or if we do, then I think there
> are other aspects that would need fixing too).
> 
> Hence, as long as what I proposed works for the injection case,
> that ought to be better than what we have currently. Perhaps
> explicitly filtering out the not supported case of broadcast being
> requested with the rejection (iirc there's a way to request this)
> would need to be added to the change suggested earlier.
> 
> Jan

Rongjie,

Would you please test Jan's patch at *the* WSM-EX machine? It should be fine 
but test is necessary and better.

Thanks,
Jinsong
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.