[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/2] Xen/MCA: bugfix for mca bank clear


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
  • Date: Mon, 11 Mar 2013 10:26:00 +0000
  • Accept-language: en-US
  • Cc: "Ren, Yongjie" <yongjie.ren@xxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxx>
  • Delivery-date: Mon, 11 Mar 2013 10:26:42 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: AQHOFZ6Ko3AgTDHj00CkpRCMA8skS5iPQT4w//99GICAEZlFsA==
  • Thread-topic: [Xen-devel] [PATCH 1/2] Xen/MCA: bugfix for mca bank clear

Jan Beulich wrote:
>>>> On 28.02.13 at 14:21, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>> Jan Beulich wrote:
>>>>>> On 28.02.13 at 09:05, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>>>> And then I don't see why you would do physical MSR accesses in
>>>> the injection case at all. It should really be mca_wrmsr() to take
>>>> care of this, it just needs to have a way to know it got called in
>>>> the context of an injection. One fundamental question here is
>>>> why the invalidation of the interposed data is being done
>>>> before the MSR write rather than after. One thing we surely
>>>> don't care much about in the injection logic is interference with
>>>> a real #MC.
>>> 
>>> Like this (RFC only, untested so far), based on having gone through
>>> all call sites of mca_wrmsr():
>>> 
>>> --- a/xen/arch/x86/cpu/mcheck/mce.c
>>> +++ b/xen/arch/x86/cpu/mcheck/mce.c
>>> @@ -1065,13 +1065,15 @@ static void intpose_add(unsigned int cpu
>>>      printk("intpose_add: interpose array full - request
>>> dropped\n");  } 
>>> 
>>> -void intpose_inval(unsigned int cpu_nr, uint64_t msr)
>>> +bool_t intpose_inval(unsigned int cpu_nr, uint64_t msr)  {
>>> -    struct intpose_ent *ent;
>>> +    struct intpose_ent *ent = intpose_lookup(cpu_nr, msr, NULL);
>>> 
>>> -    if ((ent = intpose_lookup(cpu_nr, msr, NULL)) != NULL) {
>>> -        ent->cpu_nr = -1;
>>> -    }
>>> +    if ( !ent )
>>> +        return 0;
>>> +
>>> +    ent->cpu_nr = -1;
>>> +    return 1;
>>>  }
>>> 
>>>  #define IS_MCA_BANKREG(r) \
>>> --- a/xen/arch/x86/cpu/mcheck/mce.h
>>> +++ b/xen/arch/x86/cpu/mcheck/mce.h
>>> @@ -77,7 +77,7 @@ extern void mce_recoverable_register(mce
>>>  /* Read an MSR, checking for an interposed value first */
>>>  extern struct intpose_ent *intpose_lookup(unsigned int, uint64_t, 
>>> uint64_t *); -extern void intpose_inval(unsigned int, uint64_t);
>>> +extern bool_t intpose_inval(unsigned int, uint64_t);
>>> 
>>>  static inline uint64_t mca_rdmsr(unsigned int msr)  {
>>> @@ -89,9 +89,9 @@ static inline uint64_t mca_rdmsr(unsigne
>>> 
>>>  /* Write an MSR, invalidating any interposed value */
>>>  #define mca_wrmsr(msr, val) do { \
>>> -       intpose_inval(smp_processor_id(), msr); \
>>> -       wrmsrl(msr, val); \
>>> -} while (0)
>>> +    if ( !intpose_inval(smp_processor_id(), msr) ) \ +       
>>> wrmsrl(msr, val); \ +} while ( 0 )
>>> 
>>> 
>>>  /* Utility function to "logout" all architectural MCA telemetry
>>> from the MCA
>> 
>> No, it doesn't work, considering mce broadcast to all cpus, while
>> s/w only simulate 1 cpu.
> 
> I don't see how that case would be handled any better with the
> invalidation happening before the MSR write (as is the case now).
> 
> Please explain, Jan

Sorry for late reply.

What I meant is, mca_wrmsr updated in this way is not quite clean, since per 
syntax it still has the risk 'sometimes access simulated value, sometimes 
access physical bank'.

But of course your patch works for current xen-mceinj tools, so if you think 
it's OK we will have a test at the specific machine and ACK later.

Thanks,
Jinsong
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.