[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] RE: [PATCH] Fix cache flush bug of cpu offline



Keir Fraser wrote:
> On 11/03/2011 17:53, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
> 
>> Keir Fraser wrote:
>>> On 11/03/2011 16:50, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>>> 
>>>> we did experiment, if did wbinvd at current position (at
>>>> play_dead), sometimes it bring strange issue when repeatly cpu
>>>> offline/online. so for cpu dead, the near wbinvd to last step, the
>>>> safer. 
>>>> wbinvd would better be the last ops before cpu dead, to avoid
>>>> potential cache coherency break.
>>> 
>>> Okay, I applied your patches. However in a follow-up patch (c/s
>>> 23025) I have removed the WBINVD instructions from the default paths
>>> (i.e., the HLT loops) as the CPU still does cache coherency while
>>> in HLT/C1 state. 
>>> 
>>> Does that look okay to you?
>>> 
>>>  -- Keir
>> 
>> It's right that cpu continue snoop bus transaction to keep cache
>> coherency when HLT/C1. 
>> 
>> However, I think it better not remove wbinvd from HLT loops.
>> I'm not quite sure if 'enter Cx' == 'cpu play dead'. After all,
>> resume from Cx different from cpu booting: when cpu online, it start
>> from SIPI-SIPI. 
>> I worried some unknown hardware behavior will bring issue if we
>> treat 'cpu play dead' totally same as 'enter Cx'.
>> Maybe that's the reason why kernel insist on reserveing wbinvd
>> before HLT when play dead.
> 
> An APIC INIT doesn't touch the memory hierarchy, cache controls, nor
> CR0.CD and CR0.NW. The care over cache invalidation I believe solely
> comes down to the deep sleep that the CPU can be placed into via
> MWAIT, in which the cache-control logic gets switched off.
> 
>  -- Keir

Reasonable ot me, thanks!

Jinsong

> 
>> Thanks,
>> Jinsong
>> 
>>> 
>>>> In fact, it can do wbinvd inside loop, but as cpu_offline_3.patch
>>>> said, at Xen 7400 when hyperthreading, the offlined thread may be
>>>> spuriously waken up by its brother, and frequently waken inside
>>>> the dead loop. In such case, considering heavy workload of wbinvd,
>>>> we add a light-weight clflush instruction inside loop.
>>>> 
>>>> Thanks,
>>>> Jinsong
>>>> 
>>>> 
>>>>> 
>>>>>> Signed-off-by: Liu, Jinsong <jinsong.liu@xxxxxxxxx>
>>>>>> 
>>>>>> diff -r 2dc3c1cc1bba xen/arch/x86/acpi/cpu_idle.c
>>>>>> --- a/xen/arch/x86/acpi/cpu_idle.c Mon Mar 07 05:31:46 2022 +0800
>>>>>> +++ b/xen/arch/x86/acpi/cpu_idle.c Thu Mar 10 23:40:51 2022 +0800
>>>>>> @@ -562,11 +562,14 @@ static void acpi_dead_idle(void)
>>>>>>      if ( (cx = &power->states[power->count-1]) == NULL )
>>>>>> goto default_halt;
>>>>>> 
>>>>>> +    /*
>>>>>> +     * cache must be flashed as the last ops before cpu going
>>>>>> into dead, +     * otherwise, cpu may dead with dirty data
>>>>>> breaking cache coherency, +     * leading to strange errors. +  
>>>>>> */ +    wbinvd();
>>>>>>      for ( ; ; )
>>>>>>      {
>>>>>> -        if ( !power->flags.bm_check && cx->type ==
>>>>>> ACPI_STATE_C3 ) 
>>>>>> -            ACPI_FLUSH_CPU_CACHE();
>>>>>> -
>>>>>>          switch ( cx->entry_method )
>>>>>>          {
>>>>>>              case ACPI_CSTATE_EM_FFH:
>>>>>> @@ -584,6 +587,7 @@ static void acpi_dead_idle(void)      }
>>>>>> 
>>>>>>  default_halt:
>>>>>> +    wbinvd();
>>>>>>      for ( ; ; )
>>>>>>          halt();
>>>>>>  }
>>>>>> diff -r 2dc3c1cc1bba xen/arch/x86/domain.c
>>>>>> --- a/xen/arch/x86/domain.c Mon Mar 07 05:31:46 2022 +0800
>>>>>> +++ b/xen/arch/x86/domain.c Thu Mar 10 23:40:51 2022 +0800
>>>>>> @@ -93,6 +93,12 @@ static void default_idle(void)
>>>>>> 
>>>>>>  static void default_dead_idle(void)
>>>>>>  {
>>>>>> +    /*
>>>>>> +     * cache must be flashed as the last ops before cpu going
>>>>>> into dead, +     * otherwise, cpu may dead with dirty data
>>>>>> breaking cache coherency, +     * leading to strange errors. +  
>>>>>> */ +    wbinvd();
>>>>>>      for ( ; ; )
>>>>>>          halt();
>>>>>>  }
>>>>>> @@ -100,7 +106,6 @@ static void play_dead(void)
>>>>>>  static void play_dead(void)
>>>>>>  {
>>>>>>      local_irq_disable();
>>>>>> -    wbinvd();
>>>>>> 
>>>>>>      /*
>>>>>>       * NOTE: After cpu_exit_clear, per-cpu variables are no
>>>>>> longer accessible,


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.