
Re: [Xen-devel] [PATCH 6/9] x86/amd: Allocate resources to cope with LS_CFG being per-core on Fam17h



>>> On 06.12.18 at 19:46, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 06/12/2018 08:54, Jan Beulich wrote:
>>>>> On 05.12.18 at 18:05, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 05/12/2018 16:57, Jan Beulich wrote:
>>>>>>> On 03.12.18 at 17:18, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>> --- a/xen/arch/x86/cpu/amd.c
>>>>> +++ b/xen/arch/x86/cpu/amd.c
>>>>> @@ -419,6 +419,97 @@ static void __init noinline amd_probe_legacy_ssbd(void)
>>>>>  }
>>>>>  
>>>>>  /*
>>>>> + * This is all a gross hack, but Xen really doesn't have flexible-enough
>>>>> + * per-cpu infrastructure to do it properly.  For Zen(v1) with SMT active,
>>>>> + * MSR_AMD64_LS_CFG is per-core rather than per-thread, so we need a per-core
>>>>> + * spinlock to synchronise updates of the MSR.
>>>>> + *
>>>>> + * We can't use per-cpu state because taking one CPU offline would free state
>>>>> + * under the feet of another.  Ideally, we'd allocate memory on the AP boot
>>>>> + * path, but by the time the sibling information is calculated sufficiently
>>>>> + * for us to locate the per-core state, it's too late to fail the AP boot.
>>>>> + *
>>>>> + * We also can't afford to end up in a heterogeneous scenario with some CPUs
>>>>> + * unable to safely use LS_CFG.
>>>>> + *
>>>>> + * Therefore, we have to allocate for the worst-case scenario, which is
>>>>> + * believed to be 4 sockets.  Any allocation failure causes us to turn LS_CFG
>>>>> + * off, as this is fractionally better than failing to boot.
>>>>> + */
>>>>> +static struct ssbd_ls_cfg {
>>>>> + spinlock_t lock;
>>>>> + unsigned int disable_count;
>>>>> +} *ssbd_ls_cfg[4];
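
For context, the per-core state above would presumably be consumed along
these lines when a thread wants speculative store bypass disabled or
re-enabled - a minimal sketch only; the function name and the exact MSR
bit handling are my own illustration, not the patch's code:

static void core_set_ssbd(struct ssbd_ls_cfg *state, uint64_t ssbd_bit,
			  bool enable)
{
	uint64_t val;
	bool update;

	spin_lock(&state->lock);

	/* Only the first enable / last disable on the core touches the MSR. */
	update = enable ? state->disable_count++ == 0
			: --state->disable_count == 0;

	if (update) {
		rdmsrl(MSR_AMD64_LS_CFG, val);
		if (enable)
			val |= ssbd_bit;
		else
			val &= ~ssbd_bit;
		wrmsrl(MSR_AMD64_LS_CFG, val);
	}

	spin_unlock(&state->lock);
}

i.e. the spinlock serialises the sibling threads of a core, and
disable_count tracks how many of them currently want the bit set.
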
>>>> Same question as to Brian for his original code: Instead of the
>>>> hard-coding of 4, can't you use nr_sockets here?
>>>> smp_prepare_cpus() runs before pre-SMP initcalls after all.
>>> nr_sockets has zero connection with reality as far as I can tell.
>>>
>>> On this particular box it reports 6 when the correct answer is 2.  I've
>>> got some Intel boxes where nr_sockets reports 15 and the correct answer
>>> is 4.
>> If you look back at when it was introduced, the main goal was
>> for it to never be too low. Any improvements to its calculation
>> are welcome, provided they maintain that guarantee. Too high
>> a socket count is imo still better than a hard-coded one.
> 
> Even for the extra 2k of memory it will waste?

2k sounds quite a lot here, but I guess the number depends on
core count. Question is whether you really need to allocate
space for all sockets up front. It would seem to me that instead
you could use a demand allocation scheme where a new
allocation is done only for the first core on a socket. If the
allocation needs to happen before the point where the
(socket,core,thread) tuple is known for the new AP, an
approach like psr_cpu_prepare()'s could be used (allocating a
new structure "just in case" when the previous one was
consumed).
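
Something along these lines is what I have in mind - a rough sketch only,
where ssbd_ls_cfg mirrors the patch's structure but MAX_SOCKETS,
cores_per_socket, and the helper names are made up for illustration:

static struct ssbd_ls_cfg *ssbd_ls_cfg[MAX_SOCKETS]; /* per-socket arrays */
static struct ssbd_ls_cfg *ls_cfg_spare;             /* "just in case" allocation */

/* CPU_UP_PREPARE: make sure a spare per-socket array exists, or fail early. */
static int ssbd_cpu_prepare(void)
{
	unsigned int i;

	if (ls_cfg_spare)
		return 0;

	ls_cfg_spare = xzalloc_array(struct ssbd_ls_cfg, cores_per_socket);
	if (!ls_cfg_spare)
		return -ENOMEM;

	for (i = 0; i < cores_per_socket; i++)
		spin_lock_init(&ls_cfg_spare[i].lock);

	return 0;
}

/* Once the new CPU's socket is known: consume the spare for a new socket. */
static void ssbd_cpu_starting(unsigned int socket)
{
	if (!ssbd_ls_cfg[socket]) {
		ssbd_ls_cfg[socket] = ls_cfg_spare;
		ls_cfg_spare = NULL;
	}
}

The spare only gets replenished the next time ssbd_cpu_prepare() runs, so at
most one per-socket array sits around unused, instead of reserving space for
a worst-case socket count up front.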

Jan




 

