[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 10/10] xen/arm: Enable errata for secondary CPU on hotplug after the boot



Hi Julien,

On Mon, Apr 30, 2018 at 6:09 PM, Julien Grall <julien.grall@xxxxxxx> wrote:
> Hi Mirela,
>
>
> On 27/04/18 18:12, Mirela Simonovic wrote:
>>
>> On boot, enabling errata workarounds will be triggered by the boot CPU
>> from start_xen(). On CPU hotplug (non-boot scenario) this would not be
>> done. This patch adds the code required to enable errata workarounds
>> for a CPU being hotplugged after the system boots. This is triggered
>> using a notifier. If the CPU fails to enable the errata Xen will panic.
>> This is done because it is assumed that the CPU which is hotplugged
>> after the system/Xen boots, was initially hotplugged during the
>> system/Xen boot. Therefore, enabling errata workarounds should never
>> fail.
>>
>> Signed-off-by: Mirela Simonovic <mirela.simonovic@xxxxxxxxxx>
>>
>> ---
>> CC: Stefano Stabellini <sstabellini@xxxxxxxxxx>
>> CC: Julien Grall <julien.grall@xxxxxxx>
>> ---
>>   xen/arch/arm/cpuerrata.c         | 35
>> +++++++++++++++++++++++++++++++++++
>>   xen/arch/arm/cpufeature.c        | 23 +++++++++++++++++++++++
>>   xen/include/asm-arm/cpufeature.h |  1 +
>>   3 files changed, 59 insertions(+)
>>
>> diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
>> index 1baa20654b..4040f781ec 100644
>> --- a/xen/arch/arm/cpuerrata.c
>> +++ b/xen/arch/arm/cpuerrata.c
>> @@ -5,6 +5,8 @@
>>   #include <xen/spinlock.h>
>>   #include <xen/vmap.h>
>>   #include <xen/warning.h>
>> +#include <xen/notifier.h>
>> +#include <xen/cpu.h>
>>   #include <asm/cpufeature.h>
>>   #include <asm/cpuerrata.h>
>>   #include <asm/psci.h>
>> @@ -349,6 +351,39 @@ void __init enable_errata_workarounds(void)
>>       enable_cpu_capabilities(arm_errata);
>>   }
>>   +static int cpu_errata_callback(
>> +    struct notifier_block *nfb, unsigned long action, void *hcpu)
>> +{
>> +    switch ( action )
>> +    {
>> +    case CPU_STARTING:
>> +        enable_nonboot_cpu_caps(arm_errata);
>> +        break;
>> +    default:
>> +        break;
>> +    }
>> +
>> +    return NOTIFY_DONE;
>> +}
>> +
>> +static struct notifier_block cpu_errata_nfb = {
>> +    .notifier_call = cpu_errata_callback,
>> +};
>> +
>> +static int __init cpu_errata_notifier_init(void)
>> +{
>> +    register_cpu_notifier(&cpu_errata_nfb);
>> +    return 0;
>> +}
>> +/*
>> + * Initialization has to be done at init rather than presmp_init phase
>> because
>> + * the callback should execute only after the secondary CPUs are
>> initially
>> + * booted (in hotplug scenarios when the system state is not boot). On
>> boot,
>> + * the enabling of errata workarounds will be triggered by the boot CPU
>> from
>> + * start_xen().
>> + */
>> +__initcall(cpu_errata_notifier_init);
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
>> index 525b45e22f..dd30f0d29c 100644
>> --- a/xen/arch/arm/cpufeature.c
>> +++ b/xen/arch/arm/cpufeature.c
>> @@ -68,6 +68,29 @@ void __init enable_cpu_capabilities(const struct
>> arm_cpu_capabilities *caps)
>>       }
>>   }
>>   +/* Run through the enabled capabilities and enable() them on the
>> calling CPU */
>> +void enable_nonboot_cpu_caps(const struct arm_cpu_capabilities *caps)
>> +{
>> +    ASSERT(system_state != SYS_STATE_boot);
>> +
>> +    for ( ; caps->matches; caps++ )
>> +    {
>> +        if ( !cpus_have_cap(caps->capability) )
>> +            continue;
>> +
>> +        if ( caps->enable )
>> +        {
>> +            /*
>> +             * Since the CPU has enabled errata workarounds on boot, it
>> should
>
>
> This function is not really about errata, it is about capabilities. Errata
> is just a sub-category of them.
>

I've fixed the comment, thanks.

>> +             * never fail to enable them here.
>> +             */
>> +            if ( caps->enable((void *)caps) )
>> +                panic("CPU%u failed to enable capability %u\n",
>> +                      smp_processor_id(), caps->capability);
>
>
> We should really avoid to use panic(...) if this is something the system can
> survive. In that specific case, it would only affect the current CPU. So it
> would be better to return an error and let the caller decide what to do.
>

I need to emphasize two points:
1) I don't see how is this different compared to PSCI CPU OFF where we
do panic. Essentially, in both cases the system will not be able to
use that CPU and we already agreed that is a good reason to panic.
As oppose to CPU_OFF which wasn't called on boot so we indeed have no
idea whether it will pass on suspend, no matter how unlikely it could
fail, in this scenario we are sure that enabling capability should
pass because it already passed on boot. So if it doesn't pass, which I
consider to be impossible, I believe we should panic.
On the other hand, I understand how would this make a difference on
big.LITTLE where you try to hotplug a CPU that was never booted.
However, that scenario is out of this scope.

2) I still wanted to give a chance to your proposal and just convert
panic into stop_cpu+printing error. The system cannot survive if
enabling a capability fails. In order to test this I added a
capability that will always fail after the boot. This is not realistic
in my opinion, but I used it only for testing to check whether the
system will survive. Instead of panic I printed an error and stopped
the CPU. However, Xen crashed. The boot CPU properly concluded that
the erroneous CPU will never become online, but later on credit
scheduler's assertion fails. I believe this is something that a person
who adds big.LITTLE support should deal with.

Do we have an agreement to keep panic?

Thanks,
Mirela

> Cheers,
>
>> +        }
>> +    }
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/xen/include/asm-arm/cpufeature.h
>> b/xen/include/asm-arm/cpufeature.h
>> index e557a095af..b14e226401 100644
>> --- a/xen/include/asm-arm/cpufeature.h
>> +++ b/xen/include/asm-arm/cpufeature.h
>> @@ -88,6 +88,7 @@ void update_cpu_capabilities(const struct
>> arm_cpu_capabilities *caps,
>>                                const char *info);
>>     void enable_cpu_capabilities(const struct arm_cpu_capabilities *caps);
>> +void enable_nonboot_cpu_caps(const struct arm_cpu_capabilities *caps);
>>     #endif /* __ASSEMBLY__ */
>>
>
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.