[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen: Don't call panic if ARM TF cpu off returns DENIED





чт, 16 июн. 2022 г. в 22:09, Julien Grall <julien@xxxxxxx>:


On 16/06/2022 19:40, Volodymyr Babchuk wrote:
>
> Hi Julien,

Hi Volodymyr,

>
> Julien Grall <julien@xxxxxxx> writes:
>
>> Hi,
>>
>> On 16/06/2022 14:55, dmitry.semenets@xxxxxxxxx wrote:
>>> From: Dmytro Semenets <dmytro_semenets@xxxxxxxx>
>>> According to PSCI specification ARM TF can return DENIED on CPU OFF.
>>
>> I am confused. The spec is talking about Trusted OS and not
>> firmware. The docummentation is also not specific to ARM Trusted
>> Firmware. So did you mean "Trusted OS"?
>
> It should be "firmware", I believe.

Hmmm... I couldn't find a reference in the spec suggesting that CPU_OFF
could return DENIED because of the firmware. Do you have a pointer to
the spec?
Actually CPU_OFF performed by Trusted OS. But Trusted OS is called by the ARM TF.
ARM TF for our platform doesn't call Trusted OS for CPU0 and returns DENIED instead.

>
>>
>> Also, did you reproduce on HW? If so, on which CPU this will fail?
>>
>
> Yes, we reproduced this on HW. In our case it failed on CPU0. To be
> fair, in our case it had nothing to do with Trusted OS. It is just
> platform limitation - it can't turn off CPU0. But from Xen perspective
> there is no difference - CPU_OFF call returns DENIED.

Thanks for the clarification. I think I have seen that in the wild also
but it never got on top of my queue. It is good that we are fixing it.

>
>>> This patch brings the hypervisor into compliance with the PSCI
>>> specification.
>>
>> Now it means CPU off will never be turned off using PSCI. Instead, we
>> would end up to spin in Xen. This would be a problem because we could
>> save less power.
>
> Agreed.
>
>>
>>> Refer to "Arm Power State Coordination Interface (DEN0022D.b)"
>>> section 5.5.2
>>
>> Reading both 5.5.2 and 5.9.1 together, DENIED would be returned when
>> the trusted OS can only run on one core.
>>
>> Some of the trusted OS are migratable. So I think we should first
>> attempt to migrate the CPU. Then if it doesn't work, we should prevent
>> the CPU to go offline.
>>
>> That said, upstream doesn't support cpu offlining (I don't know for
>> your use case). In case of shutdown, it is not necessary to offline
>> the CPU, so we could avoid to call CPU_OFF on all CPUs but
>> one. Something like:
>>
>
> This is even better approach yes. But you mentioned CPU_OFF. Did you
> mean SYSTEM_RESET?

By CPU_OFF I was referring to the fact that Xen will issue the call all
CPUs but one. The remaining CPU will issue the command to reset/shutdown
the system.

>>   void machine_halt(void)
>> @@ -21,10 +23,6 @@ void machine_halt(void)
>>       smp_call_function(halt_this_cpu, NULL, 0);
>>       local_irq_disable();
>>
>> -    /* Wait at most another 10ms for all other CPUs to go offline. */
>> -    while ( (num_online_cpus() > 1) && (timeout-- > 0) )
>> -        mdelay(1);
>> -
>>       /* This is mainly for PSCI-0.2, which does not return if success. */
>>       call_psci_system_off();
>>
>>> Signed-off-by: Dmytro Semenets <dmytro_semenets@xxxxxxxx>
>>> Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx>
>>
>> I don't recall to see patch on the ML recently for this. So is this an
>> internal review?
>
> Yeah, sorry about that. Dmytro is a new member in our team and he is not
> yet familiar with differences in internal reviews and reviews in ML.

No worries. I usually classify internal review anything that was done
privately. This looks to be a public review, althought not on xen-devel.

I understand that not all some of the patches are still in PoC stage and
doing the review on your github is a good idea. But for those are meant
to be for upstream (e.g. bug fixes, small patches), I would suggest to
do the review on xen-devel directly
Sorry about that 

Cheers,

--
Julien Grall

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.