[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen: Don't call panic if ARM TF cpu off returns DENIED


  • To: Julien Grall <julien@xxxxxxx>
  • From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Date: Fri, 17 Jun 2022 09:10:28 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=sBt4BpruxuDW+3U+YaEMd/LSKPRuAhGP8rBevWZMe/U=; b=fTHsrHFfpj1NzJ8PNAswgFl+phgYpvy1T7K+fJPyWsfk/xJrEnCg4Bx4aMJUz7Hg5uHYdwumZWoEB89iPJybGz0hND7NHuSq/qMoSHuNBfRRWkoLMEdVaeZHGUo+smqxS/rNVhlhuAjXLJc4jyRJjvHSInO9sv0uQMbvgtDvtWocyJ4eNYejIX2AeAwsZrVB40MfDtYLzzdxrMFdpOObYcSZVUWTO69NbimZVxfIx9OiAwX2q3BsZIQX8JQuFKMxEbAHPsOjoO2J1enG/xeptj2Jzeyvh8YjFB0jvo9WapoROgiX+JyohXNJOeAp4Q4L4ys5SaqzsCC0B/kbM5Exvw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=G2unOd8An8tfZCjZ9O/2omr1tO6a1LH/68pOoUeVepD2i7cBTSAuKByKf+HVdJq9Nt5Rr8eqfjBwe0u4Elv4R+m9PSgRc2hltP4uQHvk2Pete3gtN+lIUPs7hPQIRVti/J9Ss6XCmxias4Xx4c5moJdExTFVeAw0gvhRI55crdMJ+HpIZjS1Io7usOriP5Uf0/S+QmvafOsSwdA9pjOpR8PJebzke9rj01d6k6KItT7WUBpfGk5XMMjglbFjYkaBsreS2AUF33k7nM2coEoJBICThUfhrt7UKyQSQLCctg0pmTKfCb3IsvWlCi3tI51+CFDTzbez42tylOhBCNF5HA==
  • Cc: "dmitry.semenets@xxxxxxxxx" <dmitry.semenets@xxxxxxxxx>, Dmytro Semenets <Dmytro_Semenets@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 17 Jun 2022 09:10:59 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHYgYkEncCZtJfZsEuchGWkwMivea1SI+UAgAA3C4CAAAs5gIAA5jKA
  • Thread-topic: [PATCH] xen: Don't call panic if ARM TF cpu off returns DENIED

Hi Julien,


Julien Grall <julien@xxxxxxx> writes:

> On 16/06/2022 19:40, Volodymyr Babchuk wrote:
>> Hi Julien,
>
> Hi Volodymyr,
>
>> Julien Grall <julien@xxxxxxx> writes:
>> 
>>> Hi,
>>>
>>> On 16/06/2022 14:55, dmitry.semenets@xxxxxxxxx wrote:
>>>> From: Dmytro Semenets <dmytro_semenets@xxxxxxxx>
>>>> According to PSCI specification ARM TF can return DENIED on CPU OFF.
>>>
>>> I am confused. The spec is talking about Trusted OS and not
>>> firmware. The docummentation is also not specific to ARM Trusted
>>> Firmware. So did you mean "Trusted OS"?
>> It should be "firmware", I believe.
>
> Hmmm... I couldn't find a reference in the spec suggesting that
> CPU_OFF could return DENIED because of the firmware. Do you have a
> pointer to the spec?

Ah, looks like we are talking about different things. Indeed, CPU_OFF
can return DENIED only because of Trusted OS. But entity that *returns*
the error to a caller is a firmware.

>> 
>>>
>>> Also, did you reproduce on HW? If so, on which CPU this will fail?
>>>
>> Yes, we reproduced this on HW. In our case it failed on CPU0. To be
>> fair, in our case it had nothing to do with Trusted OS. It is just
>> platform limitation - it can't turn off CPU0. But from Xen perspective
>> there is no difference - CPU_OFF call returns DENIED.
>
> Thanks for the clarification. I think I have seen that in the wild
> also but it never got on top of my queue. It is good that we are
> fixing it.
>
>> 
>>>> This patch brings the hypervisor into compliance with the PSCI
>>>> specification.
>>>
>>> Now it means CPU off will never be turned off using PSCI. Instead, we
>>> would end up to spin in Xen. This would be a problem because we could
>>> save less power.
>> Agreed.
>> 
>>>
>>>> Refer to "Arm Power State Coordination Interface (DEN0022D.b)"
>>>> section 5.5.2
>>>
>>> Reading both 5.5.2 and 5.9.1 together, DENIED would be returned when
>>> the trusted OS can only run on one core.
>>>
>>> Some of the trusted OS are migratable. So I think we should first
>>> attempt to migrate the CPU. Then if it doesn't work, we should prevent
>>> the CPU to go offline.
>>>
>>> That said, upstream doesn't support cpu offlining (I don't know for
>>> your use case). In case of shutdown, it is not necessary to offline
>>> the CPU, so we could avoid to call CPU_OFF on all CPUs but
>>> one. Something like:
>>>
>> This is even better approach yes. But you mentioned CPU_OFF. Did you
>> mean SYSTEM_RESET?
>
> By CPU_OFF I was referring to the fact that Xen will issue the call
> all CPUs but one. The remaining CPU will issue the command to
> reset/shutdown the system.
>

I just want to clarify: change that you suggested removes call to
stop_cpu() in halt_this_cpu(). So no CPU_OFF will be sent at all.

All CPUs except one will spin in

    while ( 1 )
        wfi();

while last cpu will issue SYSTEM_OFF or SYSTEM_RESET.

Is this correct?

>>>   void machine_halt(void)
>>> @@ -21,10 +23,6 @@ void machine_halt(void)
>>>       smp_call_function(halt_this_cpu, NULL, 0);
>>>       local_irq_disable();
>>>
>>> -    /* Wait at most another 10ms for all other CPUs to go offline. */
>>> -    while ( (num_online_cpus() > 1) && (timeout-- > 0) )
>>> -        mdelay(1);
>>> -
>>>       /* This is mainly for PSCI-0.2, which does not return if success. */
>>>       call_psci_system_off();
>>>
>>>> Signed-off-by: Dmytro Semenets <dmytro_semenets@xxxxxxxx>
>>>> Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx>
>>>
>>> I don't recall to see patch on the ML recently for this. So is this an
>>> internal review?
>> Yeah, sorry about that. Dmytro is a new member in our team and he is
>> not
>> yet familiar with differences in internal reviews and reviews in ML.
>
> No worries. I usually classify internal review anything that was done
> privately. This looks to be a public review, althought not on
> xen-devel.
>
> I understand that not all some of the patches are still in PoC stage
> and doing the review on your github is a good idea. But for those are
> meant to be for upstream (e.g. bug fixes, small patches), I would
> suggest to do the review on xen-devel directly.

It not always clear if patch is eligible for upstream. At first we
thought that problem is platform-specific and we weren't sure that we
will find a proper upstreamable fix. Probably you saw that PR's name
quite differs from final patch. This is because initial solution was
completely different from final one.

-- 
Volodymyr Babchuk at EPAM


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.