
Re: [PATCH] xen: Don't call panic if ARM TF cpu off returns DENIED



Hi,

On 21/06/2022 11:05, Dmytro Semenets wrote:
but machine_halt() doesn't work at all
... this should also be the case here because machine_halt() could also
be called from cpu0. So I am a bit confused why you are saying it never
works.
If machine_halt() is called on a CPU other than CPU0, it causes a panic and reboot.
If it is called on CPU0, it also causes a panic, but after system power off,
and reboot is not issued. In this state you can still use the Xen console, but
you can't reboot the system.

I am lost. In a previous e-mail you said that PSCI CPU_OFF would return
DENIED on CPU0. IOW, I understood that for other CPUs, it would succeed.
I'm sorry I confused you.
Yes it causes panic and prints it will be rebooted but actual reboot
doesn't happen.

Ok. That's most likely because of the call to smp_call_function() in machine_restart(). It uses cpu_online_map to decide which CPUs to send the IPI to.

This will block because some of the CPUs are already off. So they will never acknowledge the IPI.
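
The blocking described above can be illustrated with a small host-side model. This is only a sketch, not Xen code: the function name ipi_unacked() and the bitmask representation are hypothetical, and it assumes that a CPU which has already powered off is still set in the online map, which is what the explanation above suggests but does not confirm.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical host-side model (not Xen code). Bit n of each mask stands
 * for CPU n.
 *
 * online_map: CPUs the sender believes are online (the IPI targets).
 * powered:    CPUs that are really still running and able to acknowledge.
 *
 * Returns the set of targeted CPUs that will never acknowledge the IPI.
 * A sender that waits for every target blocks forever if this is non-zero.
 */
static uint32_t ipi_unacked(uint32_t online_map, uint32_t powered,
                            unsigned int sender)
{
    assert(sender < 32);

    /* The sender does not send an IPI to itself. */
    uint32_t targets = online_map & ~(UINT32_C(1) << sender);

    /* Only CPUs that are still powered can acknowledge. */
    return targets & ~powered;
}
```

In this model, a four-CPU system where CPUs 1 to 3 are already off but still marked online gives a non-zero result for a sender on CPU0, which corresponds to the restart path waiting forever.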



But here, you are telling me the opposite:

"If it called on a CPU0 it also caused panic but after system power off
   and reboot".

If machine_halt() is called from CPU0, then CPU_OFF should not be called
on CPU0. So where is that panic coming from?


To my understanding, transferring execution to CPU0 is a workaround, and
this approach will fix the machine_restart() function but will not fix
machine_halt().

I would say it is a more specific case of what the spec suggests (see
below). But it should fix both machine_restart() and machine_halt()
because the last CPU running will be CPU0. So Xen would call SYSTEM_*
rather than CPU_OFF. So I don't understand why you think it will fix one
but not the other.
Looks like this is specific to my HW case. SYSTEM_OFF doesn't stop
the whole system.

Hmmm... All the other CPUs should be off (or spinning with interrupts
disabled), so are you saying that SYSTEM_OFF returns?
Yes. SYSTEM_OFF returns on my HW.

Hmmm... This is not compliant with the specification. Could you check why PSCI SYSTEM_OFF is returning?

This is the reason why CPU_OFF is called for CPU0.

Right, machine_halt() will call halt_this_cpu() that in turn will call PSCI CPU_OFF.

If you modify halt_this_cpu() to avoid calling PSCI CPU_OFF (as I suggested before) then the panic() will never happen. Instead, the CPU will execute "wfi" in a loop with interrupts disabled.

To summarize, there are two parts to resolve:
  1) Understand why PSCI SYSTEM_OFF returns on your platform
  2) Modify stop_cpu() to not call PSCI CPU_OFF
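
The effect of the second point can be sketched with a small host-side model of the last CPU's halt path. This is not Xen code: struct firmware, halt_last_cpu() and the outcome names are hypothetical, and the model assumes that a CPU_OFF call which returns (for example with DENIED) leads to the panic discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the last running CPU ends up doing in this model. */
enum halt_outcome {
    HALT_SYSTEM_OFF,  /* SYSTEM_OFF did not return: the platform is off */
    HALT_CPU_OFF,     /* CPU_OFF succeeded: this CPU is powered down */
    HALT_PARKED_WFI,  /* CPU sits in a wfi loop with interrupts disabled */
    HALT_PANIC        /* CPU_OFF returned DENIED and the caller panicked */
};

/* Hypothetical description of the firmware's behaviour. */
struct firmware {
    bool system_off_returns;  /* true on the non-compliant platform */
    bool cpu_off_denied;      /* true if CPU_OFF answers DENIED */
};

/*
 * Hypothetical model (not Xen code) of the halt path on the last CPU.
 * call_cpu_off selects whether the fallback calls PSCI CPU_OFF or
 * simply parks the CPU.
 */
static enum halt_outcome halt_last_cpu(const struct firmware *fw,
                                       bool call_cpu_off)
{
    assert(fw != NULL);

    /* A compliant SYSTEM_OFF never returns, so nothing else runs. */
    if (!fw->system_off_returns)
        return HALT_SYSTEM_OFF;

    if (call_cpu_off) {
        /* CPU_OFF only returns on failure; here that means a panic. */
        if (fw->cpu_off_denied)
            return HALT_PANIC;
        return HALT_CPU_OFF;
    }

    /* Without the CPU_OFF call, the CPU is parked and cannot panic. */
    return HALT_PARKED_WFI;
}
```

With both firmware flags set, as reported for the platform in this thread, the model panics only when call_cpu_off is true. Why SYSTEM_OFF returns at all is the first point and is outside what this model can explain.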

Cheers,

--
Julien Grall



 

