Re: [PATCH v3 0/5] Implement CPU hotplug on Arm
Hi Mykyta,

Thank you for your answers.

On Mon, Oct 20, 2025 at 5:15 PM Mykyta Poturai <Mykyta_Poturai@xxxxxxxx> wrote:
>
> On 15.10.25 20:30, Mykola Kvach wrote:
> > Hi Mykyta,
> >
> > Thanks for the series.
> >
> > It seems there might be issues here -- please take a look and let me
> > know if my concerns are valid:
> >
> > 1. FF-A notification IRQ: after a CPU down->up cycle the IRQ
> > configuration may be lost.
>
> OPTEE and FFA are marked as unsupported.

Understood, thanks. Would it be worth documenting this?

>
> > 2. GICv3 LPIs: a CPU may fail to come back up unless its LPI pending
> > table exists (is allocated) on bring-up. See
> > gicv3_lpi_allocate_pendtable() and its call chain.
>
> ITS is marked as unsupported. I have a plan to deal with this, but it is
> out of scope of this series.

Thanks for the clarification. Should we document this somewhere?

>
> > 3. IRQ migration on CPU down: if an IRQ targets a CPU being offlined,
> > its affinity should be moved to an online CPU before completing the
> > offlining.
>
> All guest tied IRQ migration is handled by the scheduler. Regarding the
> irqs used by Xen, I didn't find any with affinity to other CPUs than CPU
> 0, which can't be disabled. I think theoretically it is possible for
> them to have different affinity, but it seems unlikely considering that
> x86 hotplug code also doesn't seem to do any Xen irq migration AFAIU.

What about arm_smmu_init_domain_context and its related call chains?
As far as I can see, some of these paths touch XEN_DOMCTL_* hypercalls,
and my understanding is they can be issued on any CPU.

Should we add a check that no enabled (e)SPIs owned by Xen are pinned
to the offlining CPU?

>
> > 4. Race between the new hypercalls and disable/enable_nonboot_cpus():
> > disable_nonboot_cpus is called, enable_nonboot_cpus() reads
> > frozen_cpus, and before it calls cpu_up() a hypercall onlines the CPU.
> > cpu_up() then fails as "already online", but the CPU_RESUME_FAILED
> > path may still run for an already-online CPU, risking use-after-free
> > of per-CPU state (e.g. via free_percpu_area()) and other issues
> > related to CPU_RESUME_FAILED notification.
> >
>
> There don't seem to be any calls to disable/enable_nonboot_cpus() on
> Arm. If we take x86 as an example, then they are called with all domains
> already paused, and I don't see how paused domains can issue hypercalls.

Agreed; this looks even less likely given that disable_* runs on CPU0
and your new hypercalls execute on CPU0. The only plausible issue would
be a contrived case where code disables non-boot CPUs from CPU0 but
enables them from another CPU woken by a hypercall. That seems
unrealistic.

>
> >
> > Best regards,
> > Mykola
>
> --
> Mykyta

Best regards,
Mykola
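For illustration only, the check suggested under point 3 (refuse to offline a CPU while an enabled, Xen-owned interrupt can be delivered nowhere else) might look roughly like the sketch below. It is a standalone toy model, not Xen code: `struct toy_irq`, `toy_cpu_has_pinned_xen_irq()` and the 64-bit affinity mask are all hypothetical stand-ins, and a real implementation would presumably walk Xen's own IRQ descriptors and cpumasks under the appropriate locks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified IRQ descriptor; NOT Xen's struct irq_desc. */
struct toy_irq {
    unsigned int irq;    /* interrupt number */
    bool enabled;        /* is the line currently enabled? */
    bool owned_by_xen;   /* true: hypervisor IRQ; false: routed to a guest */
    uint64_t affinity;   /* bit N set => may be delivered to CPU N */
};

/*
 * Return true if offlining `cpu` would leave at least one enabled,
 * Xen-owned IRQ without a target: the IRQ's affinity includes `cpu`
 * and no other CPU that stays online.
 */
static bool toy_cpu_has_pinned_xen_irq(const struct toy_irq *irqs, size_t n,
                                       unsigned int cpu, uint64_t online_mask)
{
    assert(cpu < 64); /* the toy mask only models up to 64 CPUs */

    const uint64_t cpu_bit = UINT64_C(1) << cpu;
    const uint64_t remaining = online_mask & ~cpu_bit;

    for (size_t i = 0; i < n; i++) {
        const struct toy_irq *d = &irqs[i];

        /* Guest IRQs and disabled lines are out of scope for this check. */
        if (!d->enabled || !d->owned_by_xen)
            continue;

        /* Pinned: targets the offlining CPU and no CPU that remains. */
        if ((d->affinity & cpu_bit) && !(d->affinity & remaining))
            return true;
    }

    return false;
}
```

A caller in the offlining path could use a check like this either to reject the request or to decide which IRQs need their affinity moved first; which of the two is appropriate is a policy question the sketch does not settle.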