
Re: [Xen-devel] [PATCH 0/7] xen/arm: CPU hotplug fixes



Hi Julien,

On Mon, Apr 16, 2018 at 1:33 PM, Julien Grall <julien.grall@xxxxxxx> wrote:
> Hi,
>
>
> On 13/04/18 11:19, Mirela Simonovic wrote:
>>
>> On Thu, Apr 12, 2018 at 10:43 AM, Julien Grall <julien.grall@xxxxxxx>
>> wrote:
>>>
>>>
>>>
>>> On 11/04/18 17:37, Mirela Simonovic wrote:
>>>>
>>>>
>>>> Hi Julien,
>>>
>>>
>>>
>>> Hi,
>>>
>>> May I ask you to configure your mail client to use > for quoting and use
>>> plain text? Otherwise, it is going to be really difficult to follow the
>>> discussion after a few rounds (see already below).
>>>
>>>> On Wed, Apr 11, 2018 at 6:02 PM, Julien Grall <julien.grall@xxxxxxx
>>>> <mailto:julien.grall@xxxxxxx>> wrote:
>>>>
>>>>      Hi,
>>>>
>>>>      On 11/04/18 16:58, Mirela Simonovic wrote:
>>>>
>>>>          On 04/11/2018 05:07 PM, Julien Grall wrote:
>>>>
>>>>              On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>
>>>>          Migrating interrupts when turning off a CPU already works.
>>>>          However, when a CPU is turned back on there is no interrupt
>>>>          migration back to the hotplugged CPU - all interrupts will
>>>>          remain routed to the CPU#0.
>>>>          Patch 7/7 fixes this
>>>>
>>>>
>>>>      What do you mean by all interrupts? Interrupts routed to guest will
>>>>      always follow the vCPU. So are you sure they are going to be
>>>>      migrated when that vCPU is paused/off?
>>>>
>>>>
>>>> Just to make sure we're on the same page - this is about hotplugging
>>>> physical CPUs. Hotplugging vCPUs using virtual PSCI CPU_OFF interface is
>>>> already implemented and unrelated to this series.
>>>
>>>
>>>
>>> Yes, we are on the same page :). I was just wondering what happens to
>>> interrupts routed to that pCPU.
>>>
>>>>
>>>> Assuming that system has 2 pCPUs by 'all interrupts' I mean interrupts
>>>> that were targeted to the pCPU#0 and pCPU#1 prior to doing any hotplug.
>>>>
>>>> For example, if a guest is pinned to pCPU#1 an interrupt of a device it
>>>> owns will be targeted to pCPU#1.
>>>> When pCPU#1 is turned off that interrupt will be migrated to pCPU#0.
>>>> pCPU#0 finalizes the suspend and receives wake-up interrupts. However,
>>>> when pCPU#1 is turned back on that interrupt will remain targeted to
>>>> pCPU#0, which I assumed is wrong.
>>>> The scenario described here is also how I tested this.
>>>>
>>>>      Can you give the path in Xen doing that?
>>>>
>>>>
>>>> Sure, here is a backtrace (dumped on the CPU being turned off):
>>>>       0  0x2603dc arch_move_irqs(): vgic.c, line 309
>>>>       1  0x22ee58 sched_move_irqs()+20: schedule.c, line 303
>>>>       2  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
>>>>       3  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
>>>>       4  0x25aff8 __cpu_disable()+96: smpboot.c, line 386
>>>>       5  0x201608 take_cpu_down()+52: cpu.c, line 75
>>>>       6  0x23426c stopmachine_action()+188: stop_machine.c, line 159
>>>>       7  0x235858 do_tasklet_work()+176: tasklet.c, line 94
>>>>       8  0x235c80 do_tasklet()+104: tasklet.c, line 126
>>>>       9  0x24daec idle_loop()+144: domain.c, line 72
>>>>      10  0x25b1f8 start_secondary()+404: smpboot.c, line 368
>>>
>>>
>>>
>>>
>>> So this covers interrupts routed to a virtual CPU. However, this does not
>>> handle interrupts used by Xen. How do you handle them?
>>>
>>> For instance, SMMU IRQs might be routed to a CPU other than CPU #0.
>>
>>
>> Interrupts used by Xen should not wake up the system and will be
>> disabled when we suspend the devices used by Xen.
>
> Here you only speak about the suspend use case. While I understand your
> ultimate goal is suspend/resume, this series is about CPU hotplug.
>

AFAIK, the only way to hotplug a CPU is via disable_nonboot_cpus() and
enable_nonboot_cpus(), which are called only within the Xen
suspend/resume procedure. We are implementing CPU hotplug only to
enable Xen suspend/resume. This is also how it is done for x86, and we
wanted to implement the equivalent behavior for ARM.
If the cover letter title is misleading, please let me know what a
more appropriate title would be.

However, I absolutely agree that the interrupt routing before and
after the hotplug has to be the same.
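
To make that requirement concrete, here is a toy model in C. It is not
Xen code and does not use any Xen API: toy_irq, toy_reset(),
toy_route_irq(), toy_cpu_down() and toy_cpu_up() are all hypothetical
names, and the 2-CPU/4-IRQ sizes are arbitrary. It only sketches the
invariant discussed here, assuming we remember where an interrupt was
routed before its CPU went down and move it back when that CPU returns.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2
#define NR_IRQS 4
#define NO_CPU  (-1)

/* Toy model only: none of these names exist in Xen. */
struct toy_irq {
    int target; /* pCPU the interrupt is currently routed to */
    int home;   /* pCPU it was moved away from, or NO_CPU */
};

static struct toy_irq irqs[NR_IRQS];
static bool cpu_online[NR_CPUS];

/* Start with every CPU online and every interrupt routed to pCPU#0. */
static void toy_reset(void)
{
    for (int c = 0; c < NR_CPUS; c++)
        cpu_online[c] = true;
    for (int i = 0; i < NR_IRQS; i++) {
        irqs[i].target = 0;
        irqs[i].home = NO_CPU;
    }
}

/* Route an interrupt explicitly and forget any saved origin. */
static void toy_route_irq(int irq, int cpu)
{
    assert(irq >= 0 && irq < NR_IRQS);
    assert(cpu >= 0 && cpu < NR_CPUS);
    irqs[irq].target = cpu;
    irqs[irq].home = NO_CPU;
}

/* CPU going down: move its interrupts to 'fallback', remembering the origin. */
static void toy_cpu_down(int cpu, int fallback)
{
    assert(cpu != fallback && cpu_online[fallback]);
    for (int i = 0; i < NR_IRQS; i++) {
        if (irqs[i].target == cpu) {
            irqs[i].home = cpu;
            irqs[i].target = fallback;
        }
    }
    cpu_online[cpu] = false;
}

/* CPU coming back: restore the interrupts that were moved away from it. */
static void toy_cpu_up(int cpu)
{
    cpu_online[cpu] = true;
    for (int i = 0; i < NR_IRQS; i++) {
        if (irqs[i].home == cpu) {
            irqs[i].target = cpu;
            irqs[i].home = NO_CPU;
        }
    }
}
```

In this model, routing IRQ 1 to pCPU#1, taking pCPU#1 down with pCPU#0
as fallback and then bringing it back up leaves IRQ 1 targeted at
pCPU#1 again, while interrupts that always belonged to pCPU#0 are
untouched.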

> IMHO, the suspend/resume case is no more than a superset of CPU up/down. If
> you solve the problem for up/down, likely you are going to solve it for
> suspend/resume.
>
> So, what would happen to interrupts routed to the CPU going offline?
>
>> However, I need to double check that such interrupts get enabled on
>> the right CPU on resume. Could you please tell me which mechanism in
>> Xen is used to target such an interrupt to a secondary CPU only? Is
>> that even possible and why would that be used?
>
>
> SPIs will be routed to the CPU calling setup_irq. It may not always be
> CPU#0. For instance, this is the case for the SMMU context interrupts,
> because they are set up when the device is assigned.
>
> I guess this decision is arguable. If you move all the interrupts to CPU#0
> it will potentially disrupt the vCPUs running on it. I am thinking of the
> case of an SMMU fault that could easily be triggered by another domain.
>

I'll come back to this; I need to do some research/debugging to better
understand what's going on.

Thanks,
Mirela

> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

