[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 10/10] xen/arm: Enable errata for secondary CPU on hotplug after the boot



On Thu, 2018-05-10 at 15:24 +0200, Mirela Simonovic wrote:
> On Thu, May 10, 2018 at 1:57 PM, Mirela Simonovic
> 
> > Please take a look at function cpu_schedule_callback in schedule.c.
> > Within switch, case CPU_DEAD doesn't have a break, causing the
> > bellow
> > CPU_UP_CANCELED to execute as well when the CPU goes down. This
> > looks
> > wrong to me.
> > Dario, could you please confirm that this is a bug? Otherwise could
> > you please confirm the reasoning beyond?
> > 
> 
> Dario sorry, this looked wrong in my scenario but actually it is
> correct. I understand the purpose of the missing break now.
> 
No problem.

> For the curious ones (if any) here is detailed description: errata
> notifier added in this patch had the same priority as scheduler
> notifier. I though priority doesn't matter, but I was wrong. In this
> particular scenario where a CPU fails to enable capabilities
> (triggered by errata notifier added in this patch), scheduler
> callback
> executed before the errata callback for CPU_STARTING event. 
>
So, you're adding your errata callback to the CPU_STARTING notifier,
right? (Sorry for having to ask, I don't have the patch handy right
now.)

> In other
> words, scheduler already called init_pdata before the errata callback
> fired (and stopped the CPU).
> Later on when errata callback fired, enabling of capabilities has
> failed, so the erroneous non-boot CPU stopped itself and never
> declared to be online.
> Then CPU#0 fired new notification with CPU_UP_CANCELED event in order
> to clean up for the job done on CPU_STARTING. However, this broke the
> assumption (which is good) made in cpu_schedule_callback. The
> assumption is that the sequence of steps should be
> alloc_pdata-->init_pdata-->deinit_pdata-->free_pdata. In this
> particular case deinit_pdata was not done because this would be done
> only upon CPU_DEAD event which makes no sense in this scenario.
> In order to avoid running into the invalid scenario described above,
> the errata callback should fire before the scheduler callback. If
> enabling capabilities fails, the scheduler callback for CPU_STARTING
> will never execute afterwards, so the following CPU_UP_CANCELED event
> triggered by the CPU#0 will do free_pdata, which is ok because
> init_pdata was not executed and alloc_pdata-->free_pdata flow is also
> valid. Congratulations to the reader who reached this point :)
> 
Ok, but the flow is, AFAICR, CPU_UP_PREPARE->CPU_STARTING->CPU_ONLINE.

If you add your callback to CPU_UP_PREPARE, instead than to
CPU_STARTING, SCHED_OP(init_pdata) wouldn't be called, without having
to fiddle with priorities.

Is there any reason why you can't do it that way? It would look more
natural to me, and it's definitely going to be easier debug and
maintain (e.g., look at how many callbacks CPU_UP_PREPARE has, as
compared to CPU_STARTING ;-P).

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.