[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2] xen: make sure stop_machine_run() is always called in a tasklet

On 28.02.20 10:15, Jan Beulich wrote:
On 28.02.2020 09:58, Jürgen Groß wrote:
On 28.02.20 09:27, Jan Beulich wrote:
On 28.02.2020 08:19, Juergen Gross wrote:
With core scheduling active it is mandatory for stop_machine_run() to
be called in idle context only (so either during boot or in a tasklet),
as otherwise a scheduling deadlock would occur: stop_machine_run()
does a cpu rendezvous by activating a tasklet on all other cpus. In
case stop_machine_run() was not called in an idle vcpu it would block
scheduling the idle vcpu on its siblings with core scheduling being
active, resulting in a hang.

Put a BUG_ON() into stop_machine_run() to test for being called in an
idle vcpu only and adapt the missing call site (ucode loading) to use a
tasklet for calling stop_machine_run().

Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
- rephrase commit message (Julien Grall)
   xen/arch/x86/microcode.c  | 54 
   xen/common/stop_machine.c |  1 +
   2 files changed, 35 insertions(+), 20 deletions(-)

There's no mention anywhere of a connection to your RCU series,
nor to rcu_barrier(). Yet the change puts a new restriction also
on its use, and iirc this was mentioned in prior discussion. Did
I miss anything?

Basically this patch makes the restriction explicit. Without the patch
stop_machine_run() being called outside of a tasklet would just hang
with core scheduling being active. Now it will catch those violations
early even with core scheduling inactive.

Note that currently there are no violations of this restriction anywhere
in the hypervisor other than the one addressed by this patch.

Well, there is a connection to core scheduling. Without it, i.e.
prior to 4.13, there was no restriction on the placement of
rcu_barrier() invocations. According to what you say above, the
restriction was implicitly introduced with core scheduling. It
should imo be made explicit by attaching a comment, which would
(again imo) best be done here because now you also make this
case crash without core scheduling in use.

Okay, I'll add a comment to stop_machine_run() and to rcu_barrier(). The
rcu_barrier() comment will be then removed by my rcu series again.


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.