
Re: [Xen-devel] [PATCH v2] xen: make sure stop_machine_run() is always called in a tasklet

On 28.02.2020 09:58, Jürgen Groß wrote:
> On 28.02.20 09:27, Jan Beulich wrote:
>> On 28.02.2020 08:19, Juergen Gross wrote:
>>> With core scheduling active it is mandatory for stop_machine_run() to
>>> be called in idle context only (so either during boot or in a tasklet),
>>> as otherwise a scheduling deadlock would occur: stop_machine_run()
>>> does a cpu rendezvous by activating a tasklet on all other cpus. In
>>> case stop_machine_run() was not called in an idle vcpu it would block
>>> scheduling the idle vcpu on its siblings with core scheduling being
>>> active, resulting in a hang.
>>> Put a BUG_ON() into stop_machine_run() to test for being called in an
>>> idle vcpu only and adapt the missing call site (ucode loading) to use a
>>> tasklet for calling stop_machine_run().
>>> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
>>> ---
>>> V2:
>>> - rephrase commit message (Julien Grall)
>>> ---
>>>   xen/arch/x86/microcode.c  | 54 +++++++++++++++++++++++++++++------------------
>>>   xen/common/stop_machine.c |  1 +
>>>   2 files changed, 35 insertions(+), 20 deletions(-)
>> There's no mention anywhere of a connection to your RCU series,
>> nor to rcu_barrier(). Yet the change puts a new restriction also
>> on its use, and iirc this was mentioned in prior discussion. Did
>> I miss anything?
> Basically this patch makes the restriction explicit. Without the patch
> stop_machine_run() being called outside of a tasklet would just hang
> with core scheduling being active. Now it will catch those violations
> early even with core scheduling inactive.
> Note that currently there are no violations of this restriction anywhere
> in the hypervisor other than the one addressed by this patch.

Well, there is a connection to core scheduling. Without it, i.e.
prior to 4.13, there was no restriction on the placement of
rcu_barrier() invocations. According to what you say above, the
restriction was implicitly introduced with core scheduling. It
should imo be made explicit by attaching a comment, which would
(again imo) best be done here because now you also make this
case crash without core scheduling in use.
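
As a rough illustration of what such a comment could look like, here is a
minimal, self-contained model in C. It is not the Xen code: struct model_vcpu,
model_stop_machine_run(), model_increment() and MODEL_ERR_NOT_IDLE are
hypothetical names invented for this sketch, and the model returns an error
where the patch described above would hit a BUG_ON() and crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vcpu; only the idle flag matters here. */
struct model_vcpu {
    bool is_idle;
};

/* Hypothetical error code; the patch under discussion uses BUG_ON() instead. */
#define MODEL_ERR_NOT_IDLE (-1)

/*
 * Model of stop_machine_run() with the restriction spelled out.
 *
 * Must be called in idle context only, i.e. during boot or from a tasklet.
 * With core scheduling active, a caller running in a non-idle vcpu would
 * block scheduling of the idle vcpu on its sibling cpus, so the rendezvous
 * tasklets would never run and the system would hang. Per the discussion,
 * the same restriction applies to rcu_barrier().
 */
int model_stop_machine_run(const struct model_vcpu *curr,
                           int (*fn)(void *), void *data)
{
    assert(curr != NULL && fn != NULL);

    /* Reject non-idle callers up front, whether or not core scheduling is on. */
    if (!curr->is_idle)
        return MODEL_ERR_NOT_IDLE;

    return fn(data);
}

/* Example callback: increments the int that data points to. */
int model_increment(void *data)
{
    ++*(int *)data;
    return 0;
}
```

Calling model_stop_machine_run() with a vcpu whose is_idle flag is set runs
the callback; with the flag clear, the call is rejected before the callback
runs, which mirrors catching violations early even with core scheduling
inactive.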

