[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 1/2] xen: fix a (latent) cpupool-related race during domain destroy



On 14/07/16 17:18, Dario Faggioli wrote:
> So, during domain destruction, we do:
>  cpupool_rm_domain()    [ in domain_destroy() ]
>  sched_destroy_domain() [ in complete_domain_destroy() ]
>
> Therefore, there's a window during which, from the
> scheduler's point of view, a domain stilsts outside
> of any cpupool.
>
> In fact, cpupool_rm_domain() does d->cpupool=NULL,
> and we don't allow that to hold true, for anything
> but the idle domain (and there are, in fact, ASSERT()s
> and BUG_ON()s to that effect).
>
> Currently, we never really check d->cpupool during the
> window, but that does not mean the race is not there.
> For instance, Credit2 at some point (during load balancing)
> iterates on the list of domains, and if we add logic that
> needs checking d->cpupool, and any one of them had
> cpupool_rm_domain() called on itself already... Boom!
>
> (In fact, calling __vcpu_has_soft_affinity() from inside
> balance_load() makes `xl shutdown <domid>' reliably
> crash, and this is how I discovered this.)
>
> On the other hand, cpupool_rm_domain() "only" does
> cpupool related bookkeeping, and there's no harm
> postponing it a little bit.
>
> Also, considering that, during domain initialization,
> we do:
>  cpupool_add_domain()
>  sched_init_domain()
>
> It makes sense for the destruction path to look like
> the opposite of it, i.e.:
>  sched_destroy_domain()
>  cpupool_rm_domain()
>
> And hence that's what this patch does.
>
> Actually, for better robustness, what we really do is
> moving both cpupool_add_domain() and cpupool_rm_domain()
> inside sched_init_domain() and sched_destroy_domain(),
> respectively (and also add a couple of ASSERT()-s).
>
> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
> ---
> Cc: Juergen Gross <jgross@xxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>

Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

This definitely looks better, although I will leave the in-depth review
to people more familiar with the scheduler internals.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.