[Xen-devel] [PATCH v2 2/2] sched: fix race between sched_move_domain() and vcpu_wake()

From: David Vrabel <david.vrabel@xxxxxxxxxx>

sched_move_domain() changes v->processor for all the domain's VCPUs.
If another domain, softirq etc. triggers a simultaneous call to
vcpu_wake() (e.g., by setting an event channel as pending), then
vcpu_wake() may lock one schedule lock and try to unlock another.
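
For illustration only, the interleaving described above can be replayed in
a single thread. This is a minimal sketch, not Xen code: toy_lock, toy_vcpu,
lock_for() and replay_race() are hypothetical names, and the "lock" only
records whether it is held.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Toy stand-in for a per-CPU schedule lock; it only tracks "held". */
struct toy_lock { bool held; };

/* Toy stand-in for a VCPU; only the processor field matters here. */
struct toy_vcpu { int processor; };

static struct toy_lock schedule_lock[NR_CPUS];

/* Look the lock up through v->processor, as lock and unlock helpers would. */
static struct toy_lock *lock_for(const struct toy_vcpu *v)
{
    return &schedule_lock[v->processor];
}

/*
 * Replay the problematic interleaving sequentially.
 * Returns true if the unlock hit a different lock than the one taken,
 * leaving the originally taken lock held.
 */
static bool replay_race(void)
{
    struct toy_vcpu v = { .processor = 0 };
    struct toy_lock *locked, *unlocked;
    bool mismatch;

    schedule_lock[0].held = false;
    schedule_lock[1].held = false;

    /* Waker: takes the lock found through v.processor (CPU 0). */
    locked = lock_for(&v);
    locked->held = true;

    /* Mover: changes v.processor without holding that lock. */
    v.processor = 1;

    /* Waker: the unlock looks the lock up again and finds CPU 1's. */
    unlocked = lock_for(&v);
    mismatch = (unlocked != locked);
    unlocked->held = false;

    return mismatch && schedule_lock[0].held;
}
```

Calling replay_race() returns true in this model: CPU 1's lock is released
without ever having been taken, while CPU 0's lock stays held.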

vcpu_schedule_lock() attempts to handle this but only does so for the
window between reading the schedule_lock from the per-CPU data and the
spin_lock() call.  This does not help with sched_move_domain()
changing v->processor between the calls to vcpu_schedule_lock() and
vcpu_schedule_unlock().

Fix the race by taking the schedule_lock for v->processor in
sched_move_domain().

Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
Acked-by: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
Acked-by: Keir Fraser <keir@xxxxxxx>

Use vcpu_schedule_lock_irq() (which now returns the lock) to properly
retry the locking should the lock to be used have changed in the course
of acquiring it (issue pointed out by George Dunlap).
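
The retry-and-return-the-lock pattern can be sketched roughly as follows.
This is an assumption-laden model rather than the Xen implementation: the
toy_* names are hypothetical, a C11 atomic_flag stands in for spinlock_t,
and interrupt disabling is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 2

/* Hypothetical stand-ins for spinlock_t and struct vcpu. */
struct toy_spinlock { atomic_flag flag; };
struct toy_vcpu { _Atomic int processor; };

static struct toy_spinlock schedule_lock[NR_CPUS] = {
    { ATOMIC_FLAG_INIT }, { ATOMIC_FLAG_INIT }
};

static void toy_spin_lock(struct toy_spinlock *l)
{
    while ( atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire) )
        ; /* spin until the flag was previously clear */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

/*
 * Take the lock selected by v->processor, then re-check that
 * v->processor still selects the same lock. If it changed while we
 * were acquiring, drop the lock and retry. The lock actually held is
 * returned so the caller can release exactly that one.
 */
static struct toy_spinlock *toy_vcpu_schedule_lock(struct toy_vcpu *v)
{
    for ( ; ; )
    {
        struct toy_spinlock *lock = &schedule_lock[atomic_load(&v->processor)];

        toy_spin_lock(lock);
        if ( lock == &schedule_lock[atomic_load(&v->processor)] )
            return lock;
        toy_spin_unlock(lock);
    }
}

/* Mover side: change v->processor only while holding the current lock. */
static void toy_move_vcpu(struct toy_vcpu *v, int new_cpu)
{
    struct toy_spinlock *lock = toy_vcpu_schedule_lock(v);

    atomic_store(&v->processor, new_cpu);
    /* Release via the saved pointer; v->processor now names another lock. */
    toy_spin_unlock(lock);
}
```

Releasing through the returned pointer is the point of the design: once
v->processor has been updated, looking the lock up again would select the
new CPU's lock, which was never taken.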

Add a comment explaining the state after the v->processor adjustment.

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -276,6 +276,8 @@ int sched_move_domain(struct domain *d, 
     new_p = cpumask_first(c->cpu_valid);
     for_each_vcpu ( d, v )
+        spinlock_t *lock;
         vcpudata = v->sched_priv;
         migrate_timer(&v->periodic_timer, new_p);
@@ -283,7 +285,16 @@ int sched_move_domain(struct domain *d, 
         migrate_timer(&v->poll_timer, new_p);
+        lock = vcpu_schedule_lock_irq(v);
         v->processor = new_p;
+        /*
+         * With v->processor modified we must not
+         * - make any further changes assuming we hold the scheduler lock,
+         * - use vcpu_schedule_unlock_irq().
+         */
+        spin_unlock_irq(lock);
         v->sched_priv = vcpu_priv[v->vcpu_id];


