
Re: [Xen-devel] [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler



On 17/03/17 18:42, Dario Faggioli wrote:
> In cases where one is absolutely sure that there will be
> fewer vCPUs than pCPUs, having to pay the cost, mostly in
> terms of overhead, of an advanced scheduler may not be
> desirable.
> 
> The simple scheduler implemented here could be a solution.
> Here is how it works:
>  - each vCPU is statically assigned to a pCPU;
>  - if there are pCPUs without any vCPU assigned, they
>    stay idle (as in, they run their idle vCPU);
>  - if there are vCPUs which are not assigned to any
>    pCPU (e.g., because there are more vCPUs than pCPUs),
>    they *don't* run until they get assigned;
>  - if a vCPU assigned to a pCPU goes away, one of the
>    vCPUs waiting to be assigned, if any, gets assigned
>    to the pCPU and can run there.
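
For the record, here is my mental model of those rules, as a toy C
sketch -- every name in it is invented, and it is only a model of the
semantics described above, not the patch's code:

#include <stddef.h>

struct vcpu {
    int pcpu_id;               /* -1 => not assigned, doesn't run */
    struct vcpu *next;         /* waitqueue link                  */
};

struct pcpu {
    struct vcpu *assigned;     /* NULL => pCPU runs its idle vCPU */
};

static struct vcpu *waitq;     /* vCPUs with no pCPU to run on    */

/* New vCPU: grab a free pCPU if there is one, else park it. */
static void vcpu_assign(struct pcpu *pcpus, int nr_pcpus, struct vcpu *v)
{
    int i;

    for ( i = 0; i < nr_pcpus; i++ )
        if ( pcpus[i].assigned == NULL )
        {
            pcpus[i].assigned = v;
            v->pcpu_id = i;
            return;
        }

    /* More vCPUs than pCPUs: this one waits, unassigned. */
    v->pcpu_id = -1;
    v->next = waitq;
    waitq = v;
}

/* A vCPU goes away: its pCPU picks up a waiter, or goes idle. */
static void vcpu_deassign(struct pcpu *pcpus, struct vcpu *v)
{
    struct pcpu *p = &pcpus[v->pcpu_id];

    p->assigned = waitq;
    if ( waitq != NULL )
    {
        waitq->pcpu_id = (int)(p - pcpus);
        waitq = waitq->next;
    }
    v->pcpu_id = -1;
}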

Hmm -- I'm not sure about this 'waitqueue' thing.  If you have a
multi-vcpu VM and one vcpu hangs, what normally happens is that the rest
of the VM ends up wedging itself in an unpredictable way, and if there's
a watchdog timer or sanity check of any sort then it will hit a
bugcheck.  As implemented, any number of mundane operations may cause
such a situation if you have one less pcpu or one more vcpu than you
thought.  This seems like a fairly "sharp edge" to have in the interface.

Would it be possible instead to have domain assignment, vcpu-add /
remove, pcpu remove, &c just fail (perhaps with -ENOSPC and/or -EBUSY)
if we ever reach a situation where |vcpus| > |pcpus|?

Or, alternatively, fail as many operations *as possible* that would
bring us to that state, and use the `waitqueue` idea only as a backup
for situations where we can't really avoid it?
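
The kind of check I have in mind is something like the below -- purely
a sketch, with made-up names:

#include <errno.h>

/*
 * Hypothetical helper: refuse any operation (vcpu add, pcpu remove,
 * domain assignment, ...) that would leave more vCPUs than pCPUs in
 * the pool.
 */
static int null_check_capacity(unsigned int nr_vcpus, unsigned int nr_pcpus)
{
    return nr_vcpus > nr_pcpus ? -ENOSPC : 0;
}

E.g. call null_check_capacity(nr_vcpus + 1, nr_pcpus) before adding a
vCPU, and null_check_capacity(nr_vcpus, nr_pcpus - 1) before removing
a pCPU.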

Regarding the code, my brain doesn't seem to be at 100% this morning for
some reason, so just a couple of questions...

> +static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
> +{
> +    struct null_private *prv = null_priv(ops);
> +    struct null_vcpu *nvc = null_vcpu(v);
> +    unsigned int cpu;
> +    spinlock_t *lock;
> +
> +    ASSERT(!is_idle_vcpu(v));
> +
> + retry:
> +    lock = vcpu_schedule_lock_irq(v);
> +
> +    cpu = pick_cpu(prv, v);
> +
> +    /* We hold v->processor's runq lock, but we need cpu's one */
> +    if ( cpu != v->processor )
> +    {
> +        spin_unlock(lock);
> +        lock = pcpu_schedule_lock(cpu);

Don't we need to hold the lock for v->processor until we change
v->processor?  Otherwise someone might call vcpu_schedule_lock(v) at
this point and reasonably believe that it has the right to modify v.

Or does this not matter because we're just now calling insert (and so
nobody else is going to call vcpu_schedule_lock() on v)?
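
What I would have expected is something along these lines -- purely
illustrative, ignoring lock ordering and IRQ details; new_lock is just
a new local, the other names are the ones already used above -- so
that v->processor only stops being protected by its old lock once it
has actually been updated:

    lock = vcpu_schedule_lock_irq(v);
    cpu = pick_cpu(prv, v);

    if ( cpu != v->processor )
    {
        /* Take the new cpu's lock *before* dropping the old one... */
        spinlock_t *new_lock = pcpu_schedule_lock(cpu);

        /* ...so v->processor is updated while still under its own lock. */
        v->processor = cpu;

        spin_unlock(lock);
        lock = new_lock;
    }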

> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 223a120..b482037 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1785,6 +1785,8 @@ int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
>  
>   out:
>      per_cpu(cpupool, cpu) = c;
> +    /* Trigger a reschedule so the CPU can pick up some work ASAP. */
> +    cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);

Is this a more generic fix / improvement?

At first blush everything else looks good.

 -George

