
Re: [Xen-devel] [PATCH RFC v1 42/74] sched/null: skip vCPUs on the waitqueue that are blocked


First of all, my filters somehow failed to highlight this for me, so
sorry if I did not notice it earlier (and now, I need new filters
anyway, as the email I'm using is different :-D).

I'll have a look at the patch ASAP.

On Mon, 2018-01-08 at 11:12 +0000, George Dunlap wrote:
> On 01/08/2018 10:37 AM, Jan Beulich wrote:
> > I don't understand: Isn't the null scheduler not moving around
> > vCPU-s at all? At least that's what the comment at the top of the
> > file says, unless I'm mis-interpreting it. If so, how can "some CPU
> > (...) pick this vCPU"?
> There's no current way to prevent a user from adding more vcpus to a
> pool than there are pcpus (if nothing else, by creating a new VM in a
> given pool), or from taking pcpus from a pool in which #vcpus >=
> #pcpus.
Exactly. And something that checks for that is anything but easy to
introduce (let's just avoid even mentioning enforcing!).

> The null scheduler deals with this by having a queue of "unassigned"
> vcpus that are waiting for a free pcpu.  When a pcpu becomes available,
> it will do the assignment.  When a pcpu that has a vcpu is assigned is
> removed from the pool, that vcpu is assigned to a different pcpu if one
> is available; if not, it is put on the list.
Err... yes. BTW, either there are a couple of typos in the above
paragraph, or it's me that can't read it well. Anyway, just to be
clear, if we have 4 pCPUs, and 6 VMs, with 1 vCPU each, this might be
the situation:

CPU0 <-- d1v0
CPU1 <-- d2v0
CPU2 <-- d3v0
CPU3 <-- d4v0

Waitqueue: d5v0,d6v0

Then, if d2 leaves/dies/etc, leaving CPU1 idle, d5v0 is picked up from
the waitqueue and assigned to CPU1.
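
The scenario above can be sketched as a toy model. This is plain Python,
not Xen code, and every name in it (NullPool, add_vcpu, remove_vcpu) is
made up for illustration; it only assumes the behaviour described in this
thread: at most one vCPU per pCPU, with the rest parked on a waitqueue.

```python
from collections import deque


class NullPool:
    """Toy model of a null-scheduler pool: one vCPU per pCPU plus a waitqueue."""

    def __init__(self, pcpus):
        # Map each pCPU to the vCPU assigned to it (None means idle).
        self.assigned = {cpu: None for cpu in pcpus}
        # vCPUs that could not get a pCPU wait here, in arrival order.
        self.waitqueue = deque()

    def add_vcpu(self, vcpu):
        """Assign vcpu to the first idle pCPU, or park it on the waitqueue."""
        for cpu, owner in self.assigned.items():
            if owner is None:
                self.assigned[cpu] = vcpu
                return cpu
        self.waitqueue.append(vcpu)
        return None

    def remove_vcpu(self, vcpu):
        """Remove vcpu; if it held a pCPU, hand that pCPU to the next waiter."""
        if vcpu in self.waitqueue:
            self.waitqueue.remove(vcpu)
            return
        for cpu, owner in self.assigned.items():
            if owner == vcpu:
                nxt = self.waitqueue.popleft() if self.waitqueue else None
                self.assigned[cpu] = nxt
                return


# 4 pCPUs, 6 single-vCPU domains: d5v0 and d6v0 end up on the waitqueue.
pool = NullPool([0, 1, 2, 3])
for d in range(1, 7):
    pool.add_vcpu(f"d{d}v0")

# d2 goes away, so CPU1 becomes idle and picks up d5v0.
pool.remove_vcpu("d2v0")
```

After the last call, CPU1 holds d5v0 and only d6v0 is left waiting.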

> In the case of shim mode, this also seems to happen whenever curvcpus <
> maxvcpus: The L1 hypervisor (shim) only sees curvcpus cpus on which to
> schedule L2 vcpus, but the L2 guest has maxvcpus vcpus to schedule, of
> which (maxvcpus-curvcpus) are marked 'down'.
Mmm, wait. In the case of a domain which specifies both maxvcpus and
curvcpus, how many vCPUs does the domain in which the shim runs have?

> In this case, it also seems that the null scheduler sometimes schedules
> a "down" vcpu when there are "up" vcpus on the list; meaning that the
> "up" vcpus are never scheduled.
I'm not sure how an offline vCPU can end up there... but maybe I need
to look at the code better, with the shim use case in mind.
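
For reference, the kind of check being discussed (skipping vCPUs that
cannot run when picking from the waitqueue) might look roughly like the
sketch below. It is a toy model in Python, not the actual patch: the VCpu
class, its online flag and pick_from_waitqueue are hypothetical names, and
the flag is just a stand-in for whatever "down"/blocked state the real
code would test.

```python
from collections import deque


class VCpu:
    """Hypothetical stand-in for a vCPU; 'online' models the up/down state."""

    def __init__(self, name, online=True):
        self.name = name
        self.online = online


def pick_from_waitqueue(waitqueue):
    """Return the first online vCPU on the waitqueue, skipping offline ones.

    The chosen vCPU is removed from the queue; offline vCPUs stay where
    they are. Returns None if no online vCPU is waiting.
    """
    candidate = next((v for v in waitqueue if v.online), None)
    if candidate is not None:
        waitqueue.remove(candidate)
    return candidate


# A "down" vCPU sits ahead of an "up" one on the waitqueue.
waitqueue = deque([VCpu("d1v2", online=False), VCpu("d1v1")])
picked = pick_from_waitqueue(waitqueue)
```

Without the online test, the pCPU would take d1v2 from the head of the
queue and d1v1 would never get to run, which is the symptom described in
the quoted text.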

Anyway, I'm fine with checks that prevent offline vCPUs from being assigned
to either pCPUs (like, the CPUs of L0 Xen) or shim's vCPUs (so, the
CPUs of L1 Xen). I'm less fine with rescheduling everyone at every

Roger, Wei, if/when you want to talk a bit about this and explain the
situation a bit better, so that I'll be able to help, feel free to ping me
(email or IRC). :-)

<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


Xen-devel mailing list


