
Re: [Xen-devel] [RFC 0/5] xen/arm: support big.little SoC



On Mon, 2016-09-19 at 21:33 +0800, Peng Fan wrote:
> On Mon, Sep 19, 2016 at 11:33:58AM +0100, George Dunlap wrote:
>
> > No, I think it would be a lot simpler to just teach the scheduler about
> > different classes of cpus.  credit1 would probably need to be modified
> > so that its credit algorithm would be per-class rather than pool-wide;
> > but credit2 shouldn't need much modification at all, other than to make
> > sure that a given runqueue doesn't include more than one class; and to
> > do load-balancing only with runqueues of the same class.
> 
> I try to follow.
>  - scheduler needs to be aware of different classes of cpus. ARM
> big.Little cpus.
>
Yes, I think this is essential.

>  - scheduler schedules vcpus on different physical cpus in one
> cpupool.
>
Yep, that's what the scheduler does. And personally, I'd start
implementing big.LITTLE support for a situation where both big and
LITTLE cpus coexist in the same pool.

>  - different cpu classes needs to be in different runqueue.
> 
Yes. So, basically, imagine using vcpu pinning to support big.LITTLE.
I've spoken briefly about this in my reply to Juergen. You can probably
even get something like this up-&-running by writing very little or no
code (you'll need --for now-- dom0_max_vcpus, dom0_vcpus_pin, and then,
in domain config files, "cpus='...'").

Then, the real goal, would be to achieve the same behavior
automatically, by acting on runqueues' arrangement and load balancing
logic in the scheduler(s).
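
To make the runqueue idea a bit more concrete, here is a minimal sketch
in Python (not Xen code). It assumes a hypothetical PCPU_CLASS map from
pcpu id to class, and it simplifies things to a single runqueue per
class, while a real scheduler may well have several runqueues per class.
The names build_runqueues and balance_candidates are made up for
illustration only.

```python
from collections import defaultdict

# Hypothetical pcpu -> class map; real hardware would report this itself.
PCPU_CLASS = {0: "big", 1: "big", 2: "LITTLE", 3: "LITTLE"}


def build_runqueues(pcpu_class):
    """Group pcpus so that no runqueue contains more than one cpu class."""
    runqueues = defaultdict(list)
    for pcpu, cls in sorted(pcpu_class.items()):
        runqueues[cls].append(pcpu)
    return dict(runqueues)


def balance_candidates(pcpu, pcpu_class):
    """Pcpus a vcpu running on `pcpu` may be moved to: same class only."""
    cls = pcpu_class[pcpu]
    return [p for p, c in sorted(pcpu_class.items()) if c == cls and p != pcpu]
```

With the map above, a vcpu sitting on pcpu 2 would only ever be
considered for migration to pcpu 3, never to a big core.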

Anyway, sorry for my ignorance on big.LITTLE, but there's something I'm
missing: _when_ is it that it is (or needs to be) decided whether a
vcpu will run on a big or LITTLE core?

Thinking of a bare metal system, I think that cpu X is, for instance, big, and
will always be like that; similarly, cpu Y is LITTLE.

This makes me think that, for a virtual machine, it is ok to choose/specify at
_domain_creation_ time which vcpus are big and which vcpus are LITTLE. Is this
correct?
If yes, this also means that --whatever way we find to make this happen,
cpupools, scheduler, etc-- the vcpus that we decided are big must only be
scheduled on actual big pcpus, and the vcpus that we decided are LITTLE must
only be scheduled on actual LITTLE pcpus, correct again?

> Then for implementation.
>  - When create a guest, specific physical cpus that the guest will be
> run on.
>
I'd actually do that the other way round. I'd ask the user to specify
how many --and, if that's important, which-- vcpus are big and how
many/which are LITTLE.

Knowing that, we also know whether the domain is a big only, LITTLE
only or big.LITTLE one. And we also know which set of pcpus each set
of vcpus should be restricted to.

So, basically (but it's just an example) something like this, in the xl
config file of a guest:

1) big.LITTLE guest, with 2 big and 2 LITTLE vcpus. User doesn't care
   which is which, so a default could be 0,1 big and 2,3 LITTLE:

vcpus = 4
vcpus.big = 2

2) big.LITTLE guest, with 8 vcpus, of which 0,2,4 and 6 are big:

vcpus = 8
vcpus.big = [0, 2, 4, 6]

Which would be the same as

vcpus = 8
vcpus.little = [1, 3, 5, 7]

3) guest with 4 vcpus, all big:

vcpus = 4
vcpus.big = "all"

Which would be the same as:

vcpus = 4
vcpus.little = "none"

And also the same as just:

vcpus = 4


Or something like this
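
The examples above could be resolved into a per-vcpu class list along
these lines. This is only a sketch in Python: vcpus.big and vcpus.little
are the keys proposed in this mail, not existing xl options, and
vcpu_classes is a made-up helper. Two things here are assumptions of the
sketch rather than something settled in this thread: a plain count for
vcpus.little takes the highest-numbered vcpus, and disagreement between
the two keys is an error.

```python
def vcpu_classes(vcpus, big=None, little=None):
    """Return the class ("big" or "LITTLE") of each vcpu, indexed by vcpu id.

    `big` and `little` mirror the proposed vcpus.big / vcpus.little keys:
    an int (a count), a list of vcpu ids, "all", "none", or None if absent.
    """
    everyone = set(range(vcpus))

    def resolve(spec, from_start):
        if spec is None:
            return None
        if spec == "all":
            return set(everyone)
        if spec == "none":
            return set()
        if isinstance(spec, int):
            if not 0 <= spec <= vcpus:
                raise ValueError("count out of range")
            # Assumption: big counts take the lowest ids, LITTLE the highest.
            return set(range(spec)) if from_start else set(range(vcpus - spec, vcpus))
        ids = set(spec)
        if not ids <= everyone:
            raise ValueError("unknown vcpu id in list")
        return ids

    big_ids = resolve(big, True)
    little_ids = resolve(little, False)
    if big_ids is None and little_ids is None:
        big_ids = set(everyone)              # nothing specified: all big
    elif big_ids is None:
        big_ids = everyone - little_ids      # only vcpus.little given
    elif little_ids is not None and big_ids != everyone - little_ids:
        raise ValueError("vcpus.big and vcpus.little disagree")
    return ["big" if v in big_ids else "LITTLE" for v in range(vcpus)]
```

For case 1) above, vcpu_classes(4, big=2) gives vcpus 0 and 1 as big and
vcpus 2 and 3 as LITTLE; the two spellings of case 2) give the same
list, and so do the three spellings of case 3).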

>  - If the physical cpus are different cpus, indicate the guest would
> like to be a big.little guest.
>    And have big vcpus and little vcpus.
>
Not liking this as _the_ way of specifying the guest topology, wrt
big.LITTLE-ness (see alternative proposal right above. :-))

However, right now we support pinning/affinity already. We certainly
need to decide what to do if, e.g., no vcpus.big or vcpus.little are
present, but the vcpus have hard or soft affinity to some specific
pcpus.

So, right now, this, in the xl config file:

cpus = [2, 8, 12, 13, 15, 17]

means that we want to pin 1-to-1 vcpu 0 to pcpu 2, vcpu 1 to pcpu 8,
vcpu 2 to pcpu 12, vcpu 3 to pcpu 13, vcpu 4 to pcpu 15 and vcpu 5 to
pcpu 17. Now, if cores 2, 8 and 12 are big, and no vcpus.big or
vcpus.little is specified, I'd put forward the assumption that the user
wants vcpus 0, 1 and 2 to be big, and vcpus 3, 4, and 5 to be LITTLE.

If, instead, there are vcpus.big or vcpus.little specified, and there's
disagreement, I'd either error out or decide which overrides the other
(and print a WARNING about that happening).
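
As a rough sketch of that inference (Python, hypothetical helper names,
not toolstack code): classes_from_pinning derives the class of each vcpu
from a 1-to-1 pinning list, and reconcile handles disagreement with an
explicit specification. Which side should win in the non-error case is
not decided here; the sketch arbitrarily lets the explicit setting win
and warns. The set of big pcpus is an assumption chosen to match the
example above.

```python
import warnings

# Hypothetical platform: pcpus 2..12 are big, everything else is LITTLE.
BIG_PCPUS = set(range(2, 13))


def classes_from_pinning(cpus, big_pcpus):
    """Infer each vcpu's class from the pcpu it is pinned to (1-to-1 list)."""
    return ["big" if pcpu in big_pcpus else "LITTLE" for pcpu in cpus]


def reconcile(explicit, inferred, strict=True):
    """Combine explicit vcpu classes (or None) with those inferred from pinning."""
    if explicit is None:
        return inferred
    if explicit != inferred:
        if strict:
            raise ValueError("vcpu classes disagree with pinning")
        # Assumption of this sketch: the explicit setting overrides pinning.
        warnings.warn("WARNING: explicit vcpu classes override pinning")
    return explicit
```

With cpus = [2, 8, 12, 13, 15, 17] and the set above, vcpus 0, 1 and 2
come out as big and vcpus 3, 4 and 5 as LITTLE.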

Still right now, this:

cpus = "2-12"

means that all the vcpus of the domain have hard affinity (i.e., are
pinned) to pcpus 2-12. And in this case I'd conclude that the user
wants all the vcpus to be big.

I'm less sure what to do if _only_ soft-affinity is specified (via
"cpus_soft="), or if hard-affinity contains both big and LITTLE pcpus,
like, e.g.:

cpus = "2-15"
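
The range case can be sketched the same way (Python, illustrative
only). parse_cpu_range here handles just a single "a-b" range or a
single number, which is far less than what xl accepts for "cpus=", and
the set of big pcpus is again an assumption. The mixed case returns None
on purpose, since what to do there is still an open question.

```python
def parse_cpu_range(spec):
    """Parse "a-b" or "a" into a set of pcpu ids (simplified on purpose)."""
    low, _, high = spec.partition("-")
    return set(range(int(low), int(high or low) + 1))


def class_from_hard_affinity(spec, big_pcpus):
    """Class implied by a domain-wide hard affinity, or None if undecided."""
    pcpus = parse_cpu_range(spec)
    if pcpus <= big_pcpus:
        return "big"           # every allowed pcpu is big
    if pcpus.isdisjoint(big_pcpus):
        return "LITTLE"        # every allowed pcpu is LITTLE
    return None                # mixed big and LITTLE: no obvious answer
```

Assuming pcpus 2 to 12 are the big ones, "2-12" yields "big" while
"2-15" yields None.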

>  - If no physical cpus specified, then the guest may run on big
> cpus or on little cpus. But not both.
>
Yes. If nothing (or something contradictory) is specified, we "just"
have to decide what's the sanest default.

>    How to decide runs on big or little physical cpus?
>
I'd default to big.

>  - For Dom0, I am still not sure,default big.little or else?
> 
Again, if nothing is specified, I'd probably default to:
 - give dom0 as many vcpus as there are big cores
 - restrict them to big cores

But, of course, I think we should add boot time parameters like these
ones:

 dom0_vcpus_big = 4
 dom0_vcpus_little = 2

which would mean the user wants dom0 to have 4 big and 2 LITTLE
vcpus... and then we act accordingly, as described above, and in other
emails.
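
A tiny sketch of that dom0 default, in Python rather than hypervisor
code. dom0_vcpus_big and dom0_vcpus_little are the parameters proposed
above, not existing Xen options, and dom0_vcpu_classes is a made-up
name. Numbering the big vcpus first is an assumption of the sketch.

```python
def dom0_vcpu_classes(big_pcpus, dom0_vcpus_big=None, dom0_vcpus_little=None):
    """Return the class of each dom0 vcpu, indexed by vcpu id."""
    if dom0_vcpus_big is None and dom0_vcpus_little is None:
        # Nothing specified: one big vcpu per big core, no LITTLE vcpus.
        return ["big"] * len(big_pcpus)
    # Assumption: big vcpus get the lowest ids, LITTLE vcpus follow.
    return ["big"] * (dom0_vcpus_big or 0) + ["LITTLE"] * (dom0_vcpus_little or 0)
```

So with 4 big cores and no parameters dom0 gets 4 big vcpus, while
dom0_vcpus_big = 4 and dom0_vcpus_little = 2 give it 6 vcpus, the last
two being LITTLE.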

> If use scheduler to handle the different classes cpu, we do not need
> to use cpupool
> to block vcpus be scheduled onto different physical cpus. And using
> scheduler to handle this
> gives an opportunity to support big.little guest.
> 
Exactly, this is one strong point in favour of this solution, IMO!

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
