
Re: [Xen-devel] [PATCH] xen/arm: introduce vwfi parameter



On Sun, 2017-02-19 at 21:27 +0000, Julien Grall wrote:
> Hi Dario,
> 
Hi,

> On 02/18/2017 01:47 AM, Dario Faggioli wrote:
> >  - vcpu A yields, and there are no runnable but not running vcpus
> >    around. In this case, A gets to run again. Full stop.
> 
> Which turn to be the busy looping I was mentioning when one vCPU is 
> assigned to a pCPU. 
>
Absolutely. Actually, it would mean busy looping, no matter whether
vCPUs are assigned or not.
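As a toy illustration of the semantics (made-up code, not Xen's actual scheduler), yield can be modelled as moving the current vCPU behind every other runnable vCPU on its run queue; with nothing else runnable, the yielder is picked straight back, which is the busy-loop case:

```c
#include <assert.h>

#define MAXQ 8

/* Hypothetical per-pCPU run queue: runnable vCPU ids, front at index 0. */
struct runq {
    int vcpu[MAXQ];
    int len;
};

/* The yielding vCPU (queue front) goes behind all other runnable
 * vCPUs, and the new front is picked.  If the queue holds only the
 * yielder, it is picked again immediately. */
static int yield_and_pick(struct runq *q)
{
    int cur = q->vcpu[0];
    for (int i = 1; i < q->len; i++)
        q->vcpu[i - 1] = q->vcpu[i];
    q->vcpu[q->len - 1] = cur;
    return q->vcpu[0];
}
```

Note how a yielding vCPU never starves the others: whenever anyone else is runnable, they get the pCPU first.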

As I said already, it's not exactly identical, but it would have a very
similar behavior of the Linux's idle=poll option:

http://tomoyo.osdn.jp/cgi-bin/lxr/source/Documentation/kernel-parameters.txt?v=linux-4.9.9#L1576
1576         idle=           [X86]
1577                         Format: idle=poll, idle=halt, idle=nomwait
1578                         Poll forces a polling idle loop that can slightly
1579                         improve the performance of waking up a idle CPU,
1580                         but will use a lot of power and make the system
                             run hot. Not recommended.

And as I've also said, I don't see it as a solution to wakeup latency
problems, nor one I'd like to recommend using outside of testing and
debugging. As a testing and debugging aid, though, it may well be
useful.

> This is not the goal of WFI and I would be really 
> surprised that embedded folks will be happy with a solution using
> more 
> power.
> 
Me neither. It's a showstopper for anything that's battery powered or
may incur thermal/cooling issues.

So, just to be clear, I'm happy to help and assist in understanding the
scheduling background and implications, but I am equally happy to leave
the decision of whether or not this is something nice or desirable to
have (as an option) on ARM. :-)

I've never been a fan of it, and never used it, on Linux on x86, not
even when actually working on real-time and low-latency stuff. That
being said, I also personally think that having the option would be no
harm, but I understand concerns that, when an option is there, people
will try to use it in the weirdest ways, and then complain at your
'door' if their CPU went on fire! :-O

> > What will never happen is that a yielding vcpu, by busy looping,
> > prevents other runnable (and non yielding) vcpus to run. And if it
> > does, it's a bug. :-)
> 
> I didn't say it will prevent another vCPU to run. But it will at
> least 
> use slot that could have been used for good purpose by another pCPU.
> 
Not really. Maybe I wasn't clear in explaining yielding, or maybe I'm
not getting what you're trying to say.

It does depend a little on the implementation of yield, but a vCPU busy
looping on yield() won't (or at least must not) look much different, to
the other vCPUs, than that same pCPU sleeping in a deep C-state (or the
ARM equivalent). Performance aside, of course.

> So in similar workload Xen will perform worst with vwfi=idle, not
> even 
> mentioning the power consumption...
> 
It'd probably be a little more inefficient, even performance wise, if,
e.g., the scheduler specific yielding code acquires locks, or because
it means there is one more vCPU in the runqueues to be dealt with, but
nothing more than that. And whether or not this would be significant or
noticeable, I don't know (it should be measured, if interesting).

> > In fact, in work conserving schedulers, if pCPU x becomes idle, it
> > means there is _nothing_ that can execute on x itself around. And
> > our
> > schedulers are (with the exception of ARRINC, and if not using caps
> > in
> > Credit1) work conserving, or at least they want and try to be an as
> > much work conserving as possible.
> 
> My knowledge of the scheduler is limited. Does the scheduler take
> into 
> account the cost of context switch when scheduling? When do you
> decide 
> when to run the idle vCPU? Is it only the no other vCPU are runnable
> or 
> do you have an heuristic?
> 
Again, not sure I understand. Context switches between running vCPUs
must happen where the scheduling algorithm decides they must happen.
You can try to design an algorithm that requires fewer context
switches, or introduce countermeasures (we have something like that),
but apart from these, I don't know what (else?) you may be referring to
when asking about "taking into account the cost of context switches".
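(The "countermeasure" I'm referring to is scheduler rate limiting, the
sched_ratelimit_us knob. As a hedged sketch of the idea only, with
illustrative logic and names, not the actual implementation:)

```c
#include <assert.h>

/* Illustrative sketch (not Xen's actual code) of scheduler rate
 * limiting: a vCPU that has run for less than the limit is not
 * preempted yet, which caps how often context switches can occur. */
static int may_preempt(unsigned long ran_us, unsigned long ratelimit_us)
{
    return ran_us >= ratelimit_us;
}
```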

We do try to take into account the cost of migration, i.e., moving a
vCPU from a pCPU to another... but that's an entirely different thing.

About the idle vCPU... I think the answer to your question is yes.
Credit and Credit2 are work conserving schedulers, so they only let a
pCPU go idle if no vCPU in the system wants to run (well, in Credit2,
this may not be 100% true until the load balancer gets to execute, but
in practice that happens rarely).
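In sketch form (toy code, not the actual Credit implementation), "work
conserving" just means the idle vCPU is picked only when the run queue
is empty:

```c
#include <assert.h>

#define IDLE_VCPU (-1)  /* sentinel meaning "run the idle vCPU" */

/* Hypothetical run queue: runnable vCPU ids, front at index 0. */
struct runq {
    int vcpu[8];
    int len;
};

/* Work-conserving pick: a pCPU goes idle only if there is nothing
 * runnable at all. */
static int pick_next(const struct runq *q)
{
    return q->len > 0 ? q->vcpu[0] : IDLE_VCPU;
}
```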

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
