
Re: [Xen-devel] Xen on ARM IRQ latency and scheduler overhead



On Thu, 16 Feb 2017, Dario Faggioli wrote:
> On Fri, 2017-02-10 at 10:32 -0800, Stefano Stabellini wrote:
> > On Fri, 10 Feb 2017, Dario Faggioli wrote:
> > > Right, interesting use case. I'm glad to see there's some interest in
> > > it, and I am happy to help investigate and try to make things better.
> > 
> > Thank you!
> > 
> Hey, FYI, I am looking into this. It's just that I've got a couple of
> other things on my plate right now.

OK


> > > Ok, do you (or anyone) mind explaining in a little more detail
> > > what the app tries to measure and how it does that?
> > 
> > Take a look at app/xen/guest_irq_latency/apu.c:
> > 
> > https://github.com/edgarigl/tbm/blob/master/app/xen/guest_irq_latency/apu.c
> > 
> > This is my version which uses the phys_timer (instead of the
> > virt_timer):
> > 
> > https://github.com/sstabellini/tbm/blob/phys-timer/app/xen/guest_irq_latency/apu.c
> > 
> Yep, I did look at those.
> 
> > Edgar can jump in to add more info if needed (he is the author of the
> > app), but as you can see from the code, the app is very simple. It sets
> > a timer event in the future, then, after receiving the event, it checks
> > the current time and compares it with the deadline.
> > 
> Right, and you check the current time with:
> 
>   now = aarch64_irq_get_stamp(el);
> 
> which I guess is compatible with the values you use for the counter.

Yes
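
To make it concrete, the whole measurement boils down to something like the
sketch below. This is not the actual apu.c code, just a rough illustration of
the idea using the ARMv8 EL1 physical timer registers directly; the handler
and function names are made up.

  #include <stdint.h>

  /* Read the physical counter (CNTPCT_EL0). */
  static inline uint64_t read_counter(void)
  {
      uint64_t v;
      __asm__ volatile("isb; mrs %0, cntpct_el0" : "=r" (v) : : "memory");
      return v;
  }

  /* Program the EL1 physical timer to fire when the counter reaches cval. */
  static inline void arm_phys_timer(uint64_t cval)
  {
      __asm__ volatile("msr cntp_cval_el0, %0" : : "r" (cval));
      __asm__ volatile("msr cntp_ctl_el0, %0; isb" : : "r" ((uint64_t)1));
  }

  static volatile uint64_t deadline;   /* counter value we asked for         */
  static volatile uint64_t latency;    /* ticks we were late, set by the IRQ */
  static volatile int fired;

  /* Hypothetical IRQ handler: stamp the counter as early as possible and
   * compare it with the programmed deadline. */
  void phys_timer_irq(void)
  {
      uint64_t now = read_counter();

      latency = now - deadline;
      __asm__ volatile("msr cntp_ctl_el0, %0" : : "r" ((uint64_t)0)); /* stop */
      fired = 1;
  }

  /* One sample: arm the timer 'delta' ticks in the future, then either
   * block in WFI or busy-poll until the interrupt arrives. */
  uint64_t measure_once(uint64_t delta, int use_wfi)
  {
      fired = 0;
      deadline = read_counter() + delta;
      arm_phys_timer(deadline);

      while (!fired) {
          if (use_wfi)
              __asm__ volatile("wfi");   /* the "WFI" rows in the tables */
          /* else just spin: the "no WFI" rows */
      }
      return latency;                    /* scale by CNTFRQ_EL0 to get ns */
  }

The "WFI" and "no WFI" rows in the tables below correspond to the two
branches of that wait loop.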


> > > > These are the results, in nanosec:
> > > > 
> > > >                         AVG     MIN     MAX     WARM MAX
> > > > 
> > > > NODEBUG no WFI          1890    1800    3170    2070
> > > > NODEBUG WFI             4850    4810    7030    4980
> > > > NODEBUG no WFI credit2  2217    2090    3420    2650
> > > > NODEBUG WFI credit2     8080    7890    10320   8300
> > > > 
> > > > DEBUG no WFI            2252    2080    3320    2650
> > > > DEBUG WFI               6500    6140    8520    8130
> > > > DEBUG WFI, credit2      8050    7870    10680   8450
> > > > 
> > > > DEBUG means Xen DEBUG build.
> > > > 
> [...]
> > > > As you can see, depending on whether the guest issues a WFI or not
> > > > while waiting for interrupts, the results change significantly.
> > > > Interestingly, credit2 does worse than credit1 in this area.
> > > > 
> > > This is with current staging, right?
> > 
> > That's right.
> > 
> So, when you have the chance, can I see the output of
> 
>  xl debug-key r
>  xl dmesg
> 
> Both under Credit1 and Credit2?

I'll see what I can do.


> > > I can try sending a quick patch for disabling the tick when a CPU is
> > > idle, but I'd need your help in testing it.
> > 
> > That might be useful; however, if I understand this right, we don't
> > actually want a periodic timer in Xen just to make the system more
> > responsive, do we?
> > 
> IMO, no. I'd call that a hack, and I don't think we should go that
> route.
> 
> Not until we have figured out and squeezed as much as possible all the
> other sources of latency, and that has proven not to be enough, at
> least.
> 
> I'll send the patch.
> 
> > > > Assuming that the problem is indeed the scheduler, one workaround that
> > > > we could introduce today would be to avoid calling vcpu_unblock on
> > > > guest WFI and call vcpu_yield instead. This change makes things
> > > > significantly better:
> > > > 
> > > >                                      AVG     MIN     MAX     WARM MAX
> > > > DEBUG WFI (yield, no block)          2900    2190    5130    5130
> > > > DEBUG WFI (yield, no block) credit2  3514    2280    6180    5430
> > > > 
> > > > Is that a reasonable change to make? Would it cause significantly more
> > > > power consumption in Xen (because xen/arch/arm/domain.c:idle_loop might
> > > > not be called anymore)?
> > > > 
> > > Exactly. So, I think that, as Linux has 'idle=poll', it is conceivable
> > > to have something similar in Xen, and if we do, I guess it can be
> > > implemented as you suggest.
> > > 
> > > But, no, I don't think this is satisfying as a default, not before
> > > trying to figure out what is going on, and whether we can improve
> > > things in other ways.
> > 
> > OK. Should I write a patch for that? I guess it would be ARM specific
> > initially. What do you think would be a good name for the option?
> > 
> Well, I think such an option may be useful on other arches too, but we'd
> better measure/verify that first. Therefore, I'd be OK with this being
> implemented only on ARM for now.
> 
> As for the name, I actually like 'idle=', and as values, what about
> 'sleep' or 'block' for the current default, and 'poll' for the new
> behavior you'll implement? Or do you think it is at risk of confusion
> with Linux?
> 
> An alternative would be something like 'wfi=[sleep,idle]' or
> 'wfi=[block,poll]', but that is ARM specific, and it'd mean we would
> need another option for making x86 behave similarly.

That's a good idea. vwfi=[sleep,idle] looks like the right thing to
introduce, given that the option would be ARM-only at the moment and
that it's the virtual WFI behavior, not the physical WFI behavior, that
we are changing.
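
To make the proposal concrete, the option could end up looking roughly like
the sketch below. This is just an illustration of the block-vs-yield choice,
not an actual patch: the parse hook is simplified (registration with Xen's
command-line machinery is omitted) and the surrounding names are made up,
while vcpu_block() and vcpu_yield() are the existing scheduler primitives.

  #include <string.h>

  /* In real Xen these come from the scheduler headers; prototypes repeated
   * here only so the sketch is self-contained. */
  void vcpu_block(void);
  void vcpu_yield(void);

  enum vwfi_mode { VWFI_SLEEP, VWFI_IDLE };

  static enum vwfi_mode vwfi = VWFI_SLEEP;       /* default: block on WFI */

  /* Would be wired to "vwfi=" on the hypervisor command line. */
  static void parse_vwfi(const char *s)
  {
      if ( !strcmp(s, "idle") )
          vwfi = VWFI_IDLE;
      else
          vwfi = VWFI_SLEEP;
  }

  /* Called from the guest WFI trap path. */
  static void vwfi_handle_guest_wfi(void)
  {
      if ( vwfi == VWFI_IDLE )
          vcpu_yield();   /* stay runnable: lower wakeup latency, more power */
      else
          vcpu_block();   /* really sleep: the pcpu can enter the idle loop  */
  }

With something like that, the "yield, no block" numbers above would
correspond to booting with vwfi=idle, while the default keeps today's
blocking behavior.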

 
> > > But it would not be much more useful than that, IMO.
> > 
> > Why? Actually, I know of several potential users of Xen on ARM
> > interested in exactly this use-case. They only have a statically defined
> > number of guests, with a total number of vcpus lower than or equal to
> > the number of pcpus in the system. Wouldn't a scheduler like that help
> > in this scenario?
> >
> What I'm saying is that it would be rather inflexible, in the sense that
> it won't be possible to have statically pinned and dynamically moving
> vcpus in the same guest, it would be hard to control which vcpu is
> statically assigned to which pcpu, making a domain statically assigned
> would mean moving it to another cpupool (which is the only way to use a
> different scheduler, right now, in Xen), and things like this.
> 
> I know there are static use cases... But I'm not entirely sure how
> static they really are, and whether they, in the end, will really like
> such a degree of inflexibility.

They are _very_ static :-)
Think about the board on a mechanical robot or a drone. VMs are only
created at host boot and never again. In fact we are planning to
introduce a feature in Xen to be able to create a few VMs directly from
the hypervisor, to skip the tools in Dom0 for these cases.


> But anyway, indeed I can give you a scheduler that, provided it lives
> in a cpupool with M pcpus, as soon as a new domain with n vcpus (n<=M)
> is moved inside the pool, statically assigns each of its vcpus to a
> pcpu, and always sticks with that. And we'll see what will happen! :-)

I am looking forward to it.
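
Just so we picture the same thing: what I'd expect from such a scheduler is
something as simple as the toy sketch below (made-up names, obviously not
your code), where each vcpu of a newly added domain grabs the first free
pcpu of the pool and keeps it forever.

  #include <stdbool.h>
  #include <stddef.h>

  #define MAX_PCPUS 16

  /* Which pcpus of the pool already host a statically assigned vcpu. */
  static bool pcpu_taken[MAX_PCPUS];

  /*
   * Pick a pcpu for one vcpu of a newly added domain, or return -1 if the
   * pool is already full (i.e. the domain has more vcpus than there are
   * pcpus).  Called once per vcpu when the domain enters the cpupool; the
   * result never changes afterwards, so no scheduling decision is left to
   * make at run time.
   */
  static int pick_static_pcpu(size_t nr_pool_pcpus)
  {
      for ( size_t p = 0; p < nr_pool_pcpus && p < MAX_PCPUS; p++ )
      {
          if ( !pcpu_taken[p] )
          {
              pcpu_taken[p] = true;
              return (int)p;
          }
      }
      return -1;   /* n > M: the domain cannot be accepted into the pool */
  }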

 

