[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] xen: idle_loop: either deal with tasklets or go idle



On Fri, 2017-06-16 at 05:47 -0600, Jan Beulich wrote:
> > On Fri, 2017-06-16 at 02:54 -0600, Jan Beulich wrote:
> > > > > > On 14.06.17 at 18:53, <dario.faggioli@xxxxxxxxxx> wrote:
> > > >      unsigned long *work_to_do = &per_cpu(tasklet_work_to_do,
> > > > cpu);
> > > >      struct list_head *list = &per_cpu(tasklet_list, cpu);
> > > >  
> > > > -    /*
> > > > -     * Work must be enqueued *and* scheduled. Otherwise there
> > > > is
> > > > no work to
> > > > -     * do, and/or scheduler needs to run to update idle vcpu
> > > > priority.
> > > > -     */
> > > > -    if ( likely(*work_to_do !=
> > > > (TASKLET_enqueued|TASKLET_scheduled)) )
> > > > -        return;
> > > 
> > > Perhaps it also wouldn't hurt to convert this to an ASSERT() too.
> > > 
> > 
> > Nope, I can't. It's a best effort check, and *work_to_do (which is
> > per_cpu(tasklet_work_to_do,cpu)) can change, and the assert may
> > fail.
> 
> How that? TASKLET_enqueued can only be cleared by do_tasklet()
> itself, and I don't think nesting invocations of the function can or
> should occur. TASKLET_scheduled is only being cleared when
> schedule() observes that bit set without TASKLET_enqueued also
> set. IOW there may be races in setting of the bits, but since we
> expect the caller to have done this check already, I think an
> ASSERT() would be quite fine here.
> 
Ok, makes sense. I will add the ASSERT() (with something like what you
wrote here as a comment).

> > > Wouldn't cpu_is_haltable() then also better use this new
> > > function?
> > > 
> > 
> > Mmm... Perhaps. It's certainly less code chrun.
> > 
> > ARM code would then contain two invocations of cpu_is_haltable()
> > (the
> > first happens with IRQ enabled, so a second one with IRQ disabled
> > is
> > necessary). But that is *exactly* the same thing we do on x86
> > (they're
> > just in different functions in that case).
> > 
> > So, I reworked the patch according to these suggestions, and you
> > can
> > look at it below.
> 
> I'm confused: You've added further uses of cpu_is_haltable()
> where the cheaper tasklet_work_to_do() would do.
>
Indeed. Sorry!

Fact is, I've read your comment backwards, i.e., as you were saying
something like "wouldn't cpu_is_haltable() better be used here, instead
of this new function?"

And it's not that your wording is ambiguous, it's me that, apparently,
can't read English! :-/

I'll rework the patch again...

> Of course that suggestion
> of mine was more than 1:1 replacement - the implied question was
> whether cpu_is_haltable() simply checking tasklet_work_to_do to
> be non-zero isn't too lax (and wouldn't better be
> !tasklet_work_to_do()).
> 
Now, to try to answer the real question...

Let's assume that, on cpu x, we are about to check cpu_is_haltable(),
while, on cpu y, tasklet_schedule_on_cpu(x) is being called, and
manages to set _TASKLET_enqueued in *work_to_do.

I.e., in current code:

 CPU x                           CPU y
 |                               |
 cpu_is_haltable(x)              tasklet_schedule_on_cpu(x)
 |!softirq_pending(x) == true    tasklet_enqueue()
 |cpu_online(x) == true          test_and_set(TASKLET_enqueued,
 |                                            work_to_do)
 |!work_to_do == FALSE

So we don't go to sleep, and we stay in the idle loop for the next
iteration. At which point, CPU y will have raised SCHEDULE_SOFTIRQ on
x, schedule (still on x) will set TASKLET_scheduled, and we'll call
do_tasklet().

Basically, right now, we risk spinning for the time that passes between
TASKLET_enqueued being set and SCHEDULE_SOFTIRQ being raised and
reaching cpu x. This should be a very short window, and, considering
how the TASKLET_* flags are handled, this looks the correct behavior to
me.

If we use !tasklet_work_to_do() in cpu_is_haltable():

 CPU x                               CPU y
 |                                   |
 cpu_is_haltable(x)                  tasklet_schedule_on_cpu(x)
 |!softirq_pending(x) == true        tasklet_enqueue()
 |cpu_online(x) == true              test_and_set(TASKLET_enqueued,
 |                                              work_to_do)
 |!(work_to_do == TASKLET_enqueued+
                  TASKLET_scheduled) == TRUE

Which means we'd go to sleep... just for (most likely) be woken up very
very soon by SCHEDULE_SOFTIRQ being thrown at us (cpu x) by cpu y.

Am I overlooking anything? And is this (this time) what you were
asking?

Assuming answers are 'no' and 'yes', I think I'd leave
cpu_is_haltable() as it is (perhaps adding a brief comment on why it's
ok/better to check work_to_do directly, instead than calling
tasklet_work_to_do()).

Sorry again for the misunderstanding,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.