[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 3/7] xen: credit: consider tickled pCPUs as busy.



On Fri, Apr 7, 2017 at 5:56 PM, Dario Faggioli
<dario.faggioli@xxxxxxxxxx> wrote:
> Currently, it can happen that __runq_tickle(),
> running on pCPU 2 because vCPU x woke up, decides
> to tickle pCPU 3, because it's idle. Just after
> that, but before pCPU 3 manages to schedule and
> pick up x, either __runq_tickel() or
> __csched_cpu_pick(), running on pCPU 6, sees that
> idle pCPUs are 0, 1 and also 3, and for whatever
> reason it also chooses 3 for waking up (or
> migrating) vCPU y.
>
> When pCPU 3 goes through the scheduler, it will
> pick up, say, vCPU x, and y will sit in its
> runqueue, even if there are idle pCPUs.
>
> Alleviate this by marking a pCPU to be idle
> right away when tickling it (like, e.g., it happens
> in Credit2).
>
> Note that this does not eliminate the race. That
> is not possible without introducing proper locking
> for the cpumasks the scheduler uses. It significantly
> reduces the window during which it can happen, though.
>
> Introduce proper locking for the cpumask can, in
> theory, be done, and may be investigated in future.
> It is a significant amount of work to do it properly
> (e.g., avoiding deadlock), and it is likely to adversely
> affect scalability, and so it may be a path it is just
> not worth following.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
> ---
> Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> Cc: Anshul Makkar <anshul.makkar@xxxxxxxxxx>
> ---
> Changes from v2:
> * add comments to better explain the meaning of the added ASSERT()-s.

Thanks.

Reviewed-by: George Dunlap <george.dunlap@xxxxxxxxxx>

> ---
>  xen/common/sched_credit.c |   32 +++++++++++++++++++++++++++++---
>  1 file changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index 59ed4ca..7b4ea02 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -489,7 +489,17 @@ static inline void __runq_tickle(struct csched_vcpu *new)
>                  __trace_var(TRC_CSCHED_TICKLE, 1, sizeof(cpu), &cpu);
>          }
>
> -        /* Send scheduler interrupts to designated CPUs */
> +        /*
> +         * Mark the designated CPUs as busy and send them all the scheduler
> +         * interrupt. We need the for_each_cpu for dealing with the
> +         * !opt_tickle_one_idle case. We must use cpumask_clear_cpu() and
> +         * can't use cpumask_andnot(), because prv->idlers needs atomic 
> access.
> +         *
> +         * In the default (and most common) case, when opt_rickle_one_idle is
> +         * true, the loop does only one step, and only one bit is cleared.
> +         */
> +        for_each_cpu(cpu, &mask)
> +            cpumask_clear_cpu(cpu, prv->idlers);
>          cpumask_raise_softirq(&mask, SCHEDULE_SOFTIRQ);
>      }
>      else
> @@ -990,6 +1000,13 @@ csched_vcpu_acct(struct csched_private *prv, unsigned 
> int cpu)
>              SCHED_VCPU_STAT_CRANK(svc, migrate_r);
>              SCHED_STAT_CRANK(migrate_running);
>              set_bit(_VPF_migrating, &current->pause_flags);
> +            /*
> +             * As we are about to tickle cpu, we should clear its bit in
> +             * idlers. But, if we are here, it means there is someone running
> +             * on it, and hence the bit must be zero already.
> +             */
> +            ASSERT(!cpumask_test_cpu(cpu,
> +                                     CSCHED_PRIV(per_cpu(scheduler, 
> cpu))->idlers));
>              cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
>          }
>      }
> @@ -1082,13 +1099,22 @@ static void
>  csched_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
>  {
>      struct csched_vcpu * const svc = CSCHED_VCPU(vc);
> +    unsigned int cpu = vc->processor;
>
>      SCHED_STAT_CRANK(vcpu_sleep);
>
>      BUG_ON( is_idle_vcpu(vc) );
>
> -    if ( curr_on_cpu(vc->processor) == vc )
> -        cpu_raise_softirq(vc->processor, SCHEDULE_SOFTIRQ);
> +    if ( curr_on_cpu(cpu) == vc )
> +    {
> +        /*
> +         * We are about to tickle cpu, so we should clear its bit in idlers.
> +         * But, we are here because vc is going to sleep while running on 
> cpu,
> +         * so the bit must be zero already.
> +         */
> +        ASSERT(!cpumask_test_cpu(cpu, CSCHED_PRIV(per_cpu(scheduler, 
> cpu))->idlers));
> +        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
> +    }
>      else if ( __vcpu_on_runq(svc) )
>          __runq_remove(svc);
>  }
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.