
Re: [Xen-devel] [PATCH v3 28/47] xen/sched: add code to sync scheduling of all vcpus of a sched unit



On 14.09.2019 10:52, Juergen Gross wrote:
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -55,6 +55,9 @@ boolean_param("sched_smt_power_savings", sched_smt_power_savings);
>  int sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US;
>  integer_param("sched_ratelimit_us", sched_ratelimit_us);
>  
> +/* Number of vcpus per struct sched_unit. */
> +static unsigned int __read_mostly sched_granularity = 1;

Didn't you indicate earlier that this would be a per-pool property?
Or was that just a longer-term plan?

> +/*
> + * Rendezvous before taking a scheduling decision.
> + * Called with schedule lock held, so all accesses to the rendezvous counter
> + * can be normal ones (no atomic accesses needed).
> + * The counter is initialized to the number of cpus to rendezvous initially.
> + * Each cpu entering will decrement the counter. Once the counter reaches
> + * zero, do_schedule() is called and the rendezvous counter for leaving
> + * context_switch() is set. All other members will wait until the counter
> + * becomes zero, dropping the schedule lock in between.
> + */

This recurring lock/unlock is liable to cause massive cache-line
ping-pong, especially for socket or node scheduling. Instead of
just a cpu_relax() between the main unlock and re-lock, could there
perhaps be a lock-less check to determine whether there's any point
at all in re-acquiring the lock?
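
Just to illustrate the kind of thing I have in mind (a rough sketch
only, not tested; it assumes the rendezvous_in_cnt field and the
pcpu_schedule_*() helpers introduced by this patch):

    while ( prev->rendezvous_in_cnt )
    {
        pcpu_schedule_unlock_irq(lock, cpu);

        /*
         * Poll the counter without holding the lock, so the waiting cpus
         * don't keep bouncing the cache line holding the lock around.
         */
        while ( read_atomic(&prev->rendezvous_in_cnt) )
            cpu_relax();

        /* Re-acquire only once there's a chance the wait is over. */
        lock = pcpu_schedule_lock_irq(cpu);
    }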

> +static void schedule(void)
> +{
> +    struct vcpu          *vnext, *vprev = current;
> +    struct sched_unit    *prev = vprev->sched_unit, *next = NULL;
> +    s_time_t              now;
> +    struct sched_resource *sd;
> +    spinlock_t           *lock;
> +    int cpu = smp_processor_id();
> +
> +    ASSERT_NOT_IN_ATOMIC();
> +
> +    SCHED_STAT_CRANK(sched_run);
> +
> +    sd = get_sched_res(cpu);
> +
> +    lock = pcpu_schedule_lock_irq(cpu);
> +
> +    if ( prev->rendezvous_in_cnt )
> +    {
> +        /*
> +         * We have a race: sched_slave() should be called, so raise a softirq
> +         * in order to re-enter schedule() later and call sched_slave() now.
> +         */
> +        pcpu_schedule_unlock_irq(lock, cpu);
> +
> +        raise_softirq(SCHEDULE_SOFTIRQ);
> +        return sched_slave();
> +    }
> +
> +    now = NOW();
> +
> +    stop_timer(&sd->s_timer);

Is the order of these two relevant? A while ago there were a couple
of changes moving such NOW() invocations past anything that may take
non-negligible time, to make accounting as accurate as possible.
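
I.e. something like (a sketch only, assuming nothing in between
actually needs the earlier timestamp):

    stop_timer(&sd->s_timer);

    /* Read the time stamp only after operations which may take a while. */
    now = NOW();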

> --- a/xen/include/xen/softirq.h
> +++ b/xen/include/xen/softirq.h
> @@ -4,6 +4,7 @@
>  /* Low-latency softirqs come first in the following list. */
>  enum {
>      TIMER_SOFTIRQ = 0,
> +    SCHED_SLAVE_SOFTIRQ,
>      SCHEDULE_SOFTIRQ,
>      NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ,
>      RCU_SOFTIRQ,

Seeing the comment, are the insertion you make as well as the pre-
existing placement of SCHEDULE_SOFTIRQ still appropriate with the
rendezvous-ing you introduce?

Jan
