[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] xen: Credit2: enable fully custom runqueue arrangement



On 09/13/2017 05:21 PM, Dario Faggioli wrote:
> The patch introduces yet another runqueue arrangement option
> for Credit2. In fact, it allows the user to specify, explicitly
> and precisely, what pCPUs should belong to which runqueue.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
> Signed-off-by: Praveen Kumar <kpraveen.lkml@xxxxxxxxx>
> ---
> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> Cc: Praveen Kumar <kpraveen.lkml@xxxxxxxxx>
> ---
> This is derived --although *heavily* reworked-- from what Praveen
> has submitted before, here:
> 
>  https://lists.xen.org/archives/html/xen-devel/2017-04/msg02402.html
>  https://lists.xen.org/archives/html/xen-devel/2017-06/msg00201.html
> 
> During my review of that, I suggested to make the array dynamically
> allocated, using nr_cpu_ids as its size. Turns out, that does not
> make much sense, as at the time the parameters are parsed, nr_cpu_ids
> is still equal to NR_CPUS.
> ---
>  docs/misc/xen-command-line.markdown |   12 +++
>  xen/common/sched_credit2.c          |  135 
> +++++++++++++++++++++++++++++++++++
>  2 files changed, 146 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/misc/xen-command-line.markdown 
> b/docs/misc/xen-command-line.markdown
> index 9797c8d..dbf5d4c 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -567,7 +567,7 @@ also slow in responding to load changes.
>  The default value of `1 sec` is rather long.
>  
>  ### credit2\_runqueue
> -> `= cpu | core | socket | node | all`
> +> `= cpu | core | socket | node | all | <custom>`
>  
>  > Default: `socket`
>  
> @@ -585,6 +585,16 @@ Available alternatives, with their meaning, are:
>  * `node`: one runqueue per each NUMA node of the host;
>  * `all`: just one runqueue shared by all the logical pCPUs of
>           the host
> +* `<custom>`: one runqueue per each specified subset of logical
> +              pCPUs of the host. Subsets are defined either as:
> +              `[[0,1,][2,6][3,5][4,7]]`, or as:

Is the comma after the 1 there a typo, or do are you explicitly trying
to communicate that you tolerate terminal commas?

> +              `'0,1;2,6;3,5;4,7'`. That means
> +               - pCPUs 0 and 1 belong to runqueue 0
> +               - pCPUs 2 and 6 belong to runqueue 1
> +               - pCPUs 3 and 5 belong to runqueue 2
> +               - pCPUs 4 and 7 belong to runqueue 3
> +              pCPUs that are present on the host, but are not
> +              part of any subset, are assigned to runqueue 0.
>  
>  ### dbgp
>  > `= ehci[ <integer> | @pci<bus>:<slot>.<func> ]`
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 0f93ad5..10da084 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -12,6 +12,7 @@
>  
>  #include <xen/init.h>
>  #include <xen/lib.h>
> +#include <xen/ctype.h>
>  #include <xen/sched.h>
>  #include <xen/domain.h>
>  #include <xen/delay.h>
> @@ -323,6 +324,17 @@ integer_param("credit2_balance_over", 
> opt_overload_balance_tolerance);
>   *           (logical) processors of the host belong. This will happen if
>   *           the opt_runqueue parameter is set to 'all'.
>   *
> + * - custom: meaning that there will be one runqueue per each specified
> + *           subset, as shown in the following examples:
> + *           - credit2_runqueue=[[0,1][3][4,5]]
> + *           - credit2_runqueue='0,1;3;4,5'
> + *           These (both) mean the following:
> + *           - CPU 0 and CPU 1 belong to runqueue 0
> + *           - CPU 3 belongs to runqueue 1
> + *           - CPU 4 and CPU 5 belong to runqueue 2
> + *           CPUs that are present on the host, but are not part of any
> + *           defined subset, will be assigned to runqueue 0.
> + *
>   * Depending on the value of opt_runqueue, therefore, cpus that are part of
>   * either the same physical core, the same physical socket, the same NUMA
>   * node, or just all of them, will be put together to form runqueues.
> @@ -332,6 +344,7 @@ integer_param("credit2_balance_over", 
> opt_overload_balance_tolerance);
>  #define OPT_RUNQUEUE_SOCKET 2
>  #define OPT_RUNQUEUE_NODE   3
>  #define OPT_RUNQUEUE_ALL    4
> +#define OPT_RUNQUEUE_CUSTOM 5
>  static const char *const opt_runqueue_str[] = {
>      [OPT_RUNQUEUE_CPU] = "cpu",
>      [OPT_RUNQUEUE_CORE] = "core",
> @@ -341,6 +354,115 @@ static const char *const opt_runqueue_str[] = {
>  };
>  static int __read_mostly opt_runqueue = OPT_RUNQUEUE_SOCKET;
>  
> +static unsigned int __read_mostly custom_cpu_runqueue[NR_CPUS];
> +
> +static int parse_custom_runqueue(const char *s)
> +{
> +    unsigned int cpu, rqi = 0;
> +    bool in_subset = false, ret = true;
> +    cpumask_t cpus;
> +
> +    /*
> +     * If we are dealing with format 1 (i.e., [[0,1][3,5]]), first
> +     * and last character must be '[' and ']', respectively.
> +     *
> +     * If we are dealing with format 2 (i.e., 0,1;3,5), first and
> +     * last character must be numbers.
> +     */
> +    if ( *s == '[' && *(s+strlen(s)-1) == ']' )
> +        s++;
> +    else if ( !(isdigit(*s) && isdigit(s[strlen(s)-1])) )
> +        return false;
> +
> +    cpumask_clear(&cpus);
> +    while ( !(*s == ']' && *(s+1) == '\0') && *s != '\0' )
> +    {
> +        /*
> +         * We tolerate only the allowed characters (depending on the
> +         * format). Also, we don't accept empty subsets.
> +         */
> +        if ( *(s+strlen(s)-1) == ']' )
> +        {
> +            /* Format 1 */
> +            if ( !(*s == '[' || *s == ']' || *s == ',' || isdigit(*s)) ||
> +                 (*s == '[' && *(s+1) == ']') )
> +                return false;
> +        }
> +        else
> +        {
> +            /* Format 2 */
> +            if ( !(*s == ';' || *s == ',' || isdigit(*s)) ||
> +                 (*s == ';' && *(s+1) == ';') )
> +                return false;
> +        }
> +
> +        /* Are we at the beginning of a subset, in format 1? */
> +        if ( *s == '[' )
> +        {
> +            /*
> +             * If we are opening a subset, we must have closed all the
> +             * previously defined ones, or the string is malformed.
> +             */
> +            if ( in_subset )
> +                return false;
> +
> +            s++;
> +            in_subset = true;
> +            continue;
> +        }
> +        /* Are we at the end of a subset? */
> +        if ( *s == ']' || *s == ';' )
> +        {
> +            /*
> +             * If we are closing a subset, in format 1, we must have
> +             * opened it before. If not, the string is malformed.
> +             */
> +            if ( *s == ']' && !in_subset )
> +                return false;
> +
> +            s++;
> +            rqi++;
> +            in_subset = false;
> +            continue;
> +        }
> +
> +        /*
> +         * At this point, we must be dealing with either a CPU ID (i.e.,
> +         * a number) or CPU separator, within a subset (i.e., ',').
> +         */
> +        if ( *s == ',' )
> +        {
> +            s++;
> +            continue;
> +        }
> +        cpu = simple_strtoul(s, &s, 10);

Er, sorry -- it looks like this won't handle multi-digit pcpu numbers;
in fact, [01][23][45][67] will parse the same way as [0,1][2,3][4,5][6,7]?

Or am I missing something?

> +
> +        /* Is the cpu ID we found valid? */
> +        if ( cpu >= nr_cpu_ids )
> +            return false;
> +
> +        /*
> +         * CPU IDs are considered only the first time they're found in the
> +         * string considere. Multiple subsequent occurrences are ignored.
> +         */
> +        if ( cpumask_test_cpu(cpu, &cpus) )
> +            continue;
> +
> +        cpumask_set_cpu(cpu, &cpus);
> +        custom_cpu_runqueue[cpu] = rqi;
> +    }
> +
> +    /*
> +     * If in_subset is true, it means we are in format 1, and we
> +     * found a subset that was not closed with its ']', which
> +     * means the string is malformed.
> +     */
> +    if ( in_subset )
> +        return false;
> +
> +    return ret;
> +}

Rather than nitpick the state machine here, I wonder if we could add
some unit tests to make sure that the parser continues to be have as
expected.  We could either do some trickery to allow it to run in user
mode (as the x86_emulate() harness); or we could run some tests during
boot-up when running in DEBUG mode.

Thoughts?

Other than that, looks good.

If you can't pick this up, I can probably make time to do it.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.