[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/3] x86: refactor psr implementation in hypervisor.



>>> On 25.08.16 at 07:22, <yi.y.sun@xxxxxxxxxxxxxxx> wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -23,22 +23,116 @@
>  #define PSR_CAT        (1<<1)
>  #define PSR_CDP        (1<<2)
>  
> -struct psr_cat_cbm {
> -    union {
> -        uint64_t cbm;
> -        struct {
> -            uint64_t code;
> -            uint64_t data;
> -        };
> -    };
> +/*
> + * To add a new PSR feature, please follow below steps:
> + * 1. Implement feature specific callback functions listed in
> + *    "struct feat_ops";
> + * 2. Implement a feature specific "struct feat_ops" object and register
> + *    feature specific callback functions into it;
> + * 3. Delcare feat_list object for the feature and malloc memory for it
> + *    in internal_cpu_prepare(). Correspondingly, free them in
> + *    free_feature();
> + * 4. Add feature initialization codes in internal_cpu_init().
> + */
> +
> +struct feat_list;
> +struct psr_socket_alloc_info;
> +
> +/* Every feature enabled should implement such ops and callback functions. */
> +struct feat_ops {
> +    /*
> +     * init_feature is used in cpu initialization process to do feature
> +     * specific initialization works.
> +     */
> +    void (*init_feature)(unsigned int eax, unsigned int ebx,
> +                         unsigned int ecx, unsigned int edx,
> +                         struct feat_list *pFeat,

Here and below - we don't normally use identifiers with capital letters
in the middle.

> +                         struct psr_socket_alloc_info *info);
> +    /*
> +     * get_old_set_new is used in set value process to get all features'
> +     * COS registers values according to original cos id of the domain.
> +     * Then, assemble them into an mask array as feature list order.

This sentence in particular doesn't make any sense to me. What
follows below also looks like it is in need of improvement.

> +     * Then, set the new feature value into feature corresponding position
> +     * in the array. This function is used to pair with compare_mask.
> +     */
> +    int (*get_old_set_new)(uint64_t *mask,
> +                           struct feat_list *pFeat,
> +                           unsigned int old_cos,
> +                           enum mask_type type,
> +                           uint64_t m);
> +    /*
> +     * compare_mask is used to in set value process to compare if the
> +     * input mask array can match all the features' COS registers values
> +     * according to input cos id.
> +     */
> +    int (*compare_mask)(uint64_t *mask, struct feat_list *pFeat,
> +                        unsigned int cos, bool_t *found);

For a "compare" function I'd expect a word on the meaning of the
return value, and I'd expect most if not all pointer parameters to
be pointers to const.

Also plain bool please.

> +    /*
> +     * get_cos_max_as_type is used to get the cos_max value of the feature
> +     * according to input mask_type.
> +     */
> +    unsigned int (*get_cos_max_as_type)(struct feat_list *pFeat,
> +                                        enum mask_type type);

DYM mean "get_cos_max_from_type()"?

> +    /*
> +     * exceed_range is used to check if the input cos id exceeds the
> +     * feature's cos_max and if the input mask value is not the default one.
> +     * Even if the associated cos exceeds the cos_max, HW can work as default
> +     * value. That is the reason we need check is mask value is default one.
> +     * If both criteria are fulfilled, that means the input exceeds the
> +     * range.
> +     */
> +    unsigned int (*exceed_range)(uint64_t *mask, struct feat_list *pFeat,
> +                                 unsigned int cos);

According to the comment this is kind of a predicate, which seems
unlikely to return an unsigned value. In fact without a word on the
return value I'd expect such to return bool. And I'd also expect the
name to reflect the purpose, i.e. "exceeds_name()". Plus just like
for compare above I wonder whether come or all of the parameters
should be pointers to const (please go over the entire patch and do
constification wherever possible/sensible).

> +    /*
> +     * write_msr is used to write feature specific MSR registers.
> +     */

This is a single line comment.

> +    int (*write_msr)(unsigned int cos, uint64_t *mask,
> +                     struct feat_list *pFeat);

Such write operations don't normally alter the value to be written.
If that's different here, that's perhaps worth a word in the comment.
If it isn't, then I don't see why the address of the value gets passed
instead of the value itself.

Oh, having gone further in the patch I see this is to update multiple
MSRs. I guess this would best be reflected by making the parameter
const uint64_t masks[] (albeit it's also not clear to me why this
necessarily have to be masks kind items, and not just arbitrary
values - proper choice of variable and parameter names helps the
understanding).

> +#define MAX_FEAT_INFO_SIZE 8
> +#define MAX_COS_REG_NUM  128

Are these numbers arbitrary, or based on something?

> +struct psr_socket_alloc_info {

I've yet to see whether the "alloc" in the name really makes sense.

> @@ -58,8 +150,427 @@ static unsigned int __read_mostly opt_cos_max = 255;
>  static uint64_t rmid_mask;
>  static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
>  
> -static struct psr_cat_cbm *temp_cos_to_cbm;
> +static struct psr_ref *temp_cos_ref;
> +/* Every feature has its own object. */
> +struct feat_list *pL3CAT;

static. And a better name please, more in line with the other variable
or similar purpose.

> +/* Common functions for supporting feature callback functions. */
> +static void add_feature(struct feat_list *pHead, struct feat_list *pTmp)
> +{
> +    if ( NULL == pHead || NULL == pTmp )
> +        return;
> +
> +    while ( pHead->pNext )
> +        pHead = pHead->pNext;
> +
> +    pHead->pNext = pTmp;
> +}

Do you really need custom list management here?

> +static void free_feature(struct psr_socket_alloc_info *info)
> +{
> +    struct feat_list *pTmp;
> +    struct feat_list *pPrev;
> +
> +    if ( NULL == info )
> +        return;
> +
> +    if ( NULL == info->pFeat )
> +        return;
> +
> +    pTmp = info->pFeat->pNext;
> +    while ( pTmp )
> +    {
> +        pPrev = pTmp;
> +        pTmp = pTmp->pNext;
> +        clear_bit(pPrev->feature, &(info->features));
> +        xfree(pPrev);
> +        pPrev = NULL;
> +    }
> +}

Why does info->pFeat not get freed here?

> +static void l3_cat_init_feature(unsigned int eax, unsigned int ebx,
> +                                unsigned int ecx, unsigned int edx,
> +                                struct feat_list *pFeat,
> +                                struct psr_socket_alloc_info *info)
> +{
> +    struct psr_cat_lvl_info l3_cat;
> +    unsigned int socket;
> +    uint64_t val;
> +
> +    if ( MAX_FEAT_INFO_SIZE < sizeof(struct psr_cat_lvl_info) )
> +        return;

DYM BUILD_BUG_ON()? Also the MAX_FEAT_INFO_SIZE dimensioned
array is one with unsigned int as the base type, which doesn't get
reflected here.

> +    /* No valid value so do not enable feature. */
> +    if ( 0 == eax || 0 == edx )
> +        return;
> +
> +    l3_cat.cbm_len = (eax & 0x1f) + 1;
> +    l3_cat.cos_max = min(opt_cos_max, edx & 0xffff);
>  
> +    /* cos=0 is reserved as default cbm(all ones). */
> +    pFeat->cos_reg_val[0] = (1ull << l3_cat.cbm_len) - 1;
> +
> +    pFeat->feature = PSR_SOCKET_L3_CAT;
> +    set_bit(PSR_SOCKET_L3_CAT, &(info->features));

Does this need to be set_bit(), or would __set_bit() suffice? Also
there are many cases of this strange &() construct - the
parentheses are pointless there.

> +    if ( (ecx & PSR_CAT_CDP_CAPABILITY) && (opt_psr & PSR_CDP) &&
> +         !test_bit(PSR_SOCKET_L3_CDP, &(info->features)) )
> +    {
> +        /* Data */
> +        pFeat->cos_reg_val[0] = (1ull << l3_cat.cbm_len) - 1;
> +        /* Code */
> +        pFeat->cos_reg_val[1] = (1ull << l3_cat.cbm_len) - 1;
> +
> +        /* We only write mask1 since mask0 is always all ones by default. */
> +        wrmsrl(MSR_IA32_PSR_L3_MASK(1),
> +               (1ull << l3_cat.cbm_len) - 1);
> +        rdmsrl(MSR_IA32_PSR_L3_QOS_CFG, val);
> +        wrmsrl(MSR_IA32_PSR_L3_QOS_CFG, val | (1 << 
> PSR_L3_QOS_CDP_ENABLE_BIT));
> +
> +        /* Cut half of cos_max when CDP is enabled. */
> +        l3_cat.cos_max >>= 1;
> +
> +        pFeat->feature = PSR_SOCKET_L3_CDP;
> +        set_bit(PSR_SOCKET_L3_CDP, &(info->features));
> +        clear_bit(PSR_SOCKET_L3_CAT, &(info->features));

A few lines up you set this bit, and now you clear it again. Please
instead set it in an else branch below.

> +    }
> +
> +    memcpy(pFeat->feat_info, &l3_cat, sizeof(struct psr_cat_lvl_info));

Aforementioned BUILD_BUG_ON() would probably bets be placed
right ahead of this memcpy().

> +static int l3_cat_compare_mask(uint64_t *mask, struct feat_list *pFeat,
> +                               unsigned int cos, bool_t *found)
> +{
> +    struct psr_cat_lvl_info cat_info;
> +    uint64_t l3_def_cbm;
> +
> +    memcpy(&cat_info, pFeat->feat_info, sizeof(struct psr_cat_lvl_info));

Already here I think this memcpy()ing gets unwieldy. Can't you
simply make the structure field a union of all types of interest?

> +    l3_def_cbm = (1ull << cat_info.cbm_len) - 1;
> +
> +    /* CDP */
> +    if ( pFeat->feature == PSR_SOCKET_L3_CDP )
> +    {
> +        if ( cos > cat_info.cos_max )
> +        {
> +            if ( mask[0] != l3_def_cbm ||
> +                 mask[1] != l3_def_cbm )
> +            {
> +                *found = 0;

false and true for literal boolean values please.

> +                printk(XENLOG_ERR "CDP exceed cos max.\n");

I haven't figured yet at what times this function will get called, but
I suspect it's inappropriate for it to log messages.

> +                return -ENOENT;
> +            }
> +            *found = 1;
> +            return 2;

Who or what is 2?

> +        }
> +
> +        if ( mask[0] == pFeat->cos_reg_val[cos * 2] &&
> +             mask[1] == pFeat->cos_reg_val[cos * 2 + 1])
> +            *found = 1;
> +        else
> +            *found = 0;
> +
> +        return 2;
> +    }
> +
> +    /* CAT */
> +    if ( cos > cat_info.cos_max )
> +    {
> +        if ( mask[0] != l3_def_cbm )
> +        {
> +            *found = 0;
> +            printk(XENLOG_ERR "CAT exceed cos max.\n");
> +            return -ENOENT;
> +        }
> +        *found = 1;
> +        return 1;
> +    }
> +
> +    if ( mask[0] == pFeat->cos_reg_val[cos] )
> +        *found = 1;
> +    else
> +        *found = 0;

Please use a simple assignment in cases like this, instead of if/else.

> +static unsigned int l3_cat_exceed_range(uint64_t *mask, struct feat_list 
> *pFeat,
> +                                        unsigned int cos)
> +{
> +    struct psr_cat_lvl_info cat_info;
> +    uint64_t l3_def_cbm;
> +
> +    memcpy(&cat_info, pFeat->feat_info, sizeof(struct psr_cat_lvl_info));
> +    l3_def_cbm = (1ull << cat_info.cbm_len) - 1;
> +
> +    /* CDP */
> +    if ( pFeat->feature == PSR_SOCKET_L3_CDP )
> +    {
> +        if ( cos > cat_info.cos_max )
> +            if ( mask[0] != l3_def_cbm ||
> +                 mask[1] != l3_def_cbm )

Please combine such if()s.

> +                /*
> +                 * Exceed cos_max and value to set is not default,
> +                 * return error.
> +                 */
> +                return 0;
> +
> +        return 2;

Same question as above - who or what is 2?

> +static int l3_cat_get_old_set_new(uint64_t *mask,
> +                                  struct feat_list *pFeat,
> +                                  unsigned int old_cos,
> +                                  enum mask_type type,
> +                                  uint64_t m)
> +{
> +    struct psr_cat_lvl_info cat_info;
> +
> +    memcpy(&cat_info, pFeat->feat_info, sizeof(struct psr_cat_lvl_info));
> +
> +    if ( type == PSR_MASK_TYPE_L3_DATA ||
> +         type == PSR_MASK_TYPE_L3_CODE ||
> +         type == PSR_MASK_TYPE_L3_CBM )
> +    {
> +        if ( !psr_check_cbm(cat_info.cbm_len, m) )
> +            return -EINVAL;
> +    }
> +
> +    if ( ( type == PSR_MASK_TYPE_L3_DATA ||
> +         type == PSR_MASK_TYPE_L3_CODE ) &&
> +         pFeat->feature != PSR_SOCKET_L3_CDP )
> +        return -ENXIO;

Please either combine the two if()s into a single switch() (with
appropriate fall through behavior) or, perhaps better, move
this second if() - suitably shrunk - past the following one. This is
to avoid redundant code, which makes things more difficult to
follow.

> +    /* CDP */
> +    if ( pFeat->feature == PSR_SOCKET_L3_CDP )
> +    {
> +        if ( old_cos > cat_info.cos_max )
> +        {
> +            /* Data */
> +            mask[0] = pFeat->cos_reg_val[0];
> +            /* Code */
> +            mask[1] = pFeat->cos_reg_val[1];
> +        }
> +        else
> +        {
> +            /* Data */
> +            mask[0] = pFeat->cos_reg_val[old_cos * 2];
> +            /* Code */
> +            mask[1] = pFeat->cos_reg_val[old_cos * 2 + 1];
> +        }
> +
> +        /* Set new mask */
> +        if ( type == PSR_MASK_TYPE_L3_DATA )
> +            mask[0] = m;
> +        if ( type == PSR_MASK_TYPE_L3_CODE )
> +            mask[1] = m;
> +
> +        return 2;
> +    }
> +
> +    /* CAT */
> +    if ( old_cos > cat_info.cos_max )
> +        mask[0] =  pFeat->cos_reg_val[0];
> +    else
> +        mask[0] =  pFeat->cos_reg_val[old_cos];

Across the entire function - wouldn't it be easier to range check
old_cos once up front, forcing it to zero if the check fails? That
would reduce code size to a better readable amount. (And just to
avoid any misunderstanding - comments like this also may apply
to other functions; I'm not going to repeat them there.)

Also note that there are undue double blanks in here.

> +    if ( type == PSR_MASK_TYPE_L3_CBM )
> +        mask[0] = m;

This overwriting behavior also looks quite strange to me. What's
the purpose? And if this really is meant to be that way, why is
this not (leaving aside the other suggested adjustment)

    if ( type == PSR_MASK_TYPE_L3_CBM )
        mask[0] = m;
    else if ( old_cos > cat_info.cos_max )
        mask[0] = pFeat->cos_reg_val[0];
    else
        mask[0] = pFeat->cos_reg_val[old_cos];

?

> +static int l3_cat_get_feat_info(struct feat_list *pFeat, enum mask_type type,
> +                                uint32_t *dat0, uint32_t *dat1,
> +                                uint32_t *dat2)
> +{
> +    struct psr_cat_lvl_info cat_info;
> +
> +    if ( type != PSR_MASK_TYPE_L3_DATA &&
> +         type != PSR_MASK_TYPE_L3_CODE &&
> +         type != PSR_MASK_TYPE_L3_CBM )
> +        return 0;
> +
> +    memcpy(&cat_info, pFeat->feat_info, sizeof(struct psr_cat_lvl_info));
> +
> +    *dat0 = cat_info.cbm_len;
> +    *dat1 = cat_info.cos_max;
> +
> +    if ( pFeat->feature == PSR_SOCKET_L3_CDP )
> +        *dat2 |= XEN_SYSCTL_PSR_CAT_L3_CDP;
> +    else
> +        *dat2 = 0;
> +
> +    return 1;
> +}

The latest here it becomes clear that you want to use switch()
much more widely.

>  static inline void psr_assoc_init(void)
>  {
>      struct psr_assoc *psra = &this_cpu(psr_assoc);
> +    struct psr_socket_alloc_info *info;
> +    unsigned int cos_max;
> +    unsigned int socket;
>  
> -    if ( cat_socket_info )
> +    if ( socket_alloc_info )
>      {
> -        unsigned int socket = cpu_to_socket(smp_processor_id());
> +        socket = cpu_to_socket(smp_processor_id());

What's the point of moving the variable declaration? Quite the
other way around, the new variable you add could - afaict - all
live in this more narrow scope. And info could presumably become
a pointer to const.

> -int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
> -                        uint32_t *cos_max, uint32_t *flags)
> +int psr_get_info(unsigned int socket, enum mask_type type,
> +                 uint32_t *dat0, uint32_t *dat1,
> +                 uint32_t *dat2)
>  {
> -    struct psr_cat_socket_info *info = get_cat_socket_info(socket);
> +    struct psr_socket_alloc_info *info = get_socket_alloc_info(socket);
> +    struct feat_list *pTmp;
>  
>      if ( IS_ERR(info) )
>          return PTR_ERR(info);
>  
> -    *cbm_len = info->cbm_len;
> -    *cos_max = info->cos_max;
> +    if ( NULL == info->pFeat || NULL == info->pFeat->pNext)
> +       return -ENODEV;
>  
> -    *flags = 0;
> -    if ( cdp_is_enabled(socket) )
> -        *flags |= XEN_SYSCTL_PSR_CAT_L3_CDP;
> +    pTmp = info->pFeat->pNext;
> +    while ( pTmp )
> +    {
> +        if ( pTmp->ops.get_feat_info(pTmp, type, dat0, dat1, dat2) )

Considering that this is one of the generic ops: What guarantees
that 3 elements will suffice going forward? I.e. why is this not an
array?

> -int psr_get_l3_cbm(struct domain *d, unsigned int socket,
> -                   uint64_t *cbm, enum cbm_type type)
> +static int get_old_set_new(uint64_t *mask,
> +                           struct psr_socket_alloc_info *info,
> +                           unsigned int old_cos,
> +                           enum mask_type type,
> +                           uint64_t m)
>  {
> -    struct psr_cat_socket_info *info = get_cat_socket_info(socket);
> -    bool_t cdp_enabled = cdp_is_enabled(socket);
> -
> -    if ( IS_ERR(info) )
> -        return PTR_ERR(info);
> +    struct feat_list *pTmp;
> +    int ret;
>  
> -    switch ( type )
> -    {
> -    case PSR_CBM_TYPE_L3:
> -        if ( cdp_enabled )
> -            return -EXDEV;
> -        *cbm = info->cos_to_cbm[d->arch.psr_cos_ids[socket]].cbm;
> -        break;
> +    if ( !mask )
> +        return -EINVAL;
>  
> -    case PSR_CBM_TYPE_L3_CODE:
> -        if ( !cdp_enabled )
> -            *cbm = info->cos_to_cbm[d->arch.psr_cos_ids[socket]].cbm;
> -        else
> -            *cbm = info->cos_to_cbm[d->arch.psr_cos_ids[socket]].code;
> -        break;
> +    if ( !info || !info->pFeat )
> +        return -ENODEV;

Earlier on you used (slightly oddly arranged) checks against NULL.
Please be consistent (and I'd prefer if you used ! across the board).

> -    case PSR_CBM_TYPE_L3_DATA:
> -        if ( !cdp_enabled )
> -            *cbm = info->cos_to_cbm[d->arch.psr_cos_ids[socket]].cbm;
> -        else
> -            *cbm = info->cos_to_cbm[d->arch.psr_cos_ids[socket]].data;
> -        break;
> +    pTmp = info->pFeat->pNext;
> +    while ( pTmp )
> +    {
> +        /* mask getting and setting order is same as feature list */
> +        ret = pTmp->ops.get_old_set_new(mask, pTmp, old_cos, type, m);
> +        if ( ret < 0 )
> +            return ret;
>  
> -    default:
> -        ASSERT_UNREACHABLE();
> +        mask += ret;

With this - what ensures the array isn't going to be overrun?

And I'm also having some difficulty seeing how an array of outputs
matches up with exactly one input (m). Can you explain this please?

> -static bool_t psr_check_cbm(unsigned int cbm_len, uint64_t cbm)
> +int psr_get_val(struct domain *d, unsigned int socket,
> +                uint64_t *val, enum mask_type type)
>  {
> -    unsigned int first_bit, zero_bit;
> -
> -    /* Set bits should only in the range of [0, cbm_len). */
> -    if ( cbm & (~0ull << cbm_len) )
> -        return 0;
> +    struct psr_socket_alloc_info *info = get_socket_alloc_info(socket);
> +    unsigned int cos = d->arch.psr_cos_ids[socket];
> +    struct feat_list *pTmp;
> +    int ret;
>  
> -    /* At least one bit need to be set. */
> -    if ( cbm == 0 )
> -        return 0;
> +    if ( IS_ERR(info) )
> +        return PTR_ERR(info);
>  
> -    first_bit = find_first_bit(&cbm, cbm_len);
> -    zero_bit = find_next_zero_bit(&cbm, cbm_len, first_bit);
> +    if ( NULL == info->pFeat || NULL == info->pFeat->pNext )
> +        return -ENODEV;
>  
> -    /* Set bits should be contiguous. */
> -    if ( zero_bit < cbm_len &&
> -         find_next_bit(&cbm, cbm_len, zero_bit) < cbm_len )
> -        return 0;
> +    pTmp = info->pFeat->pNext;
> +    while ( pTmp )
> +    {
> +        ret = pTmp->ops.get_val(pTmp, cos, type, val);
> +        if ( ret < 0 )
> +            return -EINVAL;

Why don't you return "ret" here?

> +        else if ( ret > 0 )

Pointless "else".

> -static void do_write_l3_cbm(void *data)
> +static void do_write_psr_ref(void *data)

What does "_ref" stand for?

> -static int find_cos(struct psr_cat_cbm *map, unsigned int cos_max,
> -                    uint64_t cbm_code, uint64_t cbm_data, bool_t cdp_enabled)
> +static int find_cos(uint64_t *mask, enum mask_type type,
> +                    struct psr_socket_alloc_info *info)
>  {
>      unsigned int cos;
> +    struct psr_ref *map = info->cos_ref;
> +    struct feat_list *pTmp;
> +    uint64_t *pMask = mask;
> +    int ret;
> +    bool_t found = 0;
> +    unsigned int cos_max = 0;
> +
> +    /* cos_max is the max COS of the feature to be set. */
> +    pTmp = info->pFeat->pNext;
> +    while ( pTmp )
> +    {
> +        cos_max = pTmp->ops.get_cos_max_as_type(pTmp, type);
> +        if ( cos_max > 0 )
> +            break;
> +
> +        pTmp = pTmp->pNext;
> +    }
>  
> +    pTmp = info->pFeat->pNext;
>      for ( cos = 0; cos <= cos_max; cos++ )
>      {
> -        if ( (map[cos].ref || cos == 0) &&
> -             ((!cdp_enabled && map[cos].cbm == cbm_code) ||
> -              (cdp_enabled && map[cos].code == cbm_code &&
> -                              map[cos].data == cbm_data)) )
> +        if ( 0 != cos && 0 == map[cos].ref )
> +            continue;
> +
> +        while ( pTmp )
> +        {
> +            /*
> +             * Compare mask according to feature list order.
> +             * We must follow this order because mask is assembled
> +             * as this order in get_old_set_new().
> +             */
> +            ret = pTmp->ops.compare_mask(pMask, pTmp, cos, &found);
> +            if ( ret < 0 )
> +                return ret;
> +
> +            /* If fail to match, go to next cos to compare. */
> +            if ( !found )
> +                break;
> +
> +            pMask += ret;
> +            pTmp = pTmp->pNext;
> +        }
> +
> +        /* Every feature can match, this cos is what we find. */
> +        if ( found )
>              return cos;
> +
> +        /* Not found, need find again from beginning. */
> +        pTmp = info->pFeat->pNext;
> +        pMask = mask;

Please do once (at the appropriate place inside the outer loop) what
you have done prior to the loop and repeat here. As already alluded
to - limiting the scope of variables also helps ensuring that things
work as intended.

>      }
>  
> +    printk(XENLOG_INFO "%s: cannot find cos.\n", __func__);

Again a questionable log message with a questionable use of __func__.
In any case log messages should normally not have a full stop at their
end.

> -static int pick_avail_cos(struct psr_cat_cbm *map, unsigned int cos_max,
> -                          unsigned int old_cos)
> +static int check_exceed_range(uint64_t *mask, struct feat_list *pTmp,
> +                              unsigned int cos)
>  {
> +    unsigned int ret = 0;

Pointless initializer.

> +static int alloc_new_cos(struct psr_socket_alloc_info *info,
> +                         uint64_t *mask, unsigned int old_cos,
> +                         enum mask_type type)
> +{
> +    unsigned int cos_max = 0;
> +    struct feat_list *pTmp;
>      unsigned int cos;
> +    struct psr_ref *map = info->cos_ref;
> +
> +    /*
> +     * cos_max is the max COS of the feature to be set.
> +     */
> +    pTmp = info->pFeat->pNext;
> +    while ( pTmp )
> +    {
> +        cos_max = pTmp->ops.get_cos_max_as_type(pTmp, type);
> +        if ( cos_max > 0 )
> +            break;
> +
> +        pTmp = pTmp->pNext;
> +    }
> +
> +    if ( !cos_max )
> +        return -ENOENT;
>  
>      /* If old cos is referred only by the domain, then use it. */
>      if ( map[old_cos].ref == 1 && old_cos != 0 )
> +    {
> +        pTmp = info->pFeat->pNext;
> +        if ( check_exceed_range(mask, pTmp, old_cos) )
> +            goto find_new;
> +
>          return old_cos;
> +    }
>  
> +find_new:

Labels indented by at least one blank please. In this case, though,
it looks like you could easily do without any label and goto.

> -int psr_set_l3_cbm(struct domain *d, unsigned int socket,
> -                   uint64_t cbm, enum cbm_type type)
> +int psr_set_val(struct domain *d, unsigned int socket,
> +                uint64_t val, enum mask_type type)
>  {
> -    unsigned int old_cos, cos_max;
> +    unsigned int old_cos;
>      int cos, ret;
> -    uint64_t cbm_data, cbm_code;
> -    bool_t cdp_enabled = cdp_is_enabled(socket);
> -    struct psr_cat_cbm *map;
> -    struct psr_cat_socket_info *info = get_cat_socket_info(socket);
> +    struct psr_ref *map;
> +    uint64_t *mask;
> +    struct psr_socket_alloc_info *info = get_socket_alloc_info(socket);
>  
>      if ( IS_ERR(info) )
>          return PTR_ERR(info);
>  
> -    if ( !psr_check_cbm(info->cbm_len, cbm) )
> -        return -EINVAL;
> -
> -    if ( !cdp_enabled && (type == PSR_CBM_TYPE_L3_CODE ||
> -                          type == PSR_CBM_TYPE_L3_DATA) )
> -        return -ENXIO;
> -
> -    cos_max = info->cos_max;
> +    map = info->cos_ref;
>      old_cos = d->arch.psr_cos_ids[socket];
> -    map = info->cos_to_cbm;
> -
> -    switch ( type )
> -    {
> -    case PSR_CBM_TYPE_L3:
> -        cbm_code = cbm;
> -        cbm_data = cbm;
> -        break;
> -
> -    case PSR_CBM_TYPE_L3_CODE:
> -        cbm_code = cbm;
> -        cbm_data = map[old_cos].data;
> -        break;
> -
> -    case PSR_CBM_TYPE_L3_DATA:
> -        cbm_code = map[old_cos].code;
> -        cbm_data = cbm;
> -        break;
>  
> -    default:
> -        ASSERT_UNREACHABLE();
> -        return -EINVAL;
> -    }
> +    /* Considering CDP liking features */

DYM "like"? And what do you want to say with this comment?

> +    mask = xzalloc_array(uint64_t, info->nr_feat * 2);

Perhaps it's related to the doubling of nr_feat here? If so - what
precludes another future feature to require 3 or 4 slots per element?
I think this needs abstracting out, now that you create an abstract
framework.

> +    if ( (ret = get_old_set_new(mask, info, old_cos, type, val)) != 0 )

What if the xzalloc_array() returned NULL?

> +        return ret;
>  
> -    spin_lock(&info->cbm_lock);
> -    cos = find_cos(map, cos_max, cbm_code, cbm_data, cdp_enabled);
> +    spin_lock(&info->mask_lock);
> +    cos = find_cos(mask, type, info);
>      if ( cos >= 0 )
>      {
>          if ( cos == old_cos )
>          {
> -            spin_unlock(&info->cbm_lock);
> +            spin_unlock(&info->mask_lock);
> +            xfree(mask);
>              return 0;
>          }
>      }
>      else
>      {
> -        cos = pick_avail_cos(map, cos_max, old_cos);
> +        cos = alloc_new_cos(info, mask, old_cos, type);
>          if ( cos < 0 )
>          {
> -            spin_unlock(&info->cbm_lock);
> +            spin_unlock(&info->mask_lock);
> +            xfree(mask);
>              return cos;
>          }
>  
> -        /* We try to avoid writing MSR. */
> -        if ( (cdp_enabled &&
> -             (map[cos].code != cbm_code || map[cos].data != cbm_data)) ||
> -             (!cdp_enabled && map[cos].cbm != cbm_code) )
> +        /* Write all features mask MSRs corresponding to the COS */
> +        ret = write_psr_ref(socket, cos, mask);
> +        if ( ret )
>          {
> -            ret = write_l3_cbm(socket, cos, cbm_code, cbm_data, cdp_enabled);
> -            if ( ret )
> -            {
> -                spin_unlock(&info->cbm_lock);
> -                return ret;
> -            }
> -            map[cos].code = cbm_code;
> -            map[cos].data = cbm_data;
> +            spin_unlock(&info->mask_lock);
> +            xfree(mask);
> +            return ret;
>          }
>      }
>  
>      map[cos].ref++;
>      map[old_cos].ref--;
> -    spin_unlock(&info->cbm_lock);
> +
> +    spin_unlock(&info->mask_lock);
>  
>      d->arch.psr_cos_ids[socket] = cos;
> +    xfree(mask);
> +    mask = NULL;
>  
>      return 0;

I don't think setting mask to NULL right ahead of a return statement
is very helpful.

> -static void cat_cpu_init(void)
> +static void internal_cpu_init(void)

Is "internal_" really a useful prefix here?

>  {
>      unsigned int eax, ebx, ecx, edx;
> -    struct psr_cat_socket_info *info;
> +    struct psr_socket_alloc_info *info;
>      unsigned int socket;
>      unsigned int cpu = smp_processor_id();
> -    uint64_t val;
>      const struct cpuinfo_x86 *c = cpu_data + cpu;
> +    struct feat_list *pTmp;
>  
>      if ( !cpu_has(c, X86_FEATURE_PQE) || c->cpuid_level < 
> PSR_CPUID_LEVEL_CAT )
>          return;
>  
>      socket = cpu_to_socket(cpu);
> -    if ( test_bit(socket, cat_socket_enable) )
> +    info = socket_alloc_info + socket;
> +    if ( info->features )
>          return;
>  
>      cpuid_count(PSR_CPUID_LEVEL_CAT, 0, &eax, &ebx, &ecx, &edx);
>      if ( ebx & PSR_RESOURCE_TYPE_L3 )
>      {
> -        cpuid_count(PSR_CPUID_LEVEL_CAT, 1, &eax, &ebx, &ecx, &edx);
> -        info = cat_socket_info + socket;
> -        info->cbm_len = (eax & 0x1f) + 1;
> -        info->cos_max = min(opt_cos_max, edx & 0xffff);
> -
> -        info->cos_to_cbm = temp_cos_to_cbm;
> -        temp_cos_to_cbm = NULL;
> -        /* cos=0 is reserved as default cbm(all ones). */
> -        info->cos_to_cbm[0].cbm = (1ull << info->cbm_len) - 1;
> +        pTmp = pL3CAT;
> +        if ( !pTmp )
> +            return;

You check the allocation result in internal_cpu_prepare() - can you
really get here with this being NULL? If you can't, drop the if() (or
make it an ASSERT()).

> +        pL3CAT = NULL;
>  
> -        spin_lock_init(&info->cbm_lock);
> -
> -        set_bit(socket, cat_socket_enable);
> -
> -        if ( (ecx & PSR_CAT_CDP_CAPABILITY) && (opt_psr & PSR_CDP) &&
> -             cdp_socket_enable && !test_bit(socket, cdp_socket_enable) )
> -        {
> -            info->cos_to_cbm[0].code = (1ull << info->cbm_len) - 1;
> -            info->cos_to_cbm[0].data = (1ull << info->cbm_len) - 1;
> -
> -            /* We only write mask1 since mask0 is always all ones by 
> default. */
> -            wrmsrl(MSR_IA32_PSR_L3_MASK(1), (1ull << info->cbm_len) - 1);
> -
> -            rdmsrl(MSR_IA32_PSR_L3_QOS_CFG, val);
> -            wrmsrl(MSR_IA32_PSR_L3_QOS_CFG, val | (1 << 
> PSR_L3_QOS_CDP_ENABLE_BIT));
> -
> -            /* Cut half of cos_max when CDP is enabled. */
> -            info->cos_max >>= 1;
> -
> -            set_bit(socket, cdp_socket_enable);
> -        }
> -        printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, 
> cbm_len:%u, CDP:%s\n",
> -               socket, info->cos_max, info->cbm_len,
> -               cdp_is_enabled(socket) ? "on" : "off");
> +        /* Initialize this feature according to CPUID. */
> +        cpuid_count(PSR_CPUID_LEVEL_CAT, 1, &eax, &ebx, &ecx, &edx);
> +        pTmp->ops = l3_cat_ops;
> +        pTmp->ops.init_feature(eax, ebx, ecx, edx, pTmp, info);

This function can return without having consumed pTmp, which means
you'd leak the memory in that case.

> -static void __init psr_cat_free(void)
> +static void __init psr_free(void)
>  {
> -    xfree(cat_socket_enable);
> -    cat_socket_enable = NULL;
> -    xfree(cat_socket_info);
> -    cat_socket_info = NULL;
> +    int i;

unsigned int

> -static void __init init_psr_cat(void)
> +static void __init init_psr(void)
>  {
> +    int i;

Again.

> +
>      if ( opt_cos_max < 1 )
>      {
>          printk(XENLOG_INFO "CAT: disabled, cos_max is too small\n");
>          return;
>      }
>  
> -    cat_socket_enable = xzalloc_array(unsigned long, 
> BITS_TO_LONGS(nr_sockets));
> -    cat_socket_info = xzalloc_array(struct psr_cat_socket_info, nr_sockets);
> -    cdp_socket_enable = xzalloc_array(unsigned long, 
> BITS_TO_LONGS(nr_sockets));
> +    socket_alloc_info = xzalloc_array(struct psr_socket_alloc_info, 
> nr_sockets);
> +
> +    if ( !socket_alloc_info )
> +    {
> +        printk(XENLOG_INFO "Fail to alloc socket_alloc_info!\n");
> +        return;
> +    }
>  
> -    if ( !cat_socket_enable || !cat_socket_info )
> -        psr_cat_free();
> +    for ( i = 0; i < nr_sockets; i++ )
> +    {
> +        socket_alloc_info[i].pFeat =  xzalloc(struct feat_list);

What's the point of this dummy list element?

> +        if ( NULL == socket_alloc_info[i].pFeat )
> +        {
> +            printk(XENLOG_INFO "Fail to alloc pFeat!\n");

How would the reader of the log easily associate such a message
with something having gone wrong in PSR initialization?

> +            return;
> +        }
> +        socket_alloc_info[i].pFeat->pNext = NULL;

This is redundant with the use of xzalloc().

>  static void psr_cpu_init(void)
>  {
> -    if ( cat_socket_info )
> -        cat_cpu_init();
> +    unsigned int socket;
> +    struct psr_socket_alloc_info *info;
> +
> +    if ( socket_alloc_info )
> +    {
> +        socket = cpu_to_socket(smp_processor_id());
> +        info = socket_alloc_info + socket;
> +        info->cos_ref = temp_cos_ref;
> +        temp_cos_ref = NULL;
> +        spin_lock_init(&info->mask_lock);
> +
> +        internal_cpu_init();

Why is everything ahead of this call not part of that function?

> --- a/xen/include/asm-x86/psr.h
> +++ b/xen/include/asm-x86/psr.h
> @@ -46,10 +46,10 @@ struct psr_cmt {
>      struct psr_cmt_l3 l3;
>  };
>  
> -enum cbm_type {
> -    PSR_CBM_TYPE_L3,
> -    PSR_CBM_TYPE_L3_CODE,
> -    PSR_CBM_TYPE_L3_DATA,
> +enum mask_type {
> +    PSR_MASK_TYPE_L3_CBM,
> +    PSR_MASK_TYPE_L3_CODE,
> +    PSR_MASK_TYPE_L3_DATA,
>  };

The enum tag is now too generic a name - please make this
psr_mask_type or something similar.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.