
Re: [PATCH v4 7/8] pdx: introduce a new compression algorithm based on region offsets



On Tue, Aug 05, 2025 at 02:28:22PM +0200, Jan Beulich wrote:
> On 05.08.2025 11:52, Roger Pau Monne wrote:
> > --- a/xen/common/pdx.c
> > +++ b/xen/common/pdx.c
> > @@ -24,6 +24,7 @@
> >  #include <xen/param.h>
> >  #include <xen/pfn.h>
> >  #include <xen/sections.h>
> > +#include <xen/sort.h>
> >  
> >  /**
> >   * Maximum (non-inclusive) usable pdx. Must be
> > @@ -40,6 +41,12 @@ bool __mfn_valid(unsigned long mfn)
> >  
> >  #ifdef CONFIG_PDX_MASK_COMPRESSION
> >      invalid |= mfn & pfn_hole_mask;
> > +#elif defined(CONFIG_PDX_OFFSET_COMPRESSION)
> > +{
> > +    unsigned long base = pfn_bases[PFN_TBL_IDX(mfn)];
> > +
> > +    invalid |= mfn < base || mfn >= base + pdx_region_size;
> > +}
> >  #endif
> 
> Hmm, didn't notice this earlier on: Brace placement looks odd here. I think
> they want to be indented by one level, as they aren't starting a function
> body.

Right, I can adjust.  Since the braces are inside the #ifdef block the
placement looked kind of OK to me, and it avoided having to indent the
content one extra level.

> > @@ -294,7 +308,245 @@ void __init pfn_pdx_compression_reset(void)
> >      nr_ranges = 0;
> >  }
> >  
> > -#endif /* CONFIG_PDX_COMPRESSION */
> > +#elif defined(CONFIG_PDX_OFFSET_COMPRESSION) /* CONFIG_PDX_MASK_COMPRESSION */
> > +
> > +unsigned int __ro_after_init pfn_index_shift;
> > +unsigned int __ro_after_init pdx_index_shift;
> > +
> > +unsigned long __ro_after_init pfn_pdx_lookup[CONFIG_PDX_NR_LOOKUP];
> > +unsigned long __ro_after_init pdx_pfn_lookup[CONFIG_PDX_NR_LOOKUP];
> > +unsigned long __ro_after_init pfn_bases[CONFIG_PDX_NR_LOOKUP];
> > +unsigned long __ro_after_init pdx_region_size = ~0UL;
> 
> For cache locality, might this last one better also move ahead of the arrays?

Oh, yes, this was a late addition and I clearly didn't pay enough
attention when adding it.
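
I.e. presumably just moving the declaration ahead of the arrays,
something along these lines (sketch):

unsigned int __ro_after_init pfn_index_shift;
unsigned int __ro_after_init pdx_index_shift;

unsigned long __ro_after_init pdx_region_size = ~0UL;

unsigned long __ro_after_init pfn_pdx_lookup[CONFIG_PDX_NR_LOOKUP];
unsigned long __ro_after_init pdx_pfn_lookup[CONFIG_PDX_NR_LOOKUP];
unsigned long __ro_after_init pfn_bases[CONFIG_PDX_NR_LOOKUP];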

> > +bool pdx_is_region_compressible(paddr_t base, unsigned long npages)
> > +{
> > +    unsigned long pfn = PFN_DOWN(base);
> > +    unsigned long pfn_base = pfn_bases[PFN_TBL_IDX(pfn)];
> > +
> > +    return pfn >= pfn_base &&
> > +           pfn + npages <= pfn_base + pdx_region_size;
> > +}
> > +
> > +static int __init cf_check cmp_node(const void *a, const void *b)
> > +{
> > +    const struct pfn_range *l = a;
> > +    const struct pfn_range *r = b;
> > +
> > +    if ( l->base_pfn > r->base_pfn )
> > +        return 1;
> > +    if ( l->base_pfn < r->base_pfn )
> > +        return -1;
> > +
> > +    return 0;
> > +}
> > +
> > +static void __init cf_check swp_node(void *a, void *b)
> > +{
> > +    SWAP(a, b);
> > +}
> 
> This hasn't changed from v3, and still looks wrong to me.

Oh, I do recall a comment to that effect, but somehow forgot to apply
it.  Sorry, I've now fixed it.
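
(For reference: SWAP(a, b) only exchanges the two local pointer
variables, so the array elements are never actually moved.  The fix
presumably dereferences them, something along these lines - untested
sketch:)

static void __init cf_check swp_node(void *a, void *b)
{
    /* Swap the pointed-to elements, not the local pointers. */
    SWAP(*(struct pfn_range *)a, *(struct pfn_range *)b);
}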

> > +bool __init pfn_pdx_compression_setup(paddr_t base)
> > +{
> > +    unsigned long mask = PFN_DOWN(pdx_init_mask(base)), idx_mask = 0;
> > +    unsigned long pages = 0;
> > +    unsigned int i;
> > +
> > +    if ( !nr_ranges )
> > +    {
> > +        printk(XENLOG_DEBUG "PFN compression disabled%s\n",
> > +               pdx_compress ? ": no ranges provided" : "");
> > +        return false;
> > +    }
> > +
> > +    if ( nr_ranges > ARRAY_SIZE(ranges) )
> > +    {
> > +        printk(XENLOG_WARNING
> > +               "Too many PFN ranges (%u > %zu), not attempting PFN 
> > compression\n",
> > +               nr_ranges, ARRAY_SIZE(ranges));
> > +        return false;
> > +    }
> > +
> > +    /* Sort ranges by start address. */
> > +    sort(ranges, nr_ranges, sizeof(*ranges), cmp_node, swp_node);
> > +
> > +    for ( i = 0; i < nr_ranges; i++ )
> > +    {
> > +        unsigned long start = ranges[i].base_pfn;
> > +
> > +        /*
> > +         * Align range base to MAX_ORDER.  This is required so the PDX offset
> > +         * for the bits below MAX_ORDER matches the MFN offset, and pages
> > +         * greater than the minimal order can be used to populate the
> > +         * directmap.
> > +         */
> > +        ranges[i].base_pfn = start & ~((1UL << MAX_ORDER) - 1);
> > +        ranges[i].pages = start + ranges[i].pages - ranges[i].base_pfn;
> > +
> > +        /*
> > +         * Only merge overlapped regions now, leave adjacent regions separated.
> > +         * They would be merged later if both use the same index into the
> > +         * lookup table.
> > +         */
> > +        if ( !i ||
> > +             ranges[i].base_pfn >=
> > +             (ranges[i - 1].base_pfn + ranges[i - 1].pages) )
> > +        {
> > +            mask |= pdx_region_mask(ranges[i].base_pfn, ranges[i].pages);
> > +            continue;
> > +        }
> > +
> > +        ranges[i - 1].pages = ranges[i].base_pfn + ranges[i].pages -
> > +                              ranges[i - 1].base_pfn;
> > +
> > +        if ( i + 1 < nr_ranges )
> > +            memmove(&ranges[i], &ranges[i + 1],
> > +                    (nr_ranges - (i + 1)) * sizeof(ranges[0]));
> > +        else /* last range */
> > +            mask |= pdx_region_mask(ranges[i].base_pfn, ranges[i].pages);
> > +        nr_ranges--;
> > +        i--;
> > +    }
> > +
> > +    /*
> > +     * Populate a mask with the non-equal bits of the different ranges, do this
> > +     * to calculate the maximum PFN shift to use as the lookup table index.
> > +     */
> > +    for ( i = 0; i < nr_ranges; i++ )
> > +        for ( unsigned int j = 0; j < nr_ranges; j++ )
> > +            idx_mask |= (ranges[i].base_pfn & ~mask) ^
> > +                        (ranges[j].base_pfn & ~mask);
> 
> "mask" is loop invariant - can't the AND-ing be pulled out, after the loop?

I've applied both of the above, thanks for the help.
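
For the loop-invariant masking, I assume that ends up roughly as below
(untested sketch; the AND distributes over both the XOR and the OR
accumulation, so it can be applied once after the loops):

    for ( i = 0; i < nr_ranges; i++ )
        for ( unsigned int j = 0; j < nr_ranges; j++ )
            idx_mask |= ranges[i].base_pfn ^ ranges[j].base_pfn;
    idx_mask &= ~mask;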

Regards, Roger.
