
Re: [PATCH v4 7/8] pdx: introduce a new compression algorithm based on region offsets



On Tue, Aug 05, 2025 at 02:28:22PM +0200, Jan Beulich wrote:
> On 05.08.2025 11:52, Roger Pau Monne wrote:
> > --- a/xen/common/pdx.c
> > +++ b/xen/common/pdx.c
> > @@ -24,6 +24,7 @@
> >  #include <xen/param.h>
> >  #include <xen/pfn.h>
> >  #include <xen/sections.h>
> > +#include <xen/sort.h>
> >  
> >  /**
> >   * Maximum (non-inclusive) usable pdx. Must be
> > @@ -40,6 +41,12 @@ bool __mfn_valid(unsigned long mfn)
> >  
> >  #ifdef CONFIG_PDX_MASK_COMPRESSION
> >      invalid |= mfn & pfn_hole_mask;
> > +#elif defined(CONFIG_PDX_OFFSET_COMPRESSION)
> > +{
> > +    unsigned long base = pfn_bases[PFN_TBL_IDX(mfn)];
> > +
> > +    invalid |= mfn < base || mfn >= base + pdx_region_size;
> > +}
> >  #endif
> 
> Hmm, didn't notice this earlier on: Brace placement looks odd here. I think
> they want to be indented by one level, as they aren't starting a function
> body.

Right, I can adjust.  Since the braces are inside the #ifdef block the
placement looked kind of OK to me, and it avoided having to indent the
content one extra level.

> > @@ -294,7 +308,245 @@ void __init pfn_pdx_compression_reset(void)
> >      nr_ranges = 0;
> >  }
> >  
> > -#endif /* CONFIG_PDX_COMPRESSION */
> > +#elif defined(CONFIG_PDX_OFFSET_COMPRESSION) /* CONFIG_PDX_MASK_COMPRESSION */
> > +
> > +unsigned int __ro_after_init pfn_index_shift;
> > +unsigned int __ro_after_init pdx_index_shift;
> > +
> > +unsigned long __ro_after_init pfn_pdx_lookup[CONFIG_PDX_NR_LOOKUP];
> > +unsigned long __ro_after_init pdx_pfn_lookup[CONFIG_PDX_NR_LOOKUP];
> > +unsigned long __ro_after_init pfn_bases[CONFIG_PDX_NR_LOOKUP];
> > +unsigned long __ro_after_init pdx_region_size = ~0UL;
> 
> For cache locality, might this last one better also move ahead of the arrays?

Oh, yes, this was a late addition and I clearly didn't pay enough
attention when adding it.
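
I.e. presumably just moving the declaration ahead of the arrays,
something along these lines (sketch):

unsigned int __ro_after_init pfn_index_shift;
unsigned int __ro_after_init pdx_index_shift;

unsigned long __ro_after_init pdx_region_size = ~0UL;

unsigned long __ro_after_init pfn_pdx_lookup[CONFIG_PDX_NR_LOOKUP];
unsigned long __ro_after_init pdx_pfn_lookup[CONFIG_PDX_NR_LOOKUP];
unsigned long __ro_after_init pfn_bases[CONFIG_PDX_NR_LOOKUP];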

> > +bool pdx_is_region_compressible(paddr_t base, unsigned long npages)
> > +{
> > +    unsigned long pfn = PFN_DOWN(base);
> > +    unsigned long pfn_base = pfn_bases[PFN_TBL_IDX(pfn)];
> > +
> > +    return pfn >= pfn_base &&
> > +           pfn + npages <= pfn_base + pdx_region_size;
> > +}
> > +
> > +static int __init cf_check cmp_node(const void *a, const void *b)
> > +{
> > +    const struct pfn_range *l = a;
> > +    const struct pfn_range *r = b;
> > +
> > +    if ( l->base_pfn > r->base_pfn )
> > +        return 1;
> > +    if ( l->base_pfn < r->base_pfn )
> > +        return -1;
> > +
> > +    return 0;
> > +}
> > +
> > +static void __init cf_check swp_node(void *a, void *b)
> > +{
> > +    SWAP(a, b);
> > +}
> 
> This hasn't changed from v3, and still looks wrong to me.

Oh, I do recall a comment to that effect, but somehow forgot to apply
it.  Sorry, I've now fixed it.
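
(For reference: SWAP(a, b) only exchanges the two local pointer
variables, so the array elements are never actually moved.  The fix
presumably dereferences them, something along these lines - untested
sketch:)

static void __init cf_check swp_node(void *a, void *b)
{
    /* Swap the pointed-to elements, not the local pointers. */
    SWAP(*(struct pfn_range *)a, *(struct pfn_range *)b);
}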

> > +bool __init pfn_pdx_compression_setup(paddr_t base)
> > +{
> > +    unsigned long mask = PFN_DOWN(pdx_init_mask(base)), idx_mask = 0;
> > +    unsigned long pages = 0;
> > +    unsigned int i;
> > +
> > +    if ( !nr_ranges )
> > +    {
> > +        printk(XENLOG_DEBUG "PFN compression disabled%s\n",
> > +               pdx_compress ? ": no ranges provided" : "");
> > +        return false;
> > +    }
> > +
> > +    if ( nr_ranges > ARRAY_SIZE(ranges) )
> > +    {
> > +        printk(XENLOG_WARNING
> > +               "Too many PFN ranges (%u > %zu), not attempting PFN 
> > compression\n",
> > +               nr_ranges, ARRAY_SIZE(ranges));
> > +        return false;
> > +    }
> > +
> > +    /* Sort ranges by start address. */
> > +    sort(ranges, nr_ranges, sizeof(*ranges), cmp_node, swp_node);
> > +
> > +    for ( i = 0; i < nr_ranges; i++ )
> > +    {
> > +        unsigned long start = ranges[i].base_pfn;
> > +
> > +        /*
> > +         * Align range base to MAX_ORDER.  This is required so the PDX offset
> > +         * for the bits below MAX_ORDER matches the MFN offset, and pages
> > +         * greater than the minimal order can be used to populate the
> > +         * directmap.
> > +         */
> > +        ranges[i].base_pfn = start & ~((1UL << MAX_ORDER) - 1);
> > +        ranges[i].pages = start + ranges[i].pages - ranges[i].base_pfn;
> > +
> > +        /*
> > +         * Only merge overlapped regions now, leave adjacent regions separated.
> > +         * They would be merged later if both use the same index into the
> > +         * lookup table.
> > +         */
> > +        if ( !i ||
> > +             ranges[i].base_pfn >=
> > +             (ranges[i - 1].base_pfn + ranges[i - 1].pages) )
> > +        {
> > +            mask |= pdx_region_mask(ranges[i].base_pfn, ranges[i].pages);
> > +            continue;
> > +        }
> > +
> > +        ranges[i - 1].pages = ranges[i].base_pfn + ranges[i].pages -
> > +                              ranges[i - 1].base_pfn;
> > +
> > +        if ( i + 1 < nr_ranges )
> > +            memmove(&ranges[i], &ranges[i + 1],
> > +                    (nr_ranges - (i + 1)) * sizeof(ranges[0]));
> > +        else /* last range */
> > +            mask |= pdx_region_mask(ranges[i].base_pfn, ranges[i].pages);
> > +        nr_ranges--;
> > +        i--;
> > +    }
> > +
> > +    /*
> > +     * Populate a mask with the non-equal bits of the different ranges, do this
> > +     * to calculate the maximum PFN shift to use as the lookup table index.
> > +     */
> > +    for ( i = 0; i < nr_ranges; i++ )
> > +        for ( unsigned int j = 0; j < nr_ranges; j++ )
> > +            idx_mask |= (ranges[i].base_pfn & ~mask) ^
> > +                        (ranges[j].base_pfn & ~mask);
> 
> "mask" is loop invariant - can't the AND-ing be pulled out, after the loop?

I've applied both of the above, thanks for the help.
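
For the loop-invariant masking, I assume that ends up roughly as below
(untested sketch; the AND distributes over both the XOR and the OR
accumulation, so it can be applied once after the loops):

    for ( i = 0; i < nr_ranges; i++ )
        for ( unsigned int j = 0; j < nr_ranges; j++ )
            idx_mask |= ranges[i].base_pfn ^ ranges[j].base_pfn;
    idx_mask &= ~mask;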

Regards, Roger.
