[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 16/18] x86: introduce helper for recording degree of contiguity in page tables


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 20 Dec 2021 16:25:11 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=6r9yazLJ7J3E/Km7y/bxviJ5JNZbO7ImVN3JZKYD1PQ=; b=ABtRYRQwU80BXl6cfIO8laWhJO0OjeHPxSBcOhIJC+t7Q68IZRymlskfdW7EqX6K6JN9XJ41LJzja1jNEXCA+YseLbrhGYQfeeD2O/V1zRX4UEkog3rmloZqfZH53TMAMiayEhj5CEIetBTaSee7wK/8Hxx+oOhgf7LdXmcKY2/ZU8fk9eyKrzCkPtf84wyf9tm6X7uXr985s2mH4+X95IhEaJTyuBrek0WMRnl+G50yHrmTURNOkDX/er8Oe47QAguaj91nk8FVC2EkNhdldm2nMtY/c8/6TPIcCm2lS/joQCFa1PK/pAtucN44InVs3L5qmVqIjErjpBAnsAuD4g==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=UBYCt+Ysv6wihteTRvi1qnA0mfuYel1dWnYr2+vWvJezeezkIbAdAMdn67Acj7TLUZbGyIbbDRJABzC6wF0WgwhikFStD1Gu2hdnuSbZVZBjBh+j+lfCgxgQVaZqiqSxXOtLefHmwAshqFA28eD4peTCch5ixHpT+hws4sXURlcxsOwFsowx2J5yCCK2Q9k4RlJhgroDSyIuHZZEwLvH/qgIVcThV+I49GBTQG3UVSOIBg7NSdeeRB2bKp09w3jitZXvjLxDZlZXxCScecwpyG6sWbqhDB6ouRE5zxfZDKh4TMBMAf5Hbt1n3tBW2B+D4znjavr1YdDylL0VbudbjA==
  • Authentication-results: esa1.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Mon, 20 Dec 2021 15:26:48 +0000
  • Ironport-data: A9a23:SyKIF6soV/mmgMfbIYseTHq9eufnVI5ZMUV32f8akzHdYApBsoF/q tZmKWuEPPnbZmWnfdwjYIW28k9SvpLTmtdkSwtk/C43RSsT+JbJXdiXEBz9bniYRiHhoOOLz Cm8hv3odp1coqr0/0/1WlTZQP0VOZigHtIQMsadUsxKbVIiGHdJZS5LwbZj29cy24DhWmthh PupyyHhEA79s9JLGjp8B5Kr8HuDa9yr5Vv0FnRnDRx6lAe2e0s9VfrzFonoR5fMeaFGH/bSe gr25OrRElU1XfsaIojNfr7TKiXmS1NJVOSEoiI+t6OK2nCuqsGuu0qS2TV1hUp/0l20c95NJ NplmK2sZ1o1ZY72yOE2agUBNQYgO6FU0eqSSZS/mZT7I0zudnLtx7NlDV0sPJ1e8eFyaY1M3 aVGcnZXNEnF3r/ohuLgIgVvrp1LwM3DJoQQt2sm1TjEJf0nXYrCU+PB4towMDIY2JgeR62PP ZRxhTxHTQ/ueT9IBmouK4sgvc6smmu8UgVKkQfAzUYwyzeKl1EguFT3C/LNc8GObdVYmACfv G2u13v9KgEXMpqY0zXt2nCmi/LLnCj7cJkPD7D+/flv6HWDy2pWBBAIWF+TpfiillX4S99ZM 1YT+Cclse417kPDczXmd0Tm+jje5EdaAocOVb1hgO2Q9kbKyyuoGkZZUmViUowjn/YqHBoNi FmFjfq8UFSDr4apYX6a876Vqxa7Ni4UMXIOaEc4cOcV3zXwiNpt10ySF76PBIbw14SoQm+on 1hmuQBn3+1L5fPnwZlX6rwub9iEgpHSBjA46QzMNo5OxlMoPdX1D2BEBLWy0BqhEGp7ZgTY1 JTns5LHhAzrMX1qvHbSKNjh5Jnzu5643MT02DaD5aUJ+TW34GKEdotN+jx4L0oBGp9aIm64O x6P510LtcA70J6WgUhfOd3ZNijX5fK4SYSNug78M7KinaSdhCfYpXozNCZ8LkjmkVQ2kLFXB HtoWZ3EMJruMow+lGDeb75EidcDn3lirUuOFcGT50n2itK2OS/KIYrpxXPTN4jVGovf+16Lm zueXuPXoyhivBrWPnOKrNVNdA9SdhDWx/ne8qRqSwJKGSI/cEkJAP7N27IxPYtjmqVejODT+ X+hHERfzTLCabfvc1zihqlLZOy9UJBhg2g8OCBwb1+k12J6OdSk7bsFdotxdr4irbQxwflxR vgDWsOBHvUQFWiXp2VDNcHw/N54aRCmpQOSJC75MjIxSIFtGl7S8dj+cwqxqCRXVnirtdEzq qGL3x/ARcZRXBxrCcvbMar9z164sXUHtvh1Wk/EfotadEn2qdA4IC3tlP4nZcoLLEyblDed0 g+XBzYepPXM/NBpoIWY2/jcot7wQeVkH0dcE23K1pqMNHHXrji53ItNcOeUZjSBBmn6z7qvO LdOxPbmPfxZwFsT69hgE6xmxL4V7sf0o+MI1RxtGXjGYgj5Cr5kJXXaj8BDurcUm+1csAqyH EmO5sNbKfOCP8a8SAwdIw8sb+Ki0/AIm2aNsaRpcRuivCInrqCaVUhyPgWXjH0PJbR4B4op3 OM9tZNE8Ae4kBcrbo6Lgy08G75g9ZDcv3HLbq0nPbI=
  • Ironport-hdrordr: A9a23:Ws6C9KAKyWWun03lHeg2sceALOsnbusQ8zAXPh9KJyC9I/b2qy nxppgmPH/P6Ar4WBkb6La90Y27MA7hHPlOkPUs1NaZLXPbUQ6TTb2KgrGSpgEIdxeOktK1kJ 0QDJSWa+eAfWSS7/yKmDVQeuxIqLLsndHK9IXjJjVWPHpXgslbnnZE422gYzRLrWd9dP0E/M 323Ls4m9PsQwVcUu2LQl0+G8TTrdzCk5zrJTYAGh4c8QGLyRel8qTzHRS01goXF2on+8ZvzU H11yjCoomzufCyzRHRk0fV8pRtgdPkjv9OHtaFhMQ5IijlziyoeINicbufuy1dmpDj1H8a1P 335zswNcV67H3cOkmzvBvWwgHllA0j7nfzoGXoyEfLkIjcfnYXGsBBjYVWfl/y8Ew7puxx16 pNwiawq4dXJQmoplW92/H4EzVR0makq3srluAey1ZFV5EFVbNXpYsDuGtIDZY7Gj7g4oxPKp ghMCjl3ocUTbqmVQGagoE2q+bcG0jbXy32DXTqg/blkwS/xxtCvg8lLM92pAZ3yHtycegC2w 3+CNUbqFh5dL5gUUtMPpZzfSKJMB25ffvtChPbHb21LtBNB5ryw6SHlIndotvaPqA18A==
  • Ironport-sdr: 2QWbXfMpdHUxY5OH6WOhiwhtmRCtG2+XBb+5HNbwGe3JLLEt9XoMaHYTba7QFJEnSu05N777FG u/oIkDZTVKCWp7ISMbr/OOAbEdXPCPz0mkOWJhbywqM+eNWVs9iDMPtxpKxOgMR6/Ddd1Dy4p0 hFMOrO90GNxK38SfEYEt4bNGrcJGHYfgKS/qVRA9W2uiUfwSbcoNqa9oGlIXYFOqr8ENO0N5a7 5RHCDOIuimYxi5xyT7U7p8V8pWo/Rg8H6zV6XDSuGI4rqSLkQjdVHWnDhbTRzK6uxDsz8WffMO SI1r6nS6pwp/rFU+mpGVLEB/
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Dec 16, 2021 at 04:47:30PM +0100, Jan Beulich wrote:
> On 15.12.2021 14:57, Roger Pau Monné wrote:
> > On Fri, Sep 24, 2021 at 11:55:30AM +0200, Jan Beulich wrote:
> >> --- /dev/null
> >> +++ b/xen/include/asm-x86/contig-marker.h
> >> @@ -0,0 +1,105 @@
> >> +#ifndef __ASM_X86_CONTIG_MARKER_H
> >> +#define __ASM_X86_CONTIG_MARKER_H
> >> +
> >> +/*
> >> + * Short of having function templates in C, the function defined below is
> >> + * intended to be used by multiple parties interested in recording the
> >> + * degree of contiguity in mappings by a single page table.
> >> + *
> >> + * Scheme: Every entry records the order of contiguous successive entries,
> >> + * up to the maximum order covered by that entry (which is the number of
> >> + * clear low bits in its index, with entry 0 being the exception using
> >> + * the base-2 logarithm of the number of entries in a single page table).
> >> + * While a few entries need touching upon update, knowing whether the
> >> + * table is fully contiguous (and can hence be replaced by a higher level
> >> + * leaf entry) is then possible by simply looking at entry 0's marker.
> >> + *
> >> + * Prereqs:
> >> + * - CONTIG_MASK needs to be #define-d, to a value having at least 4
> >> + *   contiguous bits (ignored by hardware), before including this file,
> >> + * - page tables to be passed here need to be initialized with correct
> >> + *   markers.
> > 
> > Given this requirement I think it would make sense to place the page
> > table marker initialization currently placed in iommu_alloc_pgtable as
> > a helper here also?
> 
> I would be nice, yes, but it would also cause problems. I specifically do
> not want to make the function here "inline". Hence a source file including
> it would need to be given a way to suppress its visibility to the compiler.
> Which would mean #ifdef-ary I'd prefer to avoid. Yet by saying "prefer" I
> mean to leave open that I could be talked into doing what you suggest.

Could you mark those as __maybe_unused? Or would you rather like to
assert that they are used when included?

> >> + */
> >> +
> >> +#include <xen/bitops.h>
> >> +#include <xen/lib.h>
> >> +#include <xen/page-size.h>
> >> +
> >> +/* This is the same for all anticipated users, so doesn't need passing 
> >> in. */
> >> +#define CONTIG_LEVEL_SHIFT 9
> >> +#define CONTIG_NR          (1 << CONTIG_LEVEL_SHIFT)
> >> +
> >> +#define GET_MARKER(e) MASK_EXTR(e, CONTIG_MASK)
> >> +#define SET_MARKER(e, m) \
> >> +    ((void)(e = ((e) & ~CONTIG_MASK) | MASK_INSR(m, CONTIG_MASK)))
> >> +
> >> +enum PTE_kind {
> >> +    PTE_kind_null,
> >> +    PTE_kind_leaf,
> >> +    PTE_kind_table,
> >> +};
> >> +
> >> +static bool update_contig_markers(uint64_t *pt, unsigned int idx,
> > 
> > Maybe pt_update_contig_markers, so it's not such a generic name.
> 
> I can do that. The header may then want to be named pt-contig-marker.h
> or pt-contig-markers.h. Thoughts?

Seems fine to me.

> >> +                                  unsigned int level, enum PTE_kind kind)
> >> +{
> >> +    unsigned int b, i = idx;
> >> +    unsigned int shift = (level - 1) * CONTIG_LEVEL_SHIFT + PAGE_SHIFT;
> >> +
> >> +    ASSERT(idx < CONTIG_NR);
> >> +    ASSERT(!(pt[idx] & CONTIG_MASK));
> >> +
> >> +    /* Step 1: Reduce markers in lower numbered entries. */
> >> +    while ( i )
> >> +    {
> >> +        b = find_first_set_bit(i);
> >> +        i &= ~(1U << b);
> >> +        if ( GET_MARKER(pt[i]) > b )
> >> +            SET_MARKER(pt[i], b);
> >> +    }
> >> +
> >> +    /* An intermediate table is never contiguous with anything. */
> >> +    if ( kind == PTE_kind_table )
> >> +        return false;
> >> +
> >> +    /*
> >> +     * Present entries need in sync index and address to be a candidate
> >> +     * for being contiguous: What we're after is whether ultimately the
> >> +     * intermediate table can be replaced by a superpage.
> >> +     */
> >> +    if ( kind != PTE_kind_null &&
> >> +         idx != ((pt[idx] >> shift) & (CONTIG_NR - 1)) )
> > 
> > Don't you just need to check that the address is aligned to at least
> > idx, not that it's exactly aligned?
> 
> No, that wouldn't be sufficient. We're not after a general "is
> contiguous" here, but strictly after "is this slot meeting the
> requirements for the whole table eventually getting replaced by a
> superpage".

I see, makes sense. I didn't relate this check to the 'replaced by a
superpage' part of the comment.

> >> +        return false;
> >> +
> >> +    /* Step 2: Check higher numbered entries for contiguity. */
> >> +    for ( b = 0; b < CONTIG_LEVEL_SHIFT && !(idx & (1U << b)); ++b )
> >> +    {
> >> +        i = idx | (1U << b);
> >> +        if ( (kind == PTE_kind_leaf
> >> +              ? ((pt[i] ^ pt[idx]) & ~CONTIG_MASK) != (1ULL << (b + 
> >> shift))
> > 
> > Maybe this could be a macro, CHECK_CONTIG or some such? It's also used
> > below.
> 
> Hmm, yes, this might indeed help readability. There's going to be a
> lot of parameters though; not sure whether omitting all(?) parameters
> for such a locally used macro would be considered acceptable.
> 
> > I would also think this would be clearer as:
> > 
> > (pt[idx] & ~CONTIG_MASK) + (1ULL << (shift + b)) == (pt[i] & ~CONTIG_MASK)
> 
> By using + we'd consider entries contiguous which for our purposes
> shouldn't be considered so. Yes, the earlier check should already
> have caught that case, but I'd like the checks to be as tight as
> possible.
> 
> >> +              : pt[i] & ~CONTIG_MASK) ||
> > 
> > Isn't PTE_kind_null always supposed to be empty?
> 
> Yes (albeit this could be relaxed, but then the logic here would
> need to know where the "present" bit(s) is/are).
> 
> > (ie: wouldn't this check always succeed?)
> 
> No - "kind" describes pt[idx], not pt[i].
> 
> >> +             GET_MARKER(pt[i]) != b )
> >> +            break;
> >> +    }
> >> +
> >> +    /* Step 3: Update markers in this and lower numbered entries. */
> >> +    for ( ; SET_MARKER(pt[idx], b), b < CONTIG_LEVEL_SHIFT; ++b )
> >> +    {
> >> +        i = idx ^ (1U << b);
> >> +        if ( (kind == PTE_kind_leaf
> >> +              ? ((pt[i] ^ pt[idx]) & ~CONTIG_MASK) != (1ULL << (b + 
> >> shift))
> >> +              : pt[i] & ~CONTIG_MASK) ||
> >> +             GET_MARKER(pt[i]) != b )
> >> +            break;
> >> +        idx &= ~(1U << b);
> > 
> > There's an iteration where idx will be 0, and then there's no further
> > point in doing the & anymore?
> 
> Yes, but doing the & anyway is cheaper than adding a conditional.

I think it might be interesting to add some kind of unit testing to
this code in tools/tests. It's a standalone piece of code that could
be easily tested for correct functionality. Not that you should do it
here, in fact it might be interesting for me to do so in order to
better understand the code.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.