[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 16/20] xen/riscv: Implement superpage splitting for p2m mappings


  • To: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 11 Aug 2025 13:59:01 +0200
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Alistair Francis <alistair.francis@xxxxxxx>, Bob Eshleman <bobbyeshleman@xxxxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 11 Aug 2025 11:59:18 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 31.07.2025 17:58, Oleksii Kurochko wrote:
> Add support for down large memory mappings ("superpages") in the RISC-V
> p2m mapping so that smaller, more precise mappings ("finer-grained entries")
> can be inserted into lower levels of the page table hierarchy.
> 
> To implement that the following is done:
> - Introduce p2m_split_superpage(): Recursively shatters a superpage into
>   smaller page table entries down to the target level, preserving original
>   permissions and attributes.
> - p2m_set_entry() updated to invoke superpage splitting when inserting
>   entries at lower levels within a superpage-mapped region.
> 
> This implementation is based on the ARM code, with modifications to the part
> that follows the BBM (break-before-make) approach, some parts are simplified
> as according to RISC-V spec:
>   It is permitted for multiple address-translation cache entries to co-exist
>   for the same address. This represents the fact that in a conventional
>   TLB hierarchy, it is possible for multiple entries to match a single
>   address if, for example, a page is upgraded to a superpage without first
>   clearing the original non-leaf PTE’s valid bit and executing an SFENCE.VMA
>   with rs1=x0, or if multiple TLBs exist in parallel at a given level of the
>   hierarchy. In this case, just as if an SFENCE.VMA is not executed between
>   a write to the memory-management tables and subsequent implicit read of the
>   same address: it is unpredictable whether the old non-leaf PTE or the new
>   leaf PTE is used, but the behavior is otherwise well defined.
> In contrast to the Arm architecture, where BBM is mandatory and failing to
> use it in some cases can lead to CPU instability, RISC-V guarantees
> stability, and the behavior remains safe — though unpredictable in terms of
> which translation will be used.
> 
> Additionally, the page table walk logic has been adjusted, as ARM uses the
> opposite number of levels compared to RISC-V.

As before, I think you mean "numbering".

> --- a/xen/arch/riscv/p2m.c
> +++ b/xen/arch/riscv/p2m.c
> @@ -539,6 +539,91 @@ static void p2m_free_subtree(struct p2m_domain *p2m,
>      p2m_free_page(p2m, pg);
>  }
>  
> +static bool p2m_split_superpage(struct p2m_domain *p2m, pte_t *entry,
> +                                unsigned int level, unsigned int target,
> +                                const unsigned int *offsets)
> +{
> +    struct page_info *page;
> +    unsigned long i;
> +    pte_t pte, *table;
> +    bool rv = true;
> +
> +    /* Convenience aliases */
> +    mfn_t mfn = pte_get_mfn(*entry);
> +    unsigned int next_level = level - 1;
> +    unsigned int level_order = XEN_PT_LEVEL_ORDER(next_level);
> +
> +    /*
> +     * This should only be called with target != level and the entry is
> +     * a superpage.
> +     */
> +    ASSERT(level > target);
> +    ASSERT(pte_is_superpage(*entry, level));
> +
> +    page = p2m_alloc_page(p2m->domain);
> +    if ( !page )
> +    {
> +        /*
> +         * The caller is in charge to free the sub-tree.
> +         * As we didn't manage to allocate anything, just tell the
> +         * caller there is nothing to free by invalidating the PTE.
> +         */
> +        memset(entry, 0, sizeof(*entry));
> +        return false;
> +    }
> +
> +    table = __map_domain_page(page);
> +
> +    /*
> +     * We are either splitting a second level 1G page into 512 first level
> +     * 2M pages, or a first level 2M page into 512 zero level 4K pages.
> +     */

Such a comment is at risk of (silently) going stale when support for 512G
mappings is added. I wonder if it's really that informative to have here.

> +    for ( i = 0; i < XEN_PT_ENTRIES; i++ )
> +    {
> +        pte_t *new_entry = table + i;
> +
> +        /*
> +         * Use the content of the superpage entry and override
> +         * the necessary fields. So the correct permission are kept.
> +         */

It's not just permissions though? The memory type field also needs
retaining (and is being retained this way). Maybe better say "attributes"?

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.