
Re: [PATCH RFC] x86/mm: split unmapping and marking-as-I/O in arch_init_memory()


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 7 Aug 2025 11:43:31 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Thu, 07 Aug 2025 09:43:42 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 07.08.2025 11:37, Jan Beulich wrote:
> The unmapping part also wants to cover UNUSABLE regions, and it will now
> be necessary for space outside the low 16Mb (wherever Xen is placed).
> 
> While there, limit the scopes of involved variables.
> 
> Fixes: e4dd91ea85a3 ("x86: Ensure RAM holes really are not mapped in Xen's ongoing 1:1 physmap")
> Fixes: 7cd7f2f5e116 ("x86/boot: Remove the preconstructed low 16M superpage mappings")
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> Originally I had the unmap cover the entire range up to 4Gb, which made
> this ACPI mapping issue more pronounced: Mappings done by
> acpi_dmar_init(), erst_init(), and acpi_hest_init() may wrongly be
> undone this way. Having limited things to the initial mapping range, the
> risk of an overlap with some area which needs to remain mapped is lower,
> but it's not gone.
> 
> As the excess mappings could, at least in principle, also cause other
> issues (like mapping a range WB which must strictly be non-cachable), I
> wonder whether we can actually get away with those excess mappings,
> despite properly tearing them down in arch_init_memory() (directmap) and
> __start_xen() (Xen image space). Options would appear to be to move
> _end[] to a 2Mb aligned boundary (or at least extend the PT_LOAD segment
> to end at the next 2Mb boundary), or to use 4k mappings for the tail
> part of .bss. That would then also eliminate the remaining concern here.
> 
> Extending the PT_LOAD segment (in mkelf32) would apparently allow us to do
> away with the hackery introduced by 773ded42218d ("Move cpu0_stack out
> of Xen text section and into BSS"), to "work around" a supposed linker
> bug (which really was a bug in the linker script imo). The extra command
> line argument then wouldn't be needed anymore.
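
Purely as an illustration of the mkelf32 side (a hypothetical sketch, not the
tool's actual code - the helper name, the SZ_2M macro, and the single-PT_LOAD
assumption are all just for the example), the padding could amount to little
more than rounding p_memsz:

#include <elf.h>

#define SZ_2M (2u << 20)

/*
 * Hypothetical helper: extend the (single) PT_LOAD segment so that it ends
 * on the next 2Mb boundary.  Illustration only, not actual mkelf32 code.
 */
static void pad_load_segment(Elf32_Phdr *phdr)
{
    Elf32_Word end = phdr->p_paddr + phdr->p_memsz;

    /* Round the segment's end address up to a 2Mb boundary. */
    end = (end + SZ_2M - 1) & ~(SZ_2M - 1);

    phdr->p_memsz = end - phdr->p_paddr;
}

Only p_memsz would grow; p_filesz (and hence the binary itself) could stay
as is.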

Thinking about it, this would then also simplify ...

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -275,8 +275,6 @@ static void __init assign_io_page(struct
>  
>  void __init arch_init_memory(void)
>  {
> -    unsigned long i, pfn, rstart_pfn, rend_pfn, iostart_pfn, ioend_pfn;
> -
>      /*
>       * Basic guest-accessible flags:
>       *   PRESENT, R/W, USER, A/D, AVAIL[0,1,2], AVAIL_HIGH, NX (if available).
> @@ -292,12 +290,55 @@ void __init arch_init_memory(void)
>       * case the low 1MB.
>       */
>      BUG_ON(pvh_boot && trampoline_phys != 0x1000);
> -    for ( i = 0; i < 0x100; i++ )
> +    for ( unsigned int i = 0; i < MB(1) >> PAGE_SHIFT; i++ )
>          assign_io_page(mfn_to_page(_mfn(i)));
>  
> -    /* Any areas not specified as RAM by the e820 map are considered I/O. */
> -    for ( i = 0, pfn = 0; pfn < max_page; i++ )
> +    /*
> +     * Any areas not specified as RAM by the e820 map want to have no mappings.
> +     * We may have established some by mapping more than necessary in head.S,
> +     * due to the use of super-pages there.
> +     */
> +    for ( unsigned long i = 0, pfn = 0,
> +                        rlimit_pfn = PFN_DOWN(PAGE_ALIGN_2M(__pa(_end)));
> +          pfn < rlimit_pfn; i++ )
>      {
> +        unsigned long rstart_pfn, rend_pfn, start_pfn;
> +
> +        while ( i < e820.nr_map &&
> +                e820.map[i].type != E820_RAM )
> +            i++;
> +
> +        if ( i >= e820.nr_map )
> +        {
> +            /* No more RAM regions: Unmap right up to the upper boundary. */
> +            rstart_pfn = rend_pfn = rlimit_pfn;
> +        }
> +        else
> +        {
> +            /* Unmap just as far as the next RAM region. */
> +            rstart_pfn = min(rlimit_pfn, PFN_UP(e820.map[i].addr));
> +            rend_pfn   = max(rstart_pfn,
> +                             PFN_DOWN(e820.map[i].addr + e820.map[i].size));
> +        }
> +
> +        /* NB: _start is already 2Mb-aligned. */
> +        start_pfn = max(pfn, PFN_DOWN(__pa(_start)));
> +        if ( start_pfn < rstart_pfn )
> +            destroy_xen_mappings((unsigned long)mfn_to_virt(start_pfn),
> +                                 (unsigned long)mfn_to_virt(rstart_pfn));
> +
> +        /* Skip the RAM region. */
> +        pfn = rend_pfn;
> +    }

... this: If we're loaded in an area that's a multiple of 2Mb in size (and
2Mb aligned), we wouldn't need to walk the E820 map anymore. The boot loader
(or whatever else) must have placed the entire image inside a RAM region.
Hence we could simply unmap [PAGE_ALIGN_4K(_end), PAGE_ALIGN_2M(_end)),
outside of any loop.
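
In the idiom of the hunk above that might then boil down to just the following
(a sketch only, and reading "unmap" as acting on the directmap, the same way
the loop above does):

    /*
     * Sketch: with the image 2Mb-aligned and wholly inside one RAM region,
     * only the directmap tail of the final superpage mapped by head.S could
     * still carry excess mappings.
     */
    unsigned long start_pfn = PFN_UP(__pa(_end));
    unsigned long end_pfn   = PFN_DOWN(PAGE_ALIGN_2M(__pa(_end)));

    if ( start_pfn < end_pfn )
        destroy_xen_mappings((unsigned long)mfn_to_virt(start_pfn),
                             (unsigned long)mfn_to_virt(end_pfn));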

Jan



 

