[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86: allow non-BIGMEM configs to boot on >= 16Tb systems



On Mon, Dec 18, 2023 at 09:26:24AM +0100, Jan Beulich wrote:
> On 15.12.2023 15:54, Roger Pau Monné wrote:
> > On Wed, Jun 07, 2023 at 08:17:30AM +0200, Jan Beulich wrote:
> >> While frame table setup, directmap init, and boot allocator population
> >> respect all intended bounds, the logic passing memory to the heap
> >> allocator which wasn't passed to the boot allocator fails to respect
> >> max_{pdx,pfn}. This then typically triggers the BUG() in
> >> free_heap_pages() after checking page state, because of hitting a struct
> >> page_info instance which was set to all ~0.
> >>
> >> Of course all the memory above the 16Tb boundary is still going to
> >> remain unused; using it requires BIGMEM=y. And of course this fix
> >> similarly ought to help BIGMEM=y configurations on >= 123Tb systems
> >> (where all the memory beyond that boundary continues to be unused).
> >>
> >> Fixes: bac2000063ba ("x86-64: reduce range spanned by 1:1 mapping and 
> >> frame table indexes")
> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> > 
> > Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> Thanks.
> 
> >> --- a/xen/arch/x86/setup.c
> >> +++ b/xen/arch/x86/setup.c
> >> @@ -1722,15 +1722,16 @@ void __init noreturn __start_xen(unsigne
> >>  
> >>      if ( max_page - 1 > virt_to_mfn(HYPERVISOR_VIRT_END - 1) )
> >>      {
> >> -        unsigned long limit = virt_to_mfn(HYPERVISOR_VIRT_END - 1);
> >> +        unsigned long lo = virt_to_mfn(HYPERVISOR_VIRT_END - 1);
> >> +        unsigned long hi = pdx_to_pfn(max_pdx - 1) + 1;
> > 
> > Maybe use max_page to avoid the pdx_to_pfn() call?  (And is also more
> > in context with the condition on the outside if).
> 
> You mean
> 
>         unsigned long hi = min(pdx_to_pfn(max_pdx - 1) + 1, max_page);
> 
> ? I could switch to that, yes. I wouldn't feel well switching to using
> just max_page, especially with me having nowhere to (reasonably) test.

Isn't max_page derived from max_pdx (see setup_max_pdx()), and
hence we could avoid the pdx_to_pfn() conversion by just using it?

max_page = pdx_to_pfn(max_pdx - 1) + 1;

So hi == max_page in your proposed code.

Maybe there are further restrictions applied to max_pdx that are not
propagated into max_page, the meaning of all those variables is very
opaque, and hard to follow in the source code.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.