[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] PV domU with 255GB boot failure





Jan Beulich wrote:
>>>> Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> 18.02.09 04:39 >>>
>
> First a general remark: You're doing this patch to support 256G domains,
> but by keeping extend_init_mapping() there'll continue to be no way to
> support domains with close to or above 512G (or, if making use of
> XEN_ELFNOTE_INIT_P2M, 1T). This function, rather than needing fixes,
> really just needs to go away.
>
> I've done this in our forward ported 2.6.27+ kernels, but unfortunately
> can't really contribute the changes to the 2.6.18 tree, as there are too
> many differences, and I'm simply lacking the time (and, honestly, interest)
> to work out all the issues. I could post the respective patch if you (or
> someone else) care(s).
>

I came up with this patch trying to fix the hang on less than 256 GB.
With 256 GB it's not even coming this far, pl see another thread. Since
256 GB was the original bug, we definitely need to support that. So please
post your patches with any relevant pointers and I'll take a crack at it...

thanks,
Mukesh



>> --- init-xen.c.orig    2009-02-17 18:58:58.716954000 -0800
>> +++ init-xen.c 2009-02-17 19:33:57.310074000 -0800
>> @@ -587,35 +587,45 @@ void __init xen_init_pt(void)
>> static void __init extend_init_mapping(unsigned long tables_space)
>> {
>>        unsigned long va = __START_KERNEL_map;
>> -      unsigned long phys, addr, *pte_page;
>> +      unsigned long phys, addr, *pte_page, *pmd_page;
>> +      pud_t *pud;
>>        pmd_t *pmd;
>>        pte_t *pte, new_pte;
>> -      unsigned long *page = (unsigned long *)init_level4_pgt;
>> +      unsigned long *pud_page = (unsigned long *)init_level4_pgt;
>
> This is certainly confusing. Please use typeless names here, or names
> that properly reflect what they're used for.
>
>> -      addr = page[pgd_index(va)];
>> -      addr_to_page(addr, page);
>> -      addr = page[pud_index(va)];
>> -      addr_to_page(addr, page);
>> -
>> -      /* Kill mapping of low 1MB. */
>> +      /* Kill low mappings */
>>        while (va < (unsigned long)&_text) {
>>                if (HYPERVISOR_update_va_mapping(va, __pte_ma(0), 0))
>>                        BUG();
>>                va += PAGE_SIZE;
>>        }
>> +      addr = pud_page[pgd_index(va)];    /* get pud entry from pgd tbl */
>> +      addr_to_page(addr, pud_page);      /* pud_page now va of pud tbl */
>
> If you follow above request for using proper (or type-neutral) names, you
> won't need extra comments here.
>
> Also, the code could stay at the place it was so far (since clearly 
pgd_index()
> can never be different for __START_KERNEL_map and _text).
>
>>        /* Ensure init mappings cover kernel text/data and initial tables. */
>>        while (va < (__START_KERNEL_map
>>                     + (start_pfn << PAGE_SHIFT)
>>                     + tables_space)) {
>> -              pmd = (pmd_t *)&page[pmd_index(va)];
>> +
>> + pud = (pud_t *)&pud_page[pud_index(va)]; /* get pud entry */
>> +                if (pud_none(*pud)) {
>> +                        pmd_page = alloc_static_page(&phys);
>> +                        early_make_page_readonly(
>> +                                pmd_page, XENFEAT_writable_page_tables);
>> +                        set_pud(pud, __pud(phys | _KERNPG_TABLE));
>> +                } else {
>> +                        addr = pud_page[pud_index(va)];
>> +                        addr_to_page(addr, pmd_page);
>> +                }
>> +
>> +              pmd = (pmd_t *)&pmd_page[pmd_index(va)];
>>                if (pmd_none(*pmd)) {
>>                        pte_page = alloc_static_page(&phys);
>>                        early_make_page_readonly(
>>                                pte_page, XENFEAT_writable_page_tables);
>>                        set_pmd(pmd, __pmd(phys | _KERNPG_TABLE));
>>                } else {
>> -                      addr = page[pmd_index(va)];
>> +                      addr = pmd_page[pmd_index(va)];
>>                        addr_to_page(addr, pte_page);
>>                }
>>                pte = (pte_t *)&pte_page[pte_index(va)];
>> @@ -630,7 +640,7 @@ static void __init extend_init_mapping(u
>>
>>        /* Finally, blow away any spurious initial mappings. */
>>        while (1) {
>
> Didn't you indicate in an earlier mail that we're lacking a pud_none() check
> here? Just by adding the pud allocation above, you're not excluding to
> run over a pud entry boundary here.
>
> Likewise would pmd_page need to be updated here when you cross a
> pud entry boundary.
>
>> -              pmd = (pmd_t *)&page[pmd_index(va)];
>> +              pmd = (pmd_t *)&pmd_page[pmd_index(va)];
>>                if (pmd_none(*pmd))
>>                        break;
>>                if (HYPERVISOR_update_va_mapping(va, __pte_ma(0), 0))
>> @@ -719,6 +729,7 @@ static void xen_finish_init_mapping(void
>>        table_end = start_pfn;
>> }
>>
>> +
>> /* Setup the direct mapping of the physical memory at PAGE_OFFSET.
>>    This runs before bootmem is initialized and gets pages directly from the
>>    physical memory. To access them they are temporarily mapped. */
>
> Please omit this last hunk altogether.
>
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.