
[Xen-devel] Arm boot regression with Xen 4.12



(+ Juergen)

Hi Amit,

On 3/18/19 3:12 PM, Amit Tomer wrote:
>> It will be difficult to help without any log. You probably want to try with
>> Stefano series instead. However ...
> 
> If we comment out the GPU node (gpu@38000000), we don't see this issue and
> the Dom0 kernel is loaded into memory, but we get the following crash:
> 
> Starting kernel ...
> 
> - UART enabled -
> - CPU 00000000 booting -
> - Current EL 00000008 -
> - Xen starting at EL2 -
> - Zero BSS -
> - Setting up control registers -
> - Turning on paging -
> - Ready -
> (XEN) Checking for initrd in /chosen
> (XEN) RAM: 0000000040000000 - 00000000bfffffff
> (XEN)
> (XEN) MODULE[0]: 00000000be511000 - 00000000be51d000 Device Tree
> (XEN) MODULE[1]: 0000000040480000 - 0000000042680000 Kernel
> (XEN)  RESVD[0]: 0000000043000000 - 000000004300c000
> (XEN)  RESVD[1]: 00000000be511000 - 00000000be51d000

[...]

> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
> (XEN) Data Abort Trap. Syndrome=0x6
> (XEN) Walking Hypervisor VA 0x8 on CPU0 via TTBR 0x0000000042114000
> (XEN) 0TH[0x0] = 0x0000000042113f7f
> (XEN) 1ST[0x0] = 0x0000000042110f7f
> (XEN) 2ND[0x0] = 0x0000000000000000
> (XEN) CPU0: Unexpected Trap: Data Abort
> (XEN) ----[ Xen-4.12.0-rc  arm64  debug=y   Not tainted ]----
> (XEN) CPU:    0
> (XEN) PC:     000000000021c220 page_alloc.c#free_heap_pages+0x3b0/0x58c

[...]

> (XEN) Xen call trace:
> (XEN)    [<000000000021c220>] page_alloc.c#free_heap_pages+0x3b0/0x58c (PC)
> (XEN)    [<000000000021c20c>] page_alloc.c#free_heap_pages+0x39c/0x58c (LR)
> (XEN)    [<000000000021e5f4>] page_alloc.c#init_heap_pages+0x334/0x4ec
> (XEN)    [<000000000021e840>] init_domheap_pages+0x94/0x9c
> (XEN)    [<000000000024e178>] free_init_memory+0xac/0xe0
> (XEN)    [<0000000000252580>] setup.c#init_done+0x14/0x20
> (XEN)    [<000000000029daa8>] 000000000029daa8
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) CPU0: Unexpected Trap: Data Abort
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...

Could you give the patch below a try?

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 01ae2cccc0..2c34138bbd 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1139,7 +1139,7 @@ void free_init_memory(void)
         *(p + i) = insn;
 
     set_pte_flags_on_range(__init_begin, len, mg_clear);
-    init_domheap_pages(pa, pa + len);
+    dt_unreserved_regions(pa, pa + len, init_domheap_pages, 0);
     printk("Freed %ldkB init memory.\n", (long)(__init_end-__init_begin)>>10);
 }

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 444857a967..8dbc4f819b 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -764,18 +764,18 @@ void __init start_xen(unsigned long boot_phys_offset,
               "Please check your bootloader.\n",
               fdt_paddr);
 
-    fdt_size = boot_fdt_info(device_tree_flattened, fdt_paddr);
-
-    cmdline = boot_fdt_cmdline(device_tree_flattened);
-    printk("Command line: %s\n", cmdline);
-    cmdline_parse(cmdline);
-
     /* Register Xen's load address as a boot module. */
     xen_bootmodule = add_boot_module(BOOTMOD_XEN,
                              (paddr_t)(uintptr_t)(_start + boot_phys_offset),
                              (paddr_t)(uintptr_t)(_end - _start + 1), false);
     BUG_ON(!xen_bootmodule);
 
+    fdt_size = boot_fdt_info(device_tree_flattened, fdt_paddr);
+
+    cmdline = boot_fdt_cmdline(device_tree_flattened);
+    printk("Command line: %s\n", cmdline);
+    cmdline_parse(cmdline);
+
     setup_pagetables(boot_phys_offset);
 
     setup_mm(fdt_paddr, fdt_size);

Now the long answer.

Unfortunately, a recent patch of mine removed the log message reporting where
Xen lives in memory, so I am not 100% sure this is your problem.

From my own testing, I think the problem is that Xen tries to hand reserved
memory (the old-fashioned /memreserve/, not /reserved-memory) to the
allocator. This happens when freeing the init regions (see free_init_memory).

We do handle all the other modules correctly (see discard_initial_modules).
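
To illustrate the idea behind the first hunk of the patch, here is a minimal,
standalone sketch of "only hand the unreserved parts of a range to a
callback". It is not the Xen implementation of dt_unreserved_regions: the
names for_each_unreserved, struct resv_region and record_free are
hypothetical, the reserved regions come from a plain array rather than the
Device-Tree, and ranges are assumed to be half-open [start, end).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t paddr_t;

/* Hypothetical reserved-region descriptor, half-open: [start, end). */
struct resv_region {
    paddr_t start, end;
};

typedef void (*range_cb)(paddr_t s, paddr_t e);

/*
 * Call cb on every part of [s, e) that does not overlap any of
 * resv[first..nr-1]. When an overlapping region is found, the pieces
 * below and above it are checked against the remaining regions only.
 */
static void for_each_unreserved(paddr_t s, paddr_t e,
                                const struct resv_region *resv, size_t nr,
                                size_t first, range_cb cb)
{
    assert(cb != NULL);

    if ( s >= e )
        return; /* Empty range, nothing to hand over. */

    for ( size_t i = first; i < nr; i++ )
    {
        paddr_t r_s = resv[i].start, r_e = resv[i].end;

        if ( s < r_e && r_s < e ) /* [s, e) overlaps region i */
        {
            if ( s < r_s )
                for_each_unreserved(s, r_s, resv, nr, i + 1, cb);
            if ( r_e < e )
                for_each_unreserved(r_e, e, resv, nr, i + 1, cb);
            return;
        }
    }

    /* No reserved region overlaps [s, e): safe to hand it over. */
    cb(s, e);
}

/* Stand-in for the allocator hand-off: only records what it was given. */
static paddr_t freed_bytes;
static unsigned int freed_ranges;

static void record_free(paddr_t s, paddr_t e)
{
    freed_bytes += e - s;
    freed_ranges++;
}
```

As a made-up example, take RESVD[0] from the log above (0x43000000 -
0x4300c000) and a range 0x42ffe000 - 0x43010000 that straddles it.
record_free would be called twice, once for the 0x2000 bytes below the
reserved region and once for the 0x4000 bytes above it, and never for the
reserved part itself.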

On my setup this does not crash Xen; instead it happily hands the pages to
the allocator, which is not good. The difference in behavior may be due to
how the PDX is set up (I need to investigate that). So, by luck, I have
a struct page_info backing the reserved-memory region. This does not
mean it is better :).

This regression was introduced by commit f60658c6ae "xen/arm: Stop
relocating Xen". Beforehand, Xen was always relocated, so the original
copy of Xen was left untouched. The relocated version would always live in
a non-reserved area.

On my setup, Xen was not in a reserved region by default; I had
to modify the Device-Tree. I don't know how many platforms put
Xen in a /memreserve/ region. Amit, assuming the patch above works for you,
could you tell me who created the /memreserve/?

Cheers,

-- 
Julien Grall
