WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

Re: [Xen-devel] BUG: unable to handle kernel paging request - balloon_init - xen-4.1.0 - 2.6.32.39

To: Scott Garron <xen-devel@xxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] BUG: unable to handle kernel paging request - balloon_init - xen-4.1.0 - 2.6.32.39
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Thu, 28 Apr 2011 14:30:19 -0400
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Thu, 28 Apr 2011 11:31:09 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4DB8AAA6.4050808@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4DB60C04.6050802@xxxxxxxxxxxxxxxxxx> <20110426031545.GB20779@xxxxxxxxxxxx> <4DB6522A.9000304@xxxxxxxxxxxxxxxxxx> <20110427200937.GA19853@xxxxxxxxxxxx> <4DB8AAA6.4050808@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.20 (2009-06-14)
On Wed, Apr 27, 2011 at 07:45:42PM -0400, Scott Garron wrote:
> On 04/27/2011 04:09 PM, Konrad Rzeszutek Wilk wrote:
> >Duh! I meant this one:
> >
> >[    0.316665] RIP: e030:[<ffffffff819a8aea>]  [<ffffffff819a8aea>]
> >balloon_init+0x20b/0x25e
> >
> >Sorry about that. Can you also run your kernel with 'initcall_debug
> >loglevel=8' please?
> 
>      Ok, I've put what I came up with here:
> 
> http://www.pridelands.org/~simba/xen-debug/debugnotes.txt
> 
>      I also added a few pr_info() lines around the offending code to try
> to get more of a handle of how far it is getting and what it's working
> on at the time of failure:

This looks quite odd. We had a flurry of issues like this before,
where we "forgot" to set the P2M table correctly, so that during
[    0.000000] init_memory_mapping: 0000000100000000-00000001d9ff0000

it would crash, because for PFNs above the 'dom0_mem' parameter we would
return an INVALID value - but only if the value was not aligned
(git commit f06e457cb729d58430d1385014fab367b2d4e7c2).

But that isn't the case here (dom0_mem=512M).
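
To make the numbers above concrete, here is a minimal user-space sketch of
the PFN arithmetic. The helpers pfn_down(), pfn_up() and dom0_mem_pfns() are
hypothetical names for illustration; they mirror what the kernel's PFN_DOWN
and PFN_UP macros do and assume 4 KiB pages. This is not kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages, as on x86_64 */

/* Page frame that contains the given physical address (round down). */
static inline uint64_t pfn_down(uint64_t paddr)
{
    return paddr >> PAGE_SHIFT;
}

/* First page frame at or above the given physical address (round up). */
static inline uint64_t pfn_up(uint64_t paddr)
{
    const uint64_t page_size = UINT64_C(1) << PAGE_SHIFT;

    assert(paddr <= UINT64_MAX - (page_size - 1)); /* avoid wraparound */
    return (paddr + page_size - 1) >> PAGE_SHIFT;
}

/* Number of page frames covered by a dom0_mem value given in MiB. */
static inline uint64_t dom0_mem_pfns(uint64_t mib)
{
    return (mib << 20) >> PAGE_SHIFT;
}
```

With the init_memory_mapping range 0000000100000000-00000001d9ff0000,
pfn_up(0x100000000) gives 0x100000 and pfn_down(0x1d9ff0000) gives 0x1d9ff0,
which match the "pfn" and "extra_pfn_end" values in the log quoted further
down. dom0_mem=512M covers 0x20000 page frames, so the whole range sits well
above the dom0_mem boundary.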

And you say it boots fine as a DomU - so I think there is some P2M, E820
funkiness happening here. Have you tried booting the kernel as Dom0 with
different sizes of dom0_mem (e.g. "dom0_mem=max:2GB")? Or without the
dom0_mem parameter at all?

What is your CONFIG_XEN_MAX_DOMAIN_MEMORY set to?

> 
> ********
> 
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index a065fda..b5f0650 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -488,10 +488,13 @@ static int __init balloon_init(void)
>          */
>         extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()),
>                             (unsigned long)PFN_DOWN(xen_extra_mem_start
> + xen_ex
> +       pr_info("extra_pfn_end: 0x%x", extra_pfn_end); /* debug */
>         for (pfn = PFN_UP(xen_extra_mem_start);
>              pfn < extra_pfn_end;
>              pfn += balloon_npages) {
> +               pr_info("pfn: 0x%x", pfn); /* debug */
>                 page = pfn_to_page(pfn);
> +               pr_info("page: 0x%p", page);  /* debug */
>                 /* totalram_pages doesn't include the boot-time
>                    balloon extension, so don't subtract from it. */
>                 __balloon_append(page);
> 
> 
> ********
> 
>      The new serial console output, with "initcall_debug loglevel=8" and
> the pr_info() additions to the code can be found here:
> 
> http://www.pridelands.org/~simba/xen-debug/hailstorm-fullserial20110427.txt
> 
> ... but I'll paste the part closest to the crash here for your convenience:
> 
> [    1.016663] calling  balloon_init+0x0/0x280 @ 1
> [    1.016663] xen_balloon: Initialising balloon driver with page order 0.
> [    1.033446] last_pfn = 0x1d9ff0 max_arch_pfn = 0x400000000
> [    1.036663] extra_pfn_end: 0x1d9ff0
> [    1.036663] pfn: 0x100000
> [    1.036663] page: 0xffffea0003800000
> [    1.036663] BUG: unable to handle kernel paging request at
> ffffea0003800028
> [    1.036663] IP: [<ffffffff819a8b1f>] balloon_init+0x240/0x280
> [    1.036663] PGD 18402067 PUD 18403067 PMD 0
> 
> 
>      So the crash is happening within the first iteration of that for()
> loop, presumably while calling __balloon_append(page).  That's as far as
> I dove into it so far, but I figured I'd give you an update as to what
> I've found and tried.
> 
>      Just for more information sake, I also tried booting this kernel as
> a paravirt domU under the Debian Stable 2.6.32-5-xen-amd64 stock kernel
> and Xen 4.1.0.  It booted without incident (aside from a ridiculously
> long spew of printk's from my additions to that for() loop), so the
> failure is specific to the kernel booting as a dom0.  That probably
> doesn't narrow down much, but I figured it was noteworthy.
> 
> -- 
> Scott Garron
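
The addresses in the oops above can be cross-checked with a little
arithmetic. The sketch below is user-space C, not kernel code, and it rests
on two assumptions: that the x86_64 virtual memory map (vmemmap) starts at
0xffffea0000000000 on this kernel, and that pfn_to_page() is simply
"vmemmap base + pfn * sizeof(struct page)". The helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: start of the x86_64 vmemmap region on 2.6.32-era kernels. */
#define VMEMMAP_BASE UINT64_C(0xffffea0000000000)

/* Virtual address of the struct page for a pfn, given sizeof(struct page). */
static inline uint64_t page_vaddr(uint64_t pfn, uint64_t struct_page_size)
{
    assert(struct_page_size > 0);
    return VMEMMAP_BASE + pfn * struct_page_size;
}

/* Infer sizeof(struct page) from a logged (pfn, page pointer) pair. */
static inline uint64_t infer_struct_page_size(uint64_t pfn, uint64_t page)
{
    assert(pfn > 0 && page >= VMEMMAP_BASE);
    return (page - VMEMMAP_BASE) / pfn;
}

/* Byte offset of a faulting address inside its struct page. */
static inline uint64_t fault_offset(uint64_t fault_addr, uint64_t page)
{
    assert(fault_addr >= page);
    return fault_addr - page;
}
```

Feeding in the logged values, pfn 0x100000 and page 0xffffea0003800000 imply
a 56-byte struct page, and the faulting address ffffea0003800028 is 0x28
bytes into that struct page. That would be consistent with the first write
to a list field of the page inside __balloon_append(), although the exact
field is a guess based on the assumed struct layout. The "PMD 0" in the oops
shows that the page-table walk for this address ends in an empty PMD entry,
i.e. this part of the memory map is not mapped.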

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel