This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


RE: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT

To: <keir.fraser@xxxxxxxxxxxxx>, <jbeulich@xxxxxxxxxx>
Subject: RE: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
From: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
Date: Wed, 1 Sep 2010 17:06:14 +0800
Cc: xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 01 Sep 2010 02:06:51 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
Importance: Normal
In-reply-to: <C8A3D220.21A02%keir.fraser@xxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4C7E24BE02000078000139EC@xxxxxxxxxxxxxxxxxx>, <C8A3D220.21A02%keir.fraser@xxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thanks Keir.
I will run the test and keep you updated.
> Date: Wed, 1 Sep 2010 09:49:18 +0100
> Subject: Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
> From: keir.fraser@xxxxxxxxxxxxx
> To: JBeulich@xxxxxxxxxx
> CC: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
> On 01/09/2010 09:02, "Jan Beulich" <JBeulich@xxxxxxxxxx> wrote:
> >> Well I agree with your logic anyway. So I don't see that this can be the
> >> cause of MaoXiaoyun's bug. At least not directly. But then I'm stumped as to
> >> why the page arithmetic and checks in free_heap_pages are (apparently)
> >> resulting in a page pointer way outside the frame-table region and actually
> >> in the directmap region.
> >
> > There must be some unchecked use of PAGE_LIST_NULL, i.e.
> > running off a list end without taking notice (0xffff8315ffffffe4
> > exactly corresponds with that).
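The correspondence Jan points out can be checked with a little arithmetic. The sketch below is only an illustration: the frame-table base address, the 32-byte entry size, the 32-bit value of PAGE_LIST_NULL, and the field offset of 4 are all assumptions about the x86-64 layout of that period, not values stated in this thread, and frame_entry_field is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* All four constants are ASSUMPTIONS for illustration, not quoted from the thread. */
#define FRAMETABLE_VIRT_START UINT64_C(0xffff82f600000000) /* assumed frame-table base */
#define PAGE_INFO_SIZE        UINT64_C(32)                 /* assumed size of one frame-table entry */
#define PAGE_LIST_NULL        UINT64_C(0xffffffff)         /* assumed 32-bit "end of list" index */
#define FIELD_OFFSET          UINT64_C(4)                  /* assumed offset of the field being accessed */

/* Hypothetical helper: address of a field inside the frame-table entry at `index`. */
uint64_t frame_entry_field(uint64_t index, uint64_t offset)
{
    return FRAMETABLE_VIRT_START + index * PAGE_INFO_SIZE + offset;
}

/* Treating the list terminator as a real index reproduces the reported address. */
void check_reported_address(void)
{
    assert(frame_entry_field(PAGE_LIST_NULL, FIELD_OFFSET)
           == UINT64_C(0xffff8315ffffffe4));
}
```

Under these assumptions, using the list terminator as if it were a valid page index yields exactly 0xffff8315ffffffe4, which is consistent with code walking off the end of a list without noticing.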
> Okay, my next guess then is that we are deleting a chunk from the wrong list
> head. I don't see any check that the adjacent chunks we are considering to
> merge are from the same node and zone. I suppose the zone logic does just
> work as we're dealing with 2**x aligned and sized regions. But, shouldn't
> the merging logic in free_heap_pages be checking that the merging candidate
> is from the same NUMA node? I see I have an ASSERTion later in the same
> function, but it's too weak and wishful I suspect.
> MaoXiaoyun: can you please test with the attached patch? If I'm right, you
> will crash on one of the BUG_ON checks that I added, rather than crashing on
> a pointer dereference. You may even crash during boot. Anyhow, what is
> interesting is whether this patch always makes you crash on BUG_ON before
> you would normally crash on pointer dereference. If so this is trivial to
> fix.
> Thanks,
> Keir
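Keir's suggestion, that a merge candidate should be checked against the NUMA node of the chunk being freed, might look roughly like the toy model below. This is not the attached patch and not Xen's free_heap_pages: struct toy_page, buddy_index, can_merge, unlink_from_node_list, and TOY_BUG_ON are all hypothetical names, and the page descriptor is deliberately simplified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a BUG_ON-style check: abort if the condition holds. */
#define TOY_BUG_ON(cond) assert(!(cond))

/* Hypothetical, simplified page descriptor; not Xen's struct page_info. */
struct toy_page {
    bool free;          /* this page heads a chunk on a free list */
    unsigned int order; /* the chunk spans 2**order pages */
    unsigned int node;  /* NUMA node that owns the page */
};

/* Index of the buddy of the 2**order-aligned chunk starting at idx. */
static size_t buddy_index(size_t idx, unsigned int order)
{
    return idx ^ ((size_t)1 << order);
}

/* May the chunk at idx be merged with its buddy of the same order? */
bool can_merge(const struct toy_page *pages, size_t npages,
               size_t idx, unsigned int order)
{
    size_t b = buddy_index(idx, order);

    if (b >= npages)
        return false;                      /* buddy lies outside the array */
    if (!pages[b].free || pages[b].order != order)
        return false;                      /* buddy is not a free chunk of this size */
    /* The extra condition under discussion: both chunks must share a node. */
    return pages[b].node == pages[idx].node;
}

/* Removing a chunk from a per-node list: trap if it belongs to another node. */
void unlink_from_node_list(struct toy_page *pg, unsigned int list_node)
{
    TOY_BUG_ON(pg->node != list_node);
    pg->free = false;
}
```

With a check of this kind, a mismatched merge would stop at the assertion instead of later corrupting a list and faulting on a stray pointer, which is the behaviour the test patch is meant to expose.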