This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


RE: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT

To: <keir.fraser@xxxxxxxxxxxxx>, <jbeulich@xxxxxxxxxx>
Subject: RE: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
From: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
Date: Wed, 1 Sep 2010 15:17:07 +0800
Cc: xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 01 Sep 2010 00:18:00 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
Importance: Normal
In-reply-to: <C8A2F47D.2191E%keir.fraser@xxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <C8A2EDE0.21912%keir.fraser@xxxxxxxxxxxxx>, <C8A2F47D.2191E%keir.fraser@xxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
As I go through the chunk-merge code in free_heap_pages, one thing I'd like
to mention is that previously I printed out all domain pages as they were allocated,
and I found the order in assign_pages in /xen-4.0.0/xen/common/page_alloc.c:1087
is either 0 or 9; later I learned that is because domain U populates the physmap
2MB at a time.
And here in the while statement, the order is compared with MAX_ORDER, which is 20.
I wonder if this might offer some clues.
 532     /* Merge chunks as far as possible. */
 533     while ( order < MAX_ORDER )
 534     {
 535         mask = 1UL << order;
> Date: Tue, 31 Aug 2010 18:03:41 +0100
> Subject: Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
> From: keir.fraser@xxxxxxxxxxxxx
> To: JBeulich@xxxxxxxxxx
> CC: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
> On 31/08/2010 17:35, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote:
> >> That's somewhat implicit: srat_parse_regions() gets passed an
> >> address that is at least BOOTSTRAP_DIRECTMAP_END (i.e. 4G).
> >> Thus srat_parse_regions() starts off with a mask with the lower
> >> 32 bits all set (only more bits can get set subsequently). Thus
> >> the earliest zero bit pfn_pdx_hole_setup() can find is bit 20
> >> (due to the >> PAGE_SHIFT in the invocation). Consequently
> >> the smallest chunk where arithmetic is valid really is 4GB, not
> >> 256MB as I first wrote.
> >
> > Well, that's a bit too implicit for me. How about we initialise 'j' to
> > MAX_ORDER in pfn_pdx_hole_setup() with a comment about supporting page_info
> > pointer arithmetic within allocatable multi-page regions?
> Well I agree with your logic anyway. So I don't see that this can be the
> cause of MaoXiaoyun's bug. At least not directly. But then I'm stumped as to
> why the page arithmetic and checks in free_heap_pages are (apparently)
> resulting in a page pointer way outside the frame-table region and actually
> in the directmap region.
> I think an obvious next step would be to get your boot output, MaoXiaoyun.
> Can you please post it? And you may as well stop your memtest if you haven't
> already. If you've seen the issue on more than one machine then it certainly
> isn't due to that kind of hardware failure.
> -- Keir