xen-devel

RE: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT

To: <keir.fraser@xxxxxxxxxxxxx>, xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
From: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
Date: Mon, 30 Aug 2010 16:47:44 +0800
Cc:
Delivery-date: Mon, 30 Aug 2010 01:48:30 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
Importance: Normal
In-reply-to: <C89BEE49.1F349%keir.fraser@xxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <BAY121-W8690831081CF2FAE58E38DA850@xxxxxxx>, <C89BEE49.1F349%keir.fraser@xxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx

Hi Keir:
 
       You are right about the if statement execution. After I reran the reproduction, I never saw the
log output again. Obviously I made a mistake before, and I apologize.
 
        Here is more of what I found.
 
        1) We kicked off two reproduction runs. One has run successfully for more than 3 days with the
server idle (idle meaning only the tests run on the server, no other workload). The other also ran for
3 days idle, but when we did some other work on it (compiling a kernel, as I recall) the bug showed up.
The weird thing is that normally the bug shows up in less than 24 hours, based on our earlier tests.
 
        2) Judging from the previous failures in our tests, the bug may not be related solely to VM reboot;
some other operation (such as tapdisk) might perform an unexpected operation on the domain's pages.
Then, when the VM is destroyed, the pages are walked by free_heap_pages, which finally panics.
(This would also explain why frequent reboots help expose the bug earlier.)  Is this possible?
 
       3) Every panic points to the same address, ffff8315ffffffe4, which is not a valid page address.
I printed the domain's pages in assign_pages; they all look like ffff82f60bd64000, so at least the
ffff82f60 prefix is the same.
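
       (For illustration, the print in assign_pages was roughly of this form -- a sketch only; the
exact statement, format string and surrounding context may differ:)

    /* Sketch: debug print added inside assign_pages() (xen/common/page_alloc.c)
     * to record which page_info structures a domain receives, so they can be
     * compared against the faulting address later.  d, pg and order are the
     * parameters of the surrounding function. */
    unsigned long i;

    for ( i = 0; i < (1UL << order); i++ )
        printk("d%d assign_pages: page %p (mfn %lx)\n",
               d->domain_id, &pg[i], page_to_mfn(&pg[i]));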
      
       I am a bit lost about which direction to take from here.  Thanks.
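
       For reference, a rough sketch of what the patched test build does in the MMUEXT_PIN_* code
quoted at the bottom of this thread (not an exact diff; the recursive lock/unlock pairing and the
exact placement are assumptions):

    /* Sketch of the experimental change under test, not a proposed fix:
     * put_page_and_type() moved inside the locked region, and the lock taken
     * with the recursive variant so that free_domheap_pages() can re-acquire
     * it without deadlocking. */
    spin_lock_recursive(&pg_owner->page_alloc_lock);
    drop_ref = (pg_owner->is_dying &&
                test_and_clear_bit(_PGT_pinned,
                                   &page->u.inuse.type_info));
    if ( drop_ref )
        put_page_and_type(page);
    spin_unlock_recursive(&pg_owner->page_alloc_lock);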
 
> Date: Thu, 26 Aug 2010 10:11:21 +0100
> Subject: Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
> From: keir.fraser@xxxxxxxxxxxxx
> To: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>
> On 26/08/2010 09:59, "MaoXiaoyun" <tinnycloud@xxxxxxxxxxx> wrote:
>
> > Appreciate for the detail.
> >
> > I noticed the spin_lock in the code I referred to, which, as you mentioned, would
> > introduce a deadlock.
> > In fact, during the 48-hour long run one VM hung, and in the xm list output its
> > CPU time was quite high (in the tens of thousands), while the other VMs worked
> > fine. I don't know whether that is related to the potential deadlock, since Xen
> > itself still worked.
> >
> > So a quick question: if we replace the spin_lock with spin_lock_recursive, could
> > we avoid this deadlock?
>
> Yes. But we don't understand why this change to MMUEXT_PIN_xxx would fix
> your observed bug, and without that understanding I wouldn't accept the
> change into the tree.
>
> > The if statement was executed during the test, since I happened to put a log
> > statement there and got the output.
>
> Tell us more. Like, for example, the domain id's of 'd' and 'pg_owner', and
> whether they are PV or HVM domains.
>
> > As a matter of fact, the HVMs under test (all Windows 2003) all have PV drivers
> > installed. I think that's why the patch takes effect.
>
> Nope. That hypercall is to do with PV pagetable management. An HVM guest
> with PV drivers still has HVM pagetable management.
>
> > Besides, I have been working on this issue for some time; it is not possible that
> > I made a build mistake, since I have been careful all along.
> >
> > Anyway, I plan to kick off two reproduction runs on two physical servers, one with
> > this patch enabled (spin_lock_recursive instead of spin_lock) and the other with no
> > change, on completely clean code. It would be useful if you have some tracing to be
> > added to the test. I will keep you informed.
>
> Whether this fixes your problem is a good data point, but without full
> understanding of the bug and why this is the correct and best fix, it will
> not be accepted I'm afraid.
>
> -- Keir
>
> > In addition, my kernel is
> > 2.6.31.13-pvops-patch #1 SMP Tue Aug 24 11:23:51 CST 2010 x86_64 x86_64 x86_64
> > GNU/Linux
> > Xen is
> > 4.0.0
> >
> > Thanks.
> >
> >
> >
> >
> >> Date: Thu, 26 Aug 2010 08:39:03 +0100
> >> Subject: Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
> >> From: keir.fraser@xxxxxxxxxxxxx
> >> To: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
> >>
> >> On 26/08/2010 05:49, "MaoXiaoyun" <tinnycloud@xxxxxxxxxxx> wrote:
> >>
> >>> Hi:
> >>>
> >>> This issue can easily be reproduced by continuously and almost concurrently
> >>> rebooting 12 Xen HVM VMs on a single physical server. The reproduction hits the back
> >>> trace about 6 to 14 hours after it starts. I have several similar Xen back
> >>> traces; please refer to the end of the mail. The first three back traces are
> >>> almost the same (they happened in domain_kill), while the last backtrace
> >>> happened in do_multicall.
> >>>
> >>> Going through the Xen code in /xen-4.0.0/xen/arch/x86/mm.c, it shows
> >>> that the author was aware of the race between
> >>> domain_relinquish_resources and the code presented below. It occurred to me to simply
> >>> move lines 2765 and 2766 before 2764, that is, to move put_page_and_type(page) inside
> >>> the
> >>> spin_lock to avoid the race.
> >>
> >> Well, thanks for the detailed bug report: it is good to have a report that
> >> includes an attempt at a fix!
> >>
> >> In the below code, the put_page_and_type() is outside the locked region for
> >> good reason. Put_page_and_type() -> put_page() -> free_domheap_pages() which
> >> acquires d->page_alloc_lock. Because we do not use spin_lock_recursive() in
> >> the below code, this recursive acquisition of the lock in
> >> free_domheap_pages() would deadlock!
> >>
> >> Now, I do not think this fix really affected your testing anyway, because
> >> the below code is part of the MMUEXT_PIN_... hypercalls, and further is only
> >> triggered when a domain executes one of those hypercalls on *another*
> >> domain's memory. The *only* time that should happen is when dom0 builds a
> >> *PV* VM. So since all your testing is on HVM guests I wouldn't expect the
> >> code in the if() statement below to be executed ever. Well, maybe unless you
> >> are using qemu stub domains, or pvgrub.
> >>
> >> But even if the below code is being executed, I don't think your change is a
> >> fix, or anything that should greatly affect the system apart from
> >> introducing a deadlock. Is it instead possible that you somehow were testing
> >> a broken build of Xen before, and simply re-building Xen with your change is
> >> what fixed things? I wonder if the bug stays gone away if you revert your
> >> change and re-build?
> >>
> >> If it still appears that your fix is good, I would add tracing to the below
> >> code and find out a bit more about when/why it is being executed.
> >>
> >> -- Keir
> >>
> >>> 2753     /* A page is dirtied when its pin status is set. */
> >>> 2754     paging_mark_dirty(pg_owner, mfn);
> >>> 2755
> >>> 2756     /* We can race domain destruction (domain_relinquish_resources). */
> >>> 2757     if ( unlikely(pg_owner != d) )
> >>> 2758     {
> >>> 2759         int drop_ref;
> >>> 2760         spin_lock(&pg_owner->page_alloc_lock);
> >>> 2761         drop_ref = (pg_owner->is_dying &&
> >>> 2762                     test_and_clear_bit(_PGT_pinned,
> >>> 2763                                        &page->u.inuse.type_info));
> >>> 2764         spin_unlock(&pg_owner->page_alloc_lock);
> >>> 2765         if ( drop_ref )
> >>> 2766             put_page_and_type(page);
> >>> 2767     }
> >>> 2768
> >>> 2769     break;
> >>> 2770 }
> >>>
> >>> From the result of the reproduction on the patched code, it appears the patch
> >>> works well, since the reproduction succeeded during a 48-hour long run. But I am
> >>> not sure of the side effects it brings.
> >>> Thanks in advance if someone can give more clues.
> >>>
> >>> =============Trace 1: =============
> >>>
> >>> (XEN) ----[ Xen-4.0.0 x86_64 debug=y Not tainted ]----
> >>> (XEN) CPU: 0
> >>> (XEN) RIP: e008:[<ffff82c48011617c>] free_heap_pages+0x55a/0x575
> >>> (XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
> >>> (XEN) rax: 0000001fffffffe0 rbx: ffff82f60b8bbfc0 rcx: ffff83063fe01a20
> >>> (XEN) rdx: ffff8315ffffffe0 rsi: ffff8315ffffffe0 rdi: 00000000ffffffff
> >>> (XEN) rbp: ffff82c48037fc98 rsp: ffff82c48037fc58 r8: 0000000000000000
> >>> (XEN) r9: ffffffffffffffff r10: ffff82c48020e770 r11: 0000000000000282
> >>> (XEN) r12: 00007d0a00000000 r13: 0000000000000000 r14: ffff82f60b8bbfe0
> >>> (XEN) r15: 0000000000000001 cr0: 000000008005003b cr4: 00000000000026f0
> >>> (XEN) cr3: 0000000232914000 cr2: ffff8315ffffffe4
> >>> (XEN) ds: 0000 es: 0000 fs: 0063 gs: 0000 ss: e010 cs: e008
> >>> (XEN) Xen stack trace from rsp=ffff82c48037fc58:
> >>> (XEN) 0000000000000016 0000000000000000 00000000000001a2 ffff8304afc40000
> >>> (XEN) 0000000000000000 ffff82f60b8bbfe0 00000000000330fe ffff82f60b8bc000
> >>> (XEN) ffff82c48037fcd8 ffff82c48011647e 0000000100000000 ffff82f60b8bbfe0
> >>> (XEN) ffff8304afc40020 0000000000000000 ffff8304afc40000 0000000000000000
> >>> (XEN) ffff82c48037fcf8 ffff82c480160caf ffff8304afc40000 ffff82f60b8bbfe0
> >>> (XEN) ffff82c48037fd68 ffff82c48014deaf 0000000000000ca3 ffff8304afc40fd8
> >>> (XEN) ffff8304afc40fd8 ffff8304afc40fd8 4000000000000000 ffff82c48037ff28
> >>> (XEN) 0000000000000000 ffff8304afc40000 ffff8304afc40000 000000000099e000
> >>> (XEN) 00000000ffffffda 0000000000000001 ffff82c48037fd98 ffff82c4801504de
> >>> (XEN) ffff8304afc40000 0000000000000000 000000000099e000 00000000ffffffda
> >>> (XEN) ffff82c48037fdb8 ffff82c4801062ee 000000000099e000 fffffffffffffff3
> >>> (XEN) ffff82c48037ff08 ffff82c480104cd7 ffff82c40000f800 0000000000000286
> >>> (XEN) 0000000000000286 ffff8300bf76c000 000000ea864b1814 ffff8300bf76c030
> >>> (XEN) ffff83023ff1ded8 ffff83023ff1ded0 ffff82c48037fe38 ffff82c48011c9f5
> >>> (XEN) ffff82c48037ff08 ffff82c480272100 ffff8300bf76c000 ffff82c48037fe48
> >>> (XEN) ffff82c48011f557 ffff82c480272100 0000000600000002 000000004700000a
> >>> (XEN) 000000004700bf2c 0000000000000000 000000004700c158 0000000000000000
> >>> (XEN) 00002b3b59e7d050 0000000000000000 0000007f00b14140 00002b3b5f257a80
> >>> (XEN) 0000000000996380 00002aaaaaad0830 00002b3b5f257a80 00000000009bb690
> >>> (XEN) 00002aaaaaad0830 000000398905abf3 000000000078de60 00002b3b5f257aa4
> >>> (XEN) Xen call trace:
> >>> (XEN) [<ffff82c48011617c>] free_heap_pages+0x55a/0x575
> >>> (XEN) [<ffff82c48011647e>] free_domheap_pages+0x2e7/0x3ab
> >>> (XEN) [<ffff82c480160caf>] put_page+0x69/0x70
> >>> (XEN) [<ffff82c48014deaf>] relinquish_memory+0x36e/0x499
> >>> (XEN) [<ffff82c4801504de>] domain_relinquish_resources+0x1ac/0x24c
> >>> (XEN) [<ffff82c4801062ee>] domain_kill+0x93/0xe4
> >>> (XEN) [<ffff82c480104cd7>] do_domctl+0xa1c/0x1205
> >>> (XEN) [<ffff82c4801f71bf>] syscall_enter+0xef/0x149
> >>> (XEN)
> >>> (XEN) Pagetable walk from ffff8315ffffffe4:
> >>> (XEN) L4[0x106] = 00000000bf589027 5555555555555555
> >>> (XEN) L3[0x057] = 0000000000000000 ffffffffffffffff
> >>> (XEN)
> >>> (XEN) ****************************************
> >>> (XEN) Panic on CPU 0:
> >>> (XEN) FATAL PAGE FAULT
> >>> (XEN) [error_code=0002]
> >>> (XEN) Faulting linear address: ffff8315ffffffe4
> >>> (XEN) ****************************************
> >>> (XEN)
> >>> (XEN) Manual reset required ('noreboot' specified)
> >>>
> >>> =============Trace 2: =============
> >>>
> >>> (XEN) Xen call trace:
> >>> (XEN) [<ffff82c4801153c3>] free_heap_pages+0x283/0x4a0
> >>> (XEN) [<ffff82c480115732>] free_domheap_pages+0x152/0x380
> >>> (XEN) [<ffff82c48014aa89>] relinquish_memory+0x169/0x500
> >>> (XEN) [<ffff82c48014b2cd>] domain_relinquish_resources+0x1ad/0x280
> >>> (XEN) [<ffff82c480105fe0>] domain_kill+0x80/0xf0
> >>> (XEN) [<ffff82c4801043ce>] do_domctl+0x1be/0x1000
> >>> (XEN) [<ffff82c48010739b>] evtchn_set_pending+0xab/0x1b0
> >>> (XEN) [<ffff82c4801e3169>] syscall_enter+0xa9/0xae
> >>> (XEN)
> >>> (XEN) Pagetable walk from ffff8315ffffffe4:
> >>> (XEN) L4[0x106] = 00000000bf569027 5555555555555555
> >>> (XEN) L3[0x057] = 0000000000000000 ffffffffffffffff
> >>> (XEN) stdvga.c:147:d60 entering stdvga and caching modes
> >>> (XEN)
> >>> (XEN) ****************************************
> >>> (XEN) HVM60: VGABios $Id: vgabios.c,v 1.67 2008/01/27 09:44:12 vruppert Exp $
> >>> (XEN) Panic on CPU 0:
> >>> (XEN) FATAL PAGE FAULT
> >>> (XEN) [error_code=0002]
> >>> (XEN) Faulting linear address: ffff8315ffffffe4
> >>> (XEN) ****************************************
> >>> (XEN)
> >>> (XEN) Manual reset required ('noreboot' specified)
> >>>
> >>> =============Trace 3: =============
> >>>
> >>>
> >>> (XEN) Xen call trace:
> >>> (XEN) [<ffff82c4801153c3>] free_heap_pages+0x283/0x4a0
> >>> (XEN) [<ffff82c480115732>] free_domheap_pages+0x152/0x380
> >>> (XEN) [<ffff82c48014aa89>] relinquish_memory+0x169/0x500
> >>> (XEN) [<ffff82c48014b2cd>] domain_relinquish_resources+0x1ad/0x280
> >>> (XEN) [<ffff82c480105fe0>] domain_kill+0x80/0xf0
> >>> (XEN) [<ffff82c4801043ce>] do_domctl+0x1be/0x1000
> >>> (XEN) [<ffff82c480117804>] csched_acct+0x384/0x430
> >>> (XEN) [<ffff82c4801e3169>] syscall_enter+0xa9/0xae
> >>>
> >>
> >>
> >
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel