
RE: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT



Thanks for the detailed reply.
 
I see now that the spin_lock in the code I referred to would, as you mentioned, introduce a deadlock with my change.
In fact, during the 48-hour run one VM did hang, and from the xm list command
its CPU time was very high, in the tens of thousands, while the other VMs worked fine. I don't know whether
that is related to the potential deadlock, since Xen itself kept working.
 
So a quick question: if we replace the spin_lock with spin_lock_recursive (roughly the change sketched below), could we avoid this deadlock?
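
To make the idea concrete, this is roughly the change I have in mind against the mm.c snippet quoted below. It is only a sketch, not a tested patch, and it assumes that the acquisition of page_alloc_lock inside free_domheap_pages() tolerates recursion on the same CPU:

    /* We can race domain destruction (domain_relinquish_resources). */
    if ( unlikely(pg_owner != d) )
    {
        int drop_ref;

        /* Take the owner's lock with the recursive variant, so that the
         * put_page_and_type() below can be called while still holding it. */
        spin_lock_recursive(&pg_owner->page_alloc_lock);
        drop_ref = (pg_owner->is_dying &&
                    test_and_clear_bit(_PGT_pinned,
                                       &page->u.inuse.type_info));
        if ( drop_ref )
            /* put_page_and_type() -> put_page() -> free_domheap_pages()
             * re-acquires pg_owner->page_alloc_lock, which is why it must
             * not be called under a plain spin_lock() here. */
            put_page_and_type(page);
        spin_unlock_recursive(&pg_owner->page_alloc_lock);
    }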
 
The if statement was indeed executed during the test: I happened to have put a log statement there and saw it in the output.
As a matter of fact, the HVM guests in my test (all Windows 2003) all have PV drivers installed. I think that's
why the patch takes effect.
 
Besides, I have been working on this issue for some time, and I don't believe I made a build mistake,
since I have been careful all along.
 
Anyway, I plan to kick off two reproduction runs on two physical servers: one with this patch applied (using spin_lock_recursive
instead of spin_lock) and the other with no change, on completely clean code. It would be helpful if you have some
tracing to add to the test; a rough sketch of the trace I would add myself is below. I will keep you informed.
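
For the tracing, what I had in mind is just a temporary printk inside the if() block of the quoted mm.c code, so we can see which domain hits this path and whether the owner is already dying. A rough sketch, assuming d, pg_owner, mfn and page are in scope exactly as in the snippet below:

    /* Temporary debug trace, to be removed once we know when this path runs. */
    printk(XENLOG_INFO
           "mmuext pin: d%d touched mfn %lx owned by d%d (dying=%d, type_info=%lx)\n",
           d->domain_id, mfn, pg_owner->domain_id,
           (int)pg_owner->is_dying, page->u.inuse.type_info);

Please let me know if you would rather see something else traced.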
 
In addition, my kernel is
2.6.31.13-pvops-patch #1 SMP Tue Aug 24 11:23:51 CST 2010 x86_64 x86_64 x86_64 GNU/Linux
Xen is
4.0.0
 
Thanks.
 
 

 
> Date: Thu, 26 Aug 2010 08:39:03 +0100
> Subject: Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
> From: keir.fraser@xxxxxxxxxxxxx
> To: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>
> On 26/08/2010 05:49, "MaoXiaoyun" <tinnycloud@xxxxxxxxxxx> wrote:
>
> > Hi:
> >
> > This issue can be easily reproduced by continuously and almost concurrently
> > rebooting 12 Xen HVM VMs on a single physical server. The reproduction hit the back
> > trace about 6 to 14 hours after it started. I have several similar Xen back
> > traces; please refer to the end of the mail. The first three back traces are
> > almost the same: they happened in domain_kill, while the last backtrace
> > happened in do_multicall.
> >
> > Going through the Xen code in /xen-4.0.0/xen/arch/x86/mm.c, it shows
> > that the author was aware of the race between
> > domain_relinquish_resources and the code presented below. It occurred to me to simply move
> > lines 2765 and 2766 before 2764, that is, move put_page_and_type(page) into the
> > spin_lock region to avoid the race.
>
> Well, thanks for the detailed bug report: it is good to have a report that
> includes an attempt at a fix!
>
> In the below code, the put_page_and_type() is outside the locked region for
> good reason. Put_page_and_type() -> put_page() -> free_domheap_pages() which
> acquires d->page_alloc_lock. Because we do not use spin_lock_recursive() in
> the below code, this recursive acquisition of the lock in
> free_domheap_pages() would deadlock!
>
> Now, I do not think this fix really affected your testing anyway, because
> the below code is part of the MMUEXT_PIN_... hypercalls, and further is only
> triggered when a domain executes one of those hypercalls on *another*
> domain's memory. The *only* time that should happen is when dom0 builds a
> *PV* VM. So since all your testing is on HVM guests I wouldn't expect the
> code in the if() statement below to be executed ever. Well, maybe unless you
> are using qemu stub domains, or pvgrub.
>
> But even if the below code is being executed, I don't think your change is a
> fix, or anything that should greatly affect the system apart from
> introducing a deadlock. Is it instead possible that you somehow were testing
> a broken build of Xen before, and simply re-building Xen with your change is
> what fixed things? I wonder if the bug stays gone away if you revert your
> change and re-build?
>
> If it still appears that your fix is good, I would add tracing to the below
> code and find out a bit more about when/why it is being executed.
>
> -- Keir
>
> > 2753         /* A page is dirtied when its pin status is set. */
> > 2754         paging_mark_dirty(pg_owner, mfn);
> > 2755
> > 2756         /* We can race domain destruction (domain_relinquish_resources). */
> > 2757         if ( unlikely(pg_owner != d) )
> > 2758         {
> > 2759             int drop_ref;
> > 2760             spin_lock(&pg_owner->page_alloc_lock);
> > 2761             drop_ref = (pg_owner->is_dying &&
> > 2762                         test_and_clear_bit(_PGT_pinned,
> > 2763                                            &page->u.inuse.type_info));
> > 2764             spin_unlock(&pg_owner->page_alloc_lock);
> > 2765             if ( drop_ref )
> > 2766                 put_page_and_type(page);
> > 2767         }
> > 2768
> > 2769         break;
> > 2770     }
> >
> > From the result of the reproduction on the patched code, it appears the patch
> > worked well, since the run went through 48 hours without hitting the crash. But I am
> > not sure of the side effects it brings.
> > I would appreciate it if someone could give more clues, thanks.
> >
> > =============Trace 1: =============
> >
> > (XEN) ----[ Xen-4.0.0 x86_64 debug=y Not tainted ]----
> > (XEN) CPU: 0
> > (XEN) RIP: e008:[<ffff82c48011617c>] free_heap_pages+0x55a/0x575
> > (XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
> > (XEN) rax: 0000001fffffffe0 rbx: ffff82f60b8bbfc0 rcx: ffff83063fe01a20
> > (XEN) rdx: ffff8315ffffffe0 rsi: ffff8315ffffffe0 rdi: 00000000ffffffff
> > (XEN) rbp: ffff82c48037fc98 rsp: ffff82c48037fc58 r8: 0000000000000000
> > (XEN) r9: ffffffffffffffff r10: ffff82c48020e770 r11: 0000000000000282
> > (XEN) r12: 00007d0a00000000 r13: 0000000000000000 r14: ffff82f60b8bbfe0
> > (XEN) r15: 0000000000000001 cr0: 000000008005003b cr4: 00000000000026f0
> > (XEN) cr3: 0000000232914000 cr2: ffff8315ffffffe4
> > (XEN) ds: 0000 es: 0000 fs: 0063 gs: 0000 ss: e010 cs: e008
> > (XEN) Xen stack trace from rsp=ffff82c48037fc58:
> > (XEN) 0000000000000016 0000000000000000 00000000000001a2 ffff8304afc40000
> > (XEN) 0000000000000000 ffff82f60b8bbfe0 00000000000330fe ffff82f60b8bc000
> > (XEN) ffff82c48037fcd8 ffff82c48011647e 0000000100000000 ffff82f60b8bbfe0
> > (XEN) ffff8304afc40020 0000000000000000 ffff8304afc40000 0000000000000000
> > (XEN) ffff82c48037fcf8 ffff82c480160caf ffff8304afc40000 ffff82f60b8bbfe0
> > (XEN) ffff82c48037fd68 ffff82c48014deaf 0000000000000ca3 ffff8304afc40fd8
> > (XEN) ffff8304afc40fd8 ffff8304afc40fd8 4000000000000000 ffff82c48037ff28
> > (XEN) 0000000000000000 ffff8304afc40000 ffff8304afc40000 000000000099e000
> > (XEN) 00000000ffffffda 0000000000000001 ffff82c48037fd98 ffff82c4801504de
> > (XEN) ffff8304afc40000 0000000000000000 000000000099e000 00000000ffffffda
> > (XEN) ffff82c48037fdb8 ffff82c4801062ee 000000000099e000 fffffffffffffff3
> > (XEN) ffff82c48037ff08 ffff82c480104cd7 ffff82c40000f800 0000000000000286
> > (XEN) 0000000000000286 ffff8300bf76c000 000000ea864b1814 ffff8300bf76c030
> > (XEN) ffff83023ff1ded8 ffff83023ff1ded0 ffff82c48037fe38 ffff82c48011c9f5
> > (XEN) ffff82c48037ff08 ffff82c480272100 ffff8300bf76c000 ffff82c48037fe48
> > (XEN) ffff82c48011f557 ffff82c480272100 0000000600000002 000000004700000a
> > (XEN) 000000004700bf2c 0000000000000000 000000004700c158 0000000000000000
> > (XEN) 00002b3b59e7d050 0000000000000000 0000007f00b14140 00002b3b5f257a80
> > (XEN) 0000000000996380 00002aaaaaad0830 00002b3b5f257a80 00000000009bb690
> > (XEN) 00002aaaaaad0830 000000398905abf3 000000000078de60 00002b3b5f257aa4
> > (XEN) Xen call trace:
> > (XEN) [<ffff82c48011617c>] free_heap_pages+0x55a/0x575
> > (XEN) [<ffff82c48011647e>] free_domheap_pages+0x2e7/0x3ab
> > (XEN) [<ffff82c480160caf>] put_page+0x69/0x70
> > (XEN) [<ffff82c48014deaf>] relinquish_memory+0x36e/0x499
> > (XEN) [<ffff82c4801504de>] domain_relinquish_resources+0x1ac/0x24c
> > (XEN) [<ffff82c4801062ee>] domain_kill+0x93/0xe4
> > (XEN) [<ffff82c480104cd7>] do_domctl+0xa1c/0x1205
> > (XEN) [<ffff82c4801f71bf>] syscall_enter+0xef/0x149
> > (XEN)
> > (XEN) Pagetable walk from ffff8315ffffffe4:
> > (XEN) L4[0x106] = 00000000bf589027 5555555555555555
> > (XEN) L3[0x057] = 0000000000000000 ffffffffffffffff
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 0:
> > (XEN) FATAL PAGE FAULT
> > (XEN) [error_code=0002]
> > (XEN) Faulting linear address: ffff8315ffffffe4
> > (XEN) ****************************************
> > (XEN)
> > (XEN) Manual reset required ('noreboot' specified)
> >
> > =============Trace 2: =============
> >
> > (XEN) Xen call trace:
> > (XEN) [<ffff82c4801153c3>] free_heap_pages+0x283/0x4a0
> > (XEN) [<ffff82c480115732>] free_domheap_pages+0x152/0x380
> > (XEN) [<ffff82c48014aa89>] relinquish_memory+0x169/0x500
> > (XEN) [<ffff82c48014b2cd>] domain_relinquish_resources+0x1ad/0x280
> > (XEN) [<ffff82c480105fe0>] domain_kill+0x80/0xf0
> > (XEN) [<ffff82c4801043ce>] do_domctl+0x1be/0x1000
> > (XEN) [<ffff82c48010739b>] evtchn_set_pending+0xab/0x1b0
> > (XEN) [<ffff82c4801e3169>] syscall_enter+0xa9/0xae
> > (XEN)
> > (XEN) Pagetable walk from ffff8315ffffffe4:
> > (XEN) L4[0x106] = 00000000bf569027 5555555555555555
> > (XEN) L3[0x057] = 0000000000000000 ffffffffffffffff
> > (XEN) stdvga.c:147:d60 entering stdvga and caching modes
> > (XEN)
> > (XEN) ****************************************
> > (XEN) HVM60: VGABios $Id: vgabios.c,v 1.67 2008/01/27 09:44:12 vruppert Exp $
> > (XEN) Panic on CPU 0:
> > (XEN) FATAL PAGE FAULT
> > (XEN) [error_code=0002]
> > (XEN) Faulting linear address: ffff8315ffffffe4
> > (XEN) ****************************************
> > (XEN)
> > (XEN) Manual reset required ('noreboot' specified)
> >
> > =============Trace 3: =============
> >
> >
> > (XEN) Xen call trace:
> > (XEN) [<ffff82c4801153c3>] free_heap_pages+0x283/0x4a0
> > (XEN) [<ffff82c480115732>] free_domheap_pages+0x152/0x380
> > (XEN) [<ffff82c48014aa89>] relinquish_memory+0x169/0x500
> > (XEN) [<ffff82c48014b2cd>] domain_relinquish_resources+0x1ad/0x280
> > (XEN) [<ffff82c480105fe0>] domain_kill+0x80/0xf0
> > (XEN) [<ffff82c4801043ce>] do_domctl+0x1be/0x1000
> > (XEN) [<ffff82c480117804>] csched_acct+0x384/0x430
> > (XEN) [<ffff82c4801e3169>] syscall_enter+0xa9/0xae
> >
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

