[Xen-devel] RE: kernel panic when enable x2apic

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: [Xen-devel] RE: kernel panic when enable x2apic
From: "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx>
Date: Thu, 18 Nov 2010 12:53:32 +0800
Accept-language: en-US
Acceptlanguage: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Han, Weidong" <weidong.han@xxxxxxxxx>
Delivery-date: Wed, 17 Nov 2010 20:55:33 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4CE40FAE0200007800022D71@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <749B9D3DBF0F054390025D9EAFF47F22301755C5@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4CE3B1640200007800022BA5@xxxxxxxxxxxxxxxxxx> <749B9D3DBF0F054390025D9EAFF47F223017565C@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4CE40FAE0200007800022D71@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcuGc9rDLrynXyIOT6a39yBWl1QUxgAY2s0g
Thread-topic: kernel panic when enable x2apic
After some investigation, it seems the heap was broken (not sure, just a guess).
From the call trace, it called arch_domain_destroy, and I wanted to see why it
failed. After taking a look at the code, it showed that the cpupool returned by
cpupool_find_by_id() was NULL.
Then I added some debug info in cpupool_find_by_id() to see why the creation of
dom0 fails:
for_each_cpupool(q) {
        /* dump the id of every pool on the list */
        printk("cpupool_id=%x\n", (*q)->cpupool_id);
}
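
(For context, for_each_cpupool just walks the global singly linked list of
pools, roughly as sketched below; this is from memory of common/cpupool.c of
that era, so the exact field names may differ. Since q is a struct cpupool **,
a corrupted next link makes (*q)->cpupool_id read from garbage.)

/* rough sketch only -- not necessarily the exact tree in use */
struct cpupool
{
    int              cpupool_id;
    struct cpupool   *next;
    /* ... scheduler, cpu mask, domain count, ... */
};

static struct cpupool *cpupool_list;     /* head of the pool list */

#define for_each_cpupool(ptr) \
    for ( (ptr) = &cpupool_list; *(ptr) != NULL; (ptr) = &((*ptr)->next) )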
Unfortunately, it raised another panic:

(XEN) cpupool_id = 7f034000
(XEN) ----[ Xen-4.1-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c4801012c9>] cpupool_find_by_id+0x39/0xcd
(XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
(XEN) rax: ffffffff0fff0001   rbx: ffff83007f0f7fa8   rcx: ffff82c4802d2390
(XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi: ffff82c48024e2e8
(XEN) rbp: ffff82c480297d78   rsp: ffff82c480297d58   r8:  0000000000000000
(XEN) r9:  0000000000000004   r10: 0000000000000008   r11: 0000000000000008
(XEN) r12: 0000000000000000   r13: 0000000000000001   r14: 0000000000000000
(XEN) r15: 000000000000003f   cr0: 000000008005003b   cr4: 00000000000026f0
(XEN) cr3: 000000007f29c000   cr2: ffffffff0fff0001
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen stack trace from rsp=ffff82c480297d58:
(XEN)    0000000000000213 0000000000000000 ffff83007f034000 0000000000000004
(XEN)    ffff82c480297d98 ffff82c4801017dc ffff83007f034000 0000000000000000
(XEN)    ffff82c480297dd8 ffff82c480104f41 ffff82c480289d38 0000000000000080
(XEN)    0000000000000080 0000000000000007 0000000000000008 0000000000000007
(XEN)    ffff82c480297f08 ffff82c480277afb 0000000000000000 0000000000000000
(XEN)    ffff82c4802596a5 0000000000259640 00f1400000000000 0000000000000000
(XEN)    ffff83000007bc50 ffff83000007bfb0 ffff83000007bef0 0000000000f14000
(XEN)    0000000000000000 0000000000000000 0000000020000000 0000000000000000
(XEN)    0000000000000000 ffffffffffffffff ffff83000007bef0 000000000007bef0
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    ffff82c480287788 0000000001000000 ffffffff00000000 ffff82c480259640
(XEN)    0000000800000000 000000010000006e 0000000000000003 00000000000002f8
(XEN)    0000000000000000 0000000000000000 000000007c223900 000000007de39018
(XEN)    0000000000000000 0000000000000001 0000000000067ebc ffff82c4801000b5
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82c4801012c9>] cpupool_find_by_id+0x39/0xcd
(XEN)    [<ffff82c4801017dc>] cpupool_add_domain+0x52/0xb9
(XEN)    [<ffff82c480104f41>] domain_create+0x41f/0x59e
(XEN)    [<ffff82c480277afb>] __start_xen+0x5660/0x5935
(XEN)
(XEN) Pagetable walk from ffffffff0fff0001:

From this output, it shows cpupool_id = 7f034000, and I don't know why it is
7f034000. I think the first cpupool_id should be 0, am I right?
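
(For what it's worth, 7f034000 matches the low 32 bits of ffff83007f034000 in
the stack dump above, so the %x may simply be printing part of a pointer that
sits where cpupool_id should be. If it helps, I could extend the debug loop to
also print the link pointers, to tell a corrupted next pointer apart from a
trashed struct cpupool; just a sketch:)

for_each_cpupool(q)
{
        /* print the link and the element as well as the id */
        printk("q=%p *q=%p cpupool_id=%x\n", q, *q, (*q)->cpupool_id);
}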

Also, for the failure when writing the MTRR MSR, the value is very strange:
ffff83007f0f7670, which is totally different from what the SDM says.
(XEN) MTRR: CPU 0: Writing MSR 200 to ffff83007f0f7670 failed
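
(For reference, per the SDM an IA32_MTRR_PHYSBASEn value carries only the
memory type in bits 7:0 and the physical base in bits MAXPHYADDR-1:12; bits
11:8 and everything above MAXPHYADDR are reserved and must be zero, so a value
like ffff83007f0f7670, which looks more like a Xen virtual address than a
base/type pair, is guaranteed to #GP. A minimal sketch of that check, with
MAXPHYADDR taken as a parameter rather than read from CPUID:)

#include <stdbool.h>
#include <stdint.h>

/* Sketch only: mimic the reserved-bit check the CPU applies before
 * accepting a WRMSR to IA32_MTRR_PHYSBASEn. */
static bool mtrr_physbase_ok(uint64_t val, unsigned int maxphyaddr)
{
    uint64_t reserved = ~((1ULL << maxphyaddr) - 1) | 0xF00ULL;
    unsigned int type = val & 0xFF;

    if ( val & reserved )
        return false;                 /* reserved bits set -> #GP */

    /* valid MTRR memory types: UC(0), WC(1), WT(4), WP(5), WB(6) */
    return type == 0 || type == 1 || type == 4 || type == 5 || type == 6;
}

/* mtrr_physbase_ok(0xffff83007f0f7670ULL, 36) returns false: the high
 * bits alone make the write fault, consistent with the log above. */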

So I am thinking that maybe the heap is broken?

best regards
yang

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
> Sent: Thursday, November 18, 2010 12:24 AM
> To: Zhang, Yang Z
> Cc: Han, Weidong; xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: kernel panic when enable x2apic
> 
> >>> On 17.11.10 at 14:16, "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> wrote:
> > In fact, there was other error info before the crash, which I didn't see
> > before:
> > (XEN) traps.c:2938: GPF (0000): ffff82c4801a0a73 -> ffff82c48020f0d2
> > (XEN) MTRR: CPU 0: Writing MSR 200 to ffff83007f0f7670 failed
> > (XEN) traps.c:2938: GPF (0000): ffff82c4801a0a73 -> ffff82c48020f0d2
> > (XEN) MTRR: CPU 0: Writing MSR 201 to f00000010 failed
> 
> Hmm, these values are totally bogus (and hence it is quite clear
> that the CPU would fault on them being written to the actual MSRs).
> The question is where these bogus values originate, and how this is
> connected to said patch (I can't see any relation between the two).
> 
> Wouldn't it be possible for you to simply send the whole log?
> 
> Would you be able to do some more debugging on this to at
> least narrow where things start going wrong?
> 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel