WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

RE: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7.

To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "Xu, Jiajun" <jiajun.xu@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
From: "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Date: Wed, 2 Jun 2010 17:24:27 +0800
Accept-language: en-US
Cc:
Delivery-date: Wed, 02 Jun 2010 02:26:45 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <C82BCE5B.166B4%keir.fraser@xxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <789F9655DD1B8F43B48D77C5D30659731E7ECE62@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <C82BCE5B.166B4%keir.fraser@xxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acr7WP6tLI1/MbF5Rx6MiSRtri82UQACJ+lgACA1w7AAAgVQywAAD39BAVwtyIAABGU8GAAAFUsgAC8XDq8AArjzkA==
Thread-topic: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
BTW, I get the following failure after looping CPU online/offline (o*l) for about 95 iterations.

(XEN) Xen call trace:
(XEN)    [<ffff82c48014b3e9>] clear_page_sse2+0x9/0x30
(XEN)    [<ffff82c4801b9922>] vmx_cpu_up_prepare+0x43/0x88
(XEN)    [<ffff82c4801a13fa>] cpu_callback+0x4a/0x94
(XEN)    [<ffff82c480112d95>] notifier_call_chain+0x68/0x84
(XEN)    [<ffff82c480100e5b>] cpu_up+0x7b/0x12f
(XEN)    [<ffff82c480173b7d>] arch_do_sysctl+0x770/0x833
(XEN)    [<ffff82c480121672>] do_sysctl+0x992/0x9ec
(XEN)    [<ffff82c4801fa3cf>] syscall_enter+0xef/0x149
(XEN)
(XEN) Pagetable walk from ffff83022fe1d000:
(XEN)  L4[0x106] = 00000000cfc8d027 5555555555555555
(XEN)  L3[0x008] = 00000000cfef9063 5555555555555555
(XEN)  L2[0x17f] = 000000022ff2a063 5555555555555555
(XEN)  L1[0x01d] = 000000022fe1d262 5555555555555555

I really can't imagine how this can happen, considering vmx_alloc_vmcs() is 
so straightforward. My test machine is really magic.

Another fault is as follows:

(XEN) Xen call trace:
(XEN)    [<ffff82c480173459>] memcpy+0x11/0x1e
(XEN)    [<ffff82c4801722bf>] cpu_smpboot_callback+0x207/0x235
(XEN)    [<ffff82c480112d95>] notifier_call_chain+0x68/0x84
(XEN)    [<ffff82c480100e5b>] cpu_up+0x7b/0x12f
(XEN)    [<ffff82c480173c1d>] arch_do_sysctl+0x770/0x833
(XEN)    [<ffff82c480121712>] do_sysctl+0x992/0x9ec
(XEN)    [<ffff82c4801fa46f>] syscall_enter+0xef/0x149
(XEN)
(XEN) Pagetable walk from ffff830228ce5000:
(XEN)  L4[0x106] = 00000000cfc8d027 5555555555555555
(XEN)  L3[0x008] = 00000000cfef9063 5555555555555555
(XEN)  L2[0x146] = 000000022fea3063 5555555555555555
(XEN)  L1[0x0e5] = 0000000228ce5262 000000000001fd49
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0002]
(XEN) Faulting linear address: ffff830228ce5000
(XEN) ****************************************
(XEN)

--jyh

>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>Sent: Wednesday, June 02, 2010 4:01 PM
>To: Jiang, Yunhong; Xu, Jiajun; xen-devel@xxxxxxxxxxxxxxxxxxx
>Subject: Re: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0:
>#a3e7c7...
>
>On 02/06/2010 08:28, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:
>
>>> Which xen-unstable changeset are you testing? All timers should be
>>> automatically migrated off a dead CPU and onto CPU0 by changeset 21424. Is
>>> that not working okay for you?
>>
>> We are testing on 21492.
>>
>> After more investigation, the root cause is that the periodic_timer is
>> stopped before take_cpu_down (in schedule()), so it is not covered by
>> 21424. When v->periodic_period == 0, the next vcpu's p_timer is not
>> updated by schedule(); thus, in the next scheduling round, it causes
>> trouble for stop_timer().
>>
>> With the following small patch it works, but I'm not sure if this is a
>> good solution.
>
>I forgot about inactive timers in c/s 21424. Hm, I will fix this in the
>timer subsystem and get back to you.
>
> -- Keir
>


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel