WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] Alignment check on domU (2.6.32)

To: Natalie Protasevich <protasnb@xxxxxxxxx>
Subject: Re: [Xen-devel] Alignment check on domU (2.6.32)
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Tue, 30 Mar 2010 09:53:49 -0700
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Tue, 30 Mar 2010 09:55:18 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <32209efe1003292008y5880e9bfib238c089377b4ba7@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <32209efe1003292008y5880e9bfib238c089377b4ba7@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc12 Lightning/1.0b2pre Thunderbird/3.0.3
On 03/29/2010 08:08 PM, Natalie Protasevich wrote:
Hello,
We are getting an alignment check fault with a 2.6.32 kernel running as a domU on an AMD system,

Which 2.6.32 is it? Is it stock kernel.org, from xen.git, a distro, elsewhere?

while dom0 is a 2.6.18 kernel.
As far as I know we should not have run into such a problem, since this is an x86_64 kernel. I am aware that for an alignment check trap the AC bit needs to be set in eflags and AM needs to be set in CR0. I tracked cr0 and AM was getting set, and the problem occurred when something set the AC flag at the time of calling memcpy_c(). I cheated and cleared the AM flag in cr0 (as one can see in this trace), but this didn't help. I haven't figured out what sets the AM flag...

Do you have any other domains running at the time?

What CPU is this?

Does the same kernel run OK natively?

    J


Here is the trace:

[   80.342300] alignment check: 0000 [#1] SMP
[   80.342323] last sysfs file: /sys/devices/virtual/vc/vcsa7/dev
[   80.342330] CPU 1
[   80.342339] Pid: 3875, comm: loas_check Not tainted 2.6.32.10+drm33.1 #12
[   80.342347] RIP: e030:[<ffffffff813eb2bb>]  [<ffffffff813eb2bb>] memcpy_c+0xb/0x20
[   80.342365] RSP: e02b:ffff88015556d9b0  EFLAGS: 00050246
[   80.342371] RAX: ffff88017360cc8c RBX: ffff880176d91900 RCX: 0000000000000002
[   80.342379] RDX: 0000000000000000 RSI: ffff880176d91958 RDI: ffff88017360cc8c
[   80.342388] RBP: ffff88015556d9e8 R08: ffffffff81570260 R09: ffffffff81ae8840
[   80.342395] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   80.342403] R13: 000000000000000e R14: ffff880173f3fc00 R15: ffff880176d91958
[   80.342417] FS:  00007f0f4dafd6e0(0000) GS:ffff880028047000(0000) knlGS:0000000000000000
[   80.342425] CS:  e033 DS: 002b ES: 002b CR0: 000000008001003b
[   80.342432] CR2: 00007f0f4db1a000 CR3: 000000017362d000 CR4: 0000000000000660
[   80.342440] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   80.342448] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
[   80.342457] Process loas_check (pid: 3875, threadinfo ffff88015556c000, task ffff8801556cada0)
[   80.342465] Stack:
[   80.342469]  ffffffff8157038f ffffffff81ae8840 ffff880173f3fc00 0000000000000000
[   80.342483] <0> ffff880173f3fc00 ffff880161a36400 0000000000000000 ffff88015556da08
[   80.342500] <0> ffffffff815705d6 ffff880180000000 ffff880173f3fc00 ffff88015556da28
[   80.342518] Call Trace:
[   80.342528]  [<ffffffff8157038f>] ? ip_finish_output+0x12f/0x2f0
[   80.342538]  [<ffffffff815705d6>] ip_output+0x86/0xd0
[   80.342546]  [<ffffffff8156f600>] ip_local_out+0x20/0x30
[   80.342555]  [<ffffffff8156fed3>] ip_queue_xmit+0x223/0x3f0
[   80.342565]  [<ffffffff81584214>] ? tcp_send_active_reset+0x24/0x180
[   80.342576]  [<ffffffff8100dcfd>] ? xen_force_evtchn_callback+0xd/0x10
[   80.342586]  [<ffffffff8100e532>] ? check_events+0x12/0x20
[   80.342595]  [<ffffffff81583dd2>] tcp_transmit_skb+0x402/0x780
[   80.342604]  [<ffffffff81584279>] tcp_send_active_reset+0x89/0x180
[   80.342614]  [<ffffffff815770bc>] tcp_disconnect+0x6c/0x3c0
[   80.342622]  [<ffffffff81576e34>] tcp_close+0x3e4/0x480
[   80.342632]  [<ffffffff81598b92>] inet_release+0x42/0x70
[   80.342643]  [<ffffffff814ce5d8>] sock_release+0x18/0x60
[   80.342652]  [<ffffffff814ceab2>] sock_close+0x12/0x30
[   80.342663]  [<ffffffff8110e28e>] __fput+0xee/0x200
[   80.342671]  [<ffffffff8100dcfd>] ? xen_force_evtchn_callback+0xd/0x10
[   80.342681]  [<ffffffff8110e3b7>] fput+0x17/0x20
[   80.342690]  [<ffffffff8110a378>] filp_close+0x58/0x90
[   80.342698]  [<ffffffff8100e51f>] ? xen_restore_fl_direct_end+0x0/0x1
[   80.342709]  [<ffffffff8105691c>] put_files_struct+0xcc/0xe0
[   80.342718]  [<ffffffff81056980>] exit_files+0x50/0x60
[   80.342726]  [<ffffffff81058587>] do_exit+0x1b7/0x7f0
[   80.342735]  [<ffffffff81065cb6>] ? __dequeue_signal+0x16/0x160
[   80.342745]  [<ffffffff81058bfc>] do_group_exit+0x3c/0xa0
[   80.342754]  [<ffffffff81068328>] get_signal_to_deliver+0x1b8/0x380
[   80.342764]  [<ffffffff810106a9>] do_notify_resume+0xc9/0x8a0
[   80.342775]  [<ffffffff8100b8bb>] ? xen_mc_flush+0x11b/0x1d0
[   80.342786]  [<ffffffff8102cb52>] ? paravirt_end_context_switch+0x12/0x30
[   80.342798]  [<ffffffff81047afb>] ? finish_task_switch+0x5b/0xb0
[   80.342808]  [<ffffffff8101134e>] int_signal+0x12/0x17
[ 80.342815] Code: 81 ea d8 1f 00 00 48 3b 42 20 73 07 48 8b 50 f9 31 c0 c3 31 d2 48 c7 c0 f2 ff ff ff c3 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
[   80.342952] RIP  [<ffffffff813eb2bb>] memcpy_c+0xb/0x20
[   80.342962]  RSP <ffff88015556d9b0>
[   80.342969] ---[ end trace 1442aa6e9e3d337d ]---
[   80.342976] Fixing recursive fault but reboot is needed!
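
For readers following the AC/AM discussion, here is a minimal sketch that decodes the relevant bits from the EFLAGS and CR0 values printed in the two oopses in this message. The bit positions (AC is bit 18 of EFLAGS, AM is bit 18 of CR0) come from the x86 architecture definition; the function and variable names are made up for illustration and are not from any kernel or Xen source.

```python
# Bit positions as defined by the x86 architecture.
EFLAGS_AC = 1 << 18   # Alignment Check flag in EFLAGS
CR0_AM = 1 << 18      # Alignment Mask bit in CR0


def alignment_check_bits(eflags: int, cr0: int) -> dict:
    """Report whether the AC and AM bits are set in the given register values."""
    return {
        "eflags_ac": bool(eflags & EFLAGS_AC),
        "cr0_am": bool(cr0 & CR0_AM),
    }


# Register values copied from the two oopses in this message.
# "modified" is the trace taken after AM was cleared by hand,
# "original" is the trace from the P.S.
modified = alignment_check_bits(eflags=0x00050246, cr0=0x8001003B)
original = alignment_check_bits(eflags=0x00050246, cr0=0x8005003B)

print("modified:", modified)
print("original:", original)
```

Both traces show AC set in EFLAGS. The CR0 value differs only in the AM bit, which matches the description of clearing AM by hand. Keep in mind that the CR0 printed by a paravirtualized guest is the guest's view of the register; whether that matches the hardware register is up to the hypervisor.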

This happens 2 out of 3 times.
I haven't found any similar recent reports or relevant commits so far, and we haven't had such a problem running a 2.6.24 domU (Ubuntu hardy) on the 2.6.18 dom0. I'm hoping someone can give a hand.
Thanks,
--Natalie
P.S. Just in case - here is the "original" trace before I tried to modify the cr0:

[   64.544616] alignment check: 0000 [#1] SMP
[   64.544640] last sysfs file: /sys/devices/virtual/vc/vcsa7/dev
[   64.544647] CPU 1
[   64.544655] Pid: 3737, comm: loas_check Not tainted 2.6.32.10+drm33.1 #8
[   64.544663] RIP: e030:[<ffffffff813eb23b>]  [<ffffffff813eb23b>] memcpy_c+0xb/0x20
[   64.544681] RSP: e02b:ffff880152e7d9b0  EFLAGS: 00050246
[   64.544687] RAX: ffff8801731e4c8c RBX: ffff880178185400 RCX: 0000000000000002
[   64.544696] RDX: 0000000000000000 RSI: ffff880178185458 RDI: ffff8801731e4c8c
[   64.544703] RBP: ffff880152e7d9e8 R08: ffffffff81570110 R09: ffffffff81ae6840
[   64.544711] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   64.544718] R13: 000000000000000e R14: ffff88017332d800 R15: ffff880178185458
[   64.544732] FS:  00007fa8de9336e0(0000) GS:ffff880028047000(0000) knlGS:0000000000000000
[   64.544741] CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
[   64.544748] CR2: 00000000081f1320 CR3: 0000000001001000 CR4: 0000000000000660
[   64.544756] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   64.544764] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
[   64.544772] Process loas_check (pid: 3737, threadinfo ffff880152e7c000, task ffff880152e72da0)
[   64.544781] Stack:
[   64.544785]  ffffffff81570214 ffffffff81ae6840 ffff88017332d800 0000000000000000
[   64.544798] <0> ffff88017332d800 ffff880152e13200 0000000000000000 ffff880152e7da08
[   64.544814] <0> ffffffff81570486 ffff880180000000 ffff88017332d800 ffff880152e7da28
[   64.544832] Call Trace:
[   64.544842]  [<ffffffff81570214>] ? ip_finish_output+0x104/0x2f0
[   64.544853]  [<ffffffff81570486>] ip_output+0x86/0xd0
[   64.544862]  [<ffffffff8156f4b0>] ip_local_out+0x20/0x30
[   64.544870]  [<ffffffff8156fd83>] ip_queue_xmit+0x223/0x3f0
[   64.544880]  [<ffffffff8100dcfd>] ? xen_force_evtchn_callback+0xd/0x10
[   64.544889]  [<ffffffff8100e532>] ? check_events+0x12/0x20
[   64.544900]  [<ffffffff81583c82>] tcp_transmit_skb+0x402/0x780
[   64.544909]  [<ffffffff81584129>] tcp_send_active_reset+0x89/0x180
[   64.544920]  [<ffffffff8111e16a>] ? __d_free+0x3a/0x60
[   64.544929]  [<ffffffff81576f6c>] tcp_disconnect+0x6c/0x3c0
[   64.544938]  [<ffffffff81576ce4>] tcp_close+0x3e4/0x480
[   64.544946]  [<ffffffff81598a42>] inet_release+0x42/0x70
[   64.544956]  [<ffffffff814ce558>] sock_release+0x18/0x60
[   64.544964]  [<ffffffff814cea32>] sock_close+0x12/0x30
[   64.544974]  [<ffffffff8110e20e>] __fput+0xee/0x200
[   64.544982]  [<ffffffff8100dcfd>] ? xen_force_evtchn_callback+0xd/0x10
[   64.544991]  [<ffffffff8110e337>] fput+0x17/0x20
[   64.545000]  [<ffffffff8110a2f8>] filp_close+0x58/0x90
[   64.545009]  [<ffffffff8100e51f>] ? xen_restore_fl_direct_end+0x0/0x1
[   64.545019]  [<ffffffff8105689c>] put_files_struct+0xcc/0xe0
[   64.545028]  [<ffffffff81056900>] exit_files+0x50/0x60
[   64.545036]  [<ffffffff81058507>] do_exit+0x1b7/0x7f0
[   64.545046]  [<ffffffff81065c36>] ? __dequeue_signal+0x16/0x160
[   64.545055]  [<ffffffff81058b7c>] do_group_exit+0x3c/0xa0
[   64.545064]  [<ffffffff810682a8>] get_signal_to_deliver+0x1b8/0x380
[   64.545073]  [<ffffffff81010649>] do_notify_resume+0xc9/0x880
[   64.545084]  [<ffffffff8100b8bb>] ? xen_mc_flush+0x11b/0x1d0
[   64.545095]  [<ffffffff8102cad2>] ? paravirt_end_context_switch+0x12/0x30
[   64.545106]  [<ffffffff81047a7b>] ? finish_task_switch+0x5b/0xb0
[   64.545115]  [<ffffffff810112ce>] int_signal+0x12/0x17
[ 64.545121] Code: 81 ea d8 1f 00 00 48 3b 42 20 73 07 48 8b 50 f9 31 c0 c3 31 d2 48 c7 c0 f2 ff ff ff c3 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
[   64.545252] RIP  [<ffffffff813eb23b>] memcpy_c+0xb/0x20
[   64.545262]  RSP <ffff880152e7d9b0>
[   64.545269] ---[ end trace 11cf940a2c626919 ]---
[   64.545276] Fixing recursive fault but reboot is needed!
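
As a quick cross-check of the two traces, the sketch below tests the copy pointers for 8-byte alignment. It assumes the faulting instruction is the one marked `<f3>` in the Code: lines (`f3 48 a5`, which decodes as `rep movsq`, an 8-byte string move from RSI to RDI). The helper name and the dictionary layout are illustrative only; the register values are copied from the oopses.

```python
def misalignment(addr: int, width: int = 8) -> int:
    """Return how many bytes addr lies past the previous width-byte boundary."""
    return addr % width


# RSI (source) and RDI (destination) at the time of the fault, from the two traces.
traces = {
    "modified": {"rsi": 0xFFFF880176D91958, "rdi": 0xFFFF88017360CC8C},
    "original": {"rsi": 0xFFFF880178185458, "rdi": 0xFFFF8801731E4C8C},
}

for name, regs in traces.items():
    for reg, value in regs.items():
        offset = misalignment(value)
        state = "aligned" if offset == 0 else f"misaligned by {offset}"
        print(f"{name}: {reg}={value:#018x} {state}")
```

In both traces the source pointer is 8-byte aligned while the destination ends in ...c8c, which is 4 bytes past an 8-byte boundary, so an 8-byte store to it is unaligned. On its own that is normal for the kernel's memcpy; it only matters here because alignment checking appears to be active when the copy runs.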


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


