WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] [PATCH] xen: correctly restore pfn_to_mfn_list_list after resume
From: Bartosz Lis <bartoszl@xxxxxxxxxxxxx>
Date: Mon, 30 Nov 2009 11:17:40 +0100
Delivery-date: Mon, 30 Nov 2009 02:21:56 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1258803169-17191-1-git-send-email-ian.campbell@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Politechnika Łódzka
References: <1258803169-17191-1-git-send-email-ian.campbell@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.12.3 (Linux/2.6.28.10-3; KDE/4.3.3; x86_64; ; )
On Saturday, 21 November 2009 at 12:32:49, Ian Campbell wrote:
> pvops kernels >= 2.6.30 can currently only be saved and restored once. The
> second attempt to save results in:
> 
>     ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
>     ERROR Internal error: entry 0: p2m_frame_list[0] is 0xf2c2c2c2, max 0x120000
>     ERROR Internal error: Failed to map/save the p2m frame list
> 
> I finally narrowed it down to:
> 
>     commit cdaead6b4e657f960d6d6f9f380e7dfeedc6a09b
>         Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@xxxxxxxxxx>
>         Date:   Fri Feb 27 15:34:59 2009 -0800
> 
>             xen: split construction of p2m mfn tables from registration
> 
>             Build the p2m_mfn_list_list early with the rest of the p2m table,
>             but register it later when the real shared_info structure is in
>             place.
> 
>             Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@xxxxxxxxxx>
> 
> The unforeseen side-effect of this change was to cause the mfn list list to
> not be rebuilt on resume. Prior to this change it would have been rebuilt
> via xen_post_suspend() -> xen_setup_shared_info() ->
> xen_setup_mfn_list_list().
> 
> Fix by explicitly calling xen_build_mfn_list_list() from
> xen_post_suspend().
> 
[---]
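
For readers unfamiliar with the error quoted above, here is a minimal sketch of the kind of sanity check that message describes: each entry of the frame list is compared against a maximum frame number, and out-of-range entries are reported. This is an illustration only. The function name `find_bad_frames` is hypothetical, and the real check in the save code may do more than a simple bounds comparison. The two numbers are taken from the quoted error message.

```python
def find_bad_frames(p2m_frame_list, max_frame):
    """Return (index, value) pairs for entries that fail a simple bounds check.

    Simplified assumption: an entry is "bad" when it is >= max_frame.
    """
    return [(i, frame) for i, frame in enumerate(p2m_frame_list)
            if frame >= max_frame]


# Values copied from the quoted error message.
MAX_FRAME = 0x120000
stale_list = [0xF2C2C2C2]

for index, frame in find_bad_frames(stale_list, MAX_FRAME):
    print(f"entry {index}: p2m_frame_list[{index}] is {frame:#x}, max {MAX_FRAME:#x}")
```

A frame list that was not rebuilt after resume would trip a check like this on the second save, which matches the symptom described in the quoted text.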

Ian,

I downloaded and compiled the pvops kernel with your fixes a week ago 
(commit e14a6cdfdf5b40330297701b4e6963f9eff6d8df Sat, 21 Nov 2009 23:59:07 
+0000 (07:59 +0800)). It has now been running stably as xen0 for about 5 days 
on a dual AMD Opteron 248 and a dual Intel Xeon E5520.

1. Opteron 248 guest

For all that time I have been compiling the Linux kernel in a loop (~700 
compilation rounds) on a virtual machine with 2 vcpus. From time to time I 
migrated the machine back and forth between two physical machines, both with 
Opteron 248s. Save/restore/save/restore works fine: the kernel continues to 
compile, and even the ssh session was not closed.

I have tested only 64-bit kernel/userland in both xen0/U.
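
A repeated save/restore test of this kind can be scripted. The sketch below is only an illustration and not necessarily how the tests above were run. It assumes the `xm` toolstack (`xm save <domain> <file>` and `xm restore <file>`); the domain name and save path in the usage comment are hypothetical placeholders.

```python
import subprocess


def save_restore_cycle(domain, save_path, run=subprocess.check_call):
    """Save a running domain to save_path, then restore it from that file."""
    run(["xm", "save", domain, save_path])
    run(["xm", "restore", save_path])


def repeat_cycles(domain, save_path, cycles=2, run=subprocess.check_call):
    """Run several save/restore cycles in a row.

    The bug discussed in this thread only appeared on the second save,
    so at least two cycles are needed to exercise it.
    """
    for _ in range(cycles):
        save_restore_cycle(domain, save_path, run=run)
    return cycles


# Example (hypothetical names), to be run as root in dom0:
#   repeat_cycles("guest1", "/var/tmp/guest1.save", cycles=2)
```

The `run` parameter exists so the command sequence can be checked without a Xen host by passing a function that merely records the commands.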

2. Xeon E5520 guest

With 64-bit kernel/userland in both xen0/U, save/restore/save/restore works 
fine: the kernel continues to compile and the ssh session stays open. I used 
2 vcpus in the guest.

I have no way to test live migration on the Xeon E5520 (no SAN 
connection).

Unfortunately, save/restore does not work with a 64-bit kernel/userland in 
dom0 and a 32-bit kernel/userland in domU (tested with 1 and then with 2 
vcpus). The save hangs. The save file is ~1.5 kB long, and I get the following 
on the guest's console:
----8<----
[   34.729250] BUG: unable to handle kernel paging request at c1527000
[   34.729271] IP: [<c1006593>] xen_set_pmd+0x73/0xb0
[   34.729288] *pdpt = 0000000403162027
[   34.729299] Oops: 0003 [#1] SMP
[   34.729312] last sysfs file: /sys/module/ip_tables/initstate
[   34.729321] Modules linked in: sch_sfq xt_limit ipt_REJECT xt_tcpudp ipt_LOG xt_state xt_multiport iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_mangle ip_tables x_tables xenfs dm_multipath scsi_dh dm_mod st sd_mod crc_t10dif lpfc qla2xxx scsi_transport_fc scsi_tgt qla1280 scsi_mod psmouse uhci_hcd ehci_hcd usbcore pcspkr xen_netfront evdev ext3 jbd mbcache
[   34.729485]
[   34.729493] Pid: 1686, comm: kstop/0 xid: #0 Not tainted (2.6.31.6x_xenUnogrsecuritypae-BL5.5 #1)
[   34.729504] EIP: 0061:[<c1006593>] EFLAGS: 00010046 CPU: 0
[   34.729513] EIP is at xen_set_pmd+0x73/0xb0
[   34.729520] EAX: c1527000 EBX: 031f3067 ECX: 00000004 EDX: c179b000
[   34.729529] ESI: 00000004 EDI: c1527000 EBP: ddd75eb0 ESP: ddd75ea0
[   34.729538]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[   34.729547] Process kstop/0 (pid: 1686, ti=ddd74000 task=df8f48c0 task.ti=ddd74000)
[   34.729557] Stack:
[   34.729563]  c1527000 031f3067 1fd61067 c1527000 ddd75ec4 c10cd650 00000000 00000000
[   34.729595] <0> 00200000 ddd75f20 c10ceb14 00000000 123ab067 00000000 00000fff 00001000
[   34.729630] <0> 00000fff c1463000 c1474f60 01ba9067 00000000 ddd75f14 c1006f3a c153001c
[   34.729670] Call Trace:
[   34.729682]  [<c10cd650>] ? __pte_alloc_kernel+0xa0/0xb0
[   34.729693]  [<c10ceb14>] ? apply_to_page_range+0x314/0x330
[   34.729705]  [<c1006f3a>] ? xen_force_evtchn_callback+0x1a/0x30
[   34.729717]  [<c10079c6>] ? arch_gnttab_unmap+0x26/0x30
[   34.729729]  [<c1007950>] ? unmap_pte_fn+0x0/0x50
[   34.729742]  [<c1204591>] ? gnttab_suspend+0x41/0x50
[   34.729753]  [<c120756a>] ? xen_suspend+0x3a/0xf0
[   34.729765]  [<c108873d>] ? stop_cpu+0x8d/0xd0
[   34.729776]  [<c1054022>] ? worker_thread+0x112/0x220
[   34.729787]  [<c10886b0>] ? stop_cpu+0x0/0xd0
[   34.729798]  [<c10587e0>] ? autoremove_wake_function+0x0/0x40
[   34.729810]  [<c1053f10>] ? worker_thread+0x0/0x220
[   34.729821]  [<c10584ec>] ? kthread+0x7c/0x90
[   34.729831]  [<c1058470>] ? kthread+0x0/0x90
[   34.729843]  [<c100ad17>] ? kernel_thread_helper+0x7/0x10
[   34.729851] Code: 00 75 48 8b 45 f0 89 da 89 f1 83 05 fc 32 53 c1 01 e8 e2 fe ff ff 8b 5d f4 8b 75 f8 8b 7d fc 89 ec 5d c3 90 8d 74 26 00 8b 45 f0 <89> 18 89 70 04 eb e4 ba e0 32 53 c1 b9 33 00 00 00 31 c0 89 d7
[   34.730076] EIP: [<c1006593>] xen_set_pmd+0x73/0xb0 SS:ESP 0069:ddd75ea0
[   34.730093] CR2: 00000000c1527000
[   34.730102] ---[ end trace cd1b831872a4c87f ]---
[   34.730137] ------------[ cut here ]------------
[   34.730147] WARNING: at /root/rpm/BUILD/kernel-xenUnogrsecuritypae-2.6.31.6x/linux-2.6.31/kernel/time/timekeeping.c:102 getnstimeofday+0x102/0x110()
[   34.730160] Modules linked in: sch_sfq xt_limit ipt_REJECT xt_tcpudp ipt_LOG xt_state xt_multiport iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_mangle ip_tables x_tables xenfs dm_multipath scsi_dh dm_mod st sd_mod crc_t10dif lpfc qla2xxx scsi_transport_fc scsi_tgt qla1280 scsi_mod psmouse uhci_hcd ehci_hcd usbcore pcspkr xen_netfront evdev ext3 jbd mbcache
[   34.730316] Pid: 0, comm: swapper xid: #0 Tainted: G      D    2.6.31.6x_xenUnogrsecuritypae-BL5.5 #1
[   34.730326] Call Trace:
[   34.730338]  [<c1333d7a>] ? printk+0x18/0x1e
[   34.730349]  [<c1040fcd>] warn_slowpath_common+0x6d/0xa0
[   34.730360]  [<c106b7d2>] ? getnstimeofday+0x102/0x110
[   34.730370]  [<c106b7d2>] ? getnstimeofday+0x102/0x110
[   34.730381]  [<c1041015>] warn_slowpath_null+0x15/0x20
[   34.730392]  [<c106b7d2>] getnstimeofday+0x102/0x110
[   34.730403]  [<c105c716>] ktime_get_ts+0x26/0x60
[   34.730413]  [<c105c766>] ktime_get+0x16/0x40
[   34.730425]  [<c107056c>] tick_nohz_stop_sched_tick+0x6c/0x390
[   34.730437]  [<c1009187>] cpu_idle+0x27/0x80
[   34.730449]  [<c1323e25>] rest_init+0x55/0x60
[   34.730461]  [<c14a186c>] start_kernel+0x2fb/0x301
[   34.730472]  [<c14a138e>] ? unknown_bootoption+0x0/0x1ad
[   34.730483]  [<c14a108d>] i386_start_kernel+0x7c/0x83
[   34.730494]  [<c14a418e>] xen_start_kernel+0x517/0x51f
[   34.730502] ---[ end trace cd1b831872a4c880 ]---
----8<----
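
As a reading aid for the trace above: the value in the "Oops: 0003" line is the x86 page-fault error code. The small sketch below decodes its three lowest bits (present/protection, write, user mode). The function name `decode_page_fault` is hypothetical, and higher bits of the error code are deliberately ignored.

```python
# (bit mask, meaning when set, meaning when clear) for the low error-code bits.
PF_FLAGS = (
    (0x01, "protection violation (page was present)", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
)


def decode_page_fault(error_code):
    """Describe the three lowest bits of an x86 page-fault error code."""
    return [when_set if error_code & mask else when_clear
            for mask, when_set, when_clear in PF_FLAGS]


# "Oops: 0003" from the trace above.
print(decode_page_fault(0x0003))
```

For the code 0003 this yields a kernel-mode write to a page that is mapped but not writable, at the address reported in CR2 (c1527000).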

I'm going to try newer commits.

Regards,

-- 
Bartosz Lis @ Inst. of Information Technology, Technical Univ. of Lodz, Poland
   bartoszl @ ics.p.lodz.pl

