   
 
 
xen-users

[Xen-users] "xm save" only works once...

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] "xm save" only works once...
From: Ralph Passgang <ralph@xxxxxxxxxxxxx>
Date: Mon, 15 Aug 2005 20:38:31 +0200
Delivery-date: Mon, 15 Aug 2005 18:43:34 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.8.1
Hi,

I am using Xen-2.0.7 on a dual Intel Xeon 2.8GHz system with 4GB of RAM. I am 
using 2.6.11 as the kernel for my domain 0. Domain 0 runs Debian Sarge with a 
backported Xen 2.0.7 package (only small changes to the Debian 2.0.6 package, 
nothing important enough to mention). All kernels were compiled from 
vanilla kernels with the Xen patch. The domain U's are using 2.6.11 or 2.4.30 
(Debian, SUSE).

I have no problems within the domains and everything is running very smoothly, 
except for one thing (which also did not work correctly in xen-2.0.6 for me):
I can save a domain with "xm save <domainname> <suspendfile>" once and I can 
restore this domain again, but if I try a second "xm save ..." it simply 
seems to hang. Nothing happens, and the last entries in the logs are these 
lines:

==> /var/log/xend.log <==
[2005-08-15 20:12:27 xend] INFO (XendMigrate:380) Save BEGIN: ['save', ['id', 
'1'], ['state', 'begin'], ['domain', '5'], ['file', '/suspend/vm-ralph']]
[2005-08-15 20:12:27 xend] INFO (XendRoot:113) EVENT> xend.domain.save 
['vm-ralph', '5', 'begin', ['save', ['id', '1'], ['state', 'begin'], 
['domain', '5'], ['file', '/suspend/vm-ralph']]]

==> /var/log/xfrd.log <==
3808 [INF] XFRD> Accepted connection from 127.0.0.1:3905 on 2
4165 [INF] XFRD> Xfr service for 127.0.0.1:3905
[DEBUG] Conn_init> flags=1
[DEBUG] Conn_init> write stream...
[DEBUG] stream_init>mode=w flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_init> read stream...
[DEBUG] stream_init>mode=r flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_sxpr>
(xfr.hello 1 0)[DEBUG] Conn_sxpr< err=0
[DEBUG] Conn_sxpr>
(xfr.save 5 "(domain (id 5) (name vm-ralph) (memory 127) (maxmem 128) (state 
-b---) (cpu 3) (cpu_time 1.583158713) (up_time 1401.25794005) (start_time 
1124128146.12) (console (status listening) (id 12) (domain 5) (local_port 12) 
(remote_port 1) (console_port 9605)) (devices (vif (idx 0) (vif 0) (mac 
aa:00:00:00:00:22) (vifname vif5.0) (ip 212.79.XXX.XXX/32) (evtchn 17 4) 
(index 0)) (vbd (idx 0) (vdev 2049) (device 65030) (mode w) (dev sda1) (uname 
phy:xen-volumes/vm-ralph) (node xen-volumes/vm-ralph) (index 0)) (vbd (idx 1) 
(vdev 2050) (device 65031) (mode w) (dev sda2) (uname 
phy:xen-volumes/swap-ralph) (node xen-volumes/swap-ralph) (index 1))) (config 
(vm (name vm-ralph) (memory 128) (cpu 3) (image (linux 
(kernel /boot/xen-linux-2.6.11-domu-tops1) 
(ramdisk /boot/xen-linux-2.6.11-domu-tops1-modules) (root '/dev/sda1 ro'))) 
(device (vbd (uname phy:xen-volumes/vm-ralph) (dev sda1) (mode w))) (device 
(vbd (uname phy:xen-volumes/swap-ralph) (dev sda2) (mode w))) (device (vif 
(mac aa:00:00:00:00:22) (ip 212.79.XXX.XXX/32))))))" /suspend/vm-ralph)
[DEBUG] Conn_sxpr< err=0
[1124129547.387983] xc_linux_save start 5

xc_linux_save start 5
                     

I can strace the "xm save" process, but there is not much action:

xen:/var/log# ps fax |grep xm
 4164 pts/0    S+     0:00  |               \_ python /usr/sbin/xm save 
vm-ralph /suspend/vm-ralph
xen:/var/log# strace -p 4164
Process 4164 attached - interrupt to quit
recv(3, 

An xfrd thread even seems to have been spawned, but it shows more or less the 
same as the xm save process:

xen:/var/log# ps fax |grep xfrd
 3808 ?        S      0:00 xfrd
 4165 ?        SL     0:00  \_ xfrd
xen:/var/log# strace -p 4165
Process 4165 attached - interrupt to quit
read(3,                                      

I can press ctrl-c and the "xm save" aborts with the following error (I waited 
over 3 minutes):

Traceback (most recent call last):
  File "/usr/sbin/xm", line 9, in ?
    main.main(sys.argv)
  File "/usr/lib/python2.3/site-packages/xen/xm/main.py", line 808, in main
    xm.main(args)
  File "/usr/lib/python2.3/site-packages/xen/xm/main.py", line 106, in main
    self.main_call(args)
  File "/usr/lib/python2.3/site-packages/xen/xm/main.py", line 124, in 
main_call
    p.main(args[1:])
  File "/usr/lib/python2.3/site-packages/xen/xm/main.py", line 276, in main
    server.xend_domain_save(dom, savefile)
  File "/usr/lib/python2.3/site-packages/xen/xend/XendClient.py", line 244, in 
xend_domain_save
    {'op'      : 'save',
  File "/usr/lib/python2.3/site-packages/xen/xend/XendClient.py", line 148, in 
xendPost
    return self.client.xendPost(url, data)
  File "/usr/lib/python2.3/site-packages/xen/xend/XendProtocol.py", line 79, 
in xendPost
    return self.xendRequest(url, "POST", args)
  File "/usr/lib/python2.3/site-packages/xen/xend/XendProtocol.py", line 143, 
in xendRequest
    resp = conn.getresponse()
  File "/usr/lib/python2.3/httplib.py", line 781, in getresponse
    response.begin()
  File "/usr/lib/python2.3/httplib.py", line 273, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.3/httplib.py", line 231, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.3/socket.py", line 323, in readline
    data = recv(1)
KeyboardInterrupt

After that it doesn't matter whether I shut down and recreate the domain before 
I try to save it for the second time. It happens every time after the 
first successful save and restore. Sometimes it even happens on the first 
"xm save" attempt.
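
The save/restore sequence described above could be reproduced under a timeout, so that a hanging "xm save" is reported instead of blocking the shell. The following is only a minimal sketch: it assumes a modern Python 3 interpreter in domain 0 (the setup above ships Python 2.3), and the function names, the cycle count and the 180-second timeout are illustrative choices, not part of Xen. Only the "xm save <domain> <file>" and "xm restore <file>" invocations are taken from the xm tool itself.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it did not finish in time."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return None


def save_restore_cycles(domain, savefile, cycles=2, timeout_s=180,
                        run=run_with_timeout):
    """Repeat 'xm save' then 'xm restore'.

    Returns the number of the first cycle in which a step timed out or
    failed, or 0 if all cycles completed. The 'run' parameter is a
    hypothetical hook so the loop can be exercised without Xen.
    """
    for cycle in range(1, cycles + 1):
        steps = (["xm", "save", domain, savefile],
                 ["xm", "restore", savefile])
        for cmd in steps:
            rc = run(cmd, timeout_s)
            if rc != 0:
                state = "timed out" if rc is None else "exit code %d" % rc
                print("cycle %d: %s -> %s" % (cycle, " ".join(cmd), state))
                return cycle
    return 0
```

Calling save_restore_cycles("vm-ralph", "/suspend/vm-ralph") would be expected to return 2 on a system that shows the behaviour described here. Note that killing the xm client on timeout only stops the client; it does not necessarily undo whatever xend has already started.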

It even seems that Xen leaves the "half-saved" domain in a broken state, 
because I cannot shut down the domain correctly after the second "xm save" 
attempt. I can ssh into it and type "halt" and it shuts down, but Xen (xm 
list) still thinks that the domain is running. Even "xm destroy <domainname>" 
doesn't help. I have to reboot the physical machine to get the domain working 
correctly again.
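
Whether a domain is still listed after "halt" or "xm destroy" could also be checked from a script. This is a rough sketch under a stated assumption: that "xm list" prints one header line followed by one line per domain, with the domain name in the first column. That layout is not shown in this message, and the function name is hypothetical.

```python
def listed_domains(xm_list_output):
    """Return the first column of each line after the header line.

    Assumes one header line and domain names without whitespace.
    """
    names = []
    for line in xm_list_output.splitlines()[1:]:
        fields = line.split()
        if fields:  # skip blank lines
            names.append(fields[0])
    return names
```

The text passed in would be the captured output of "xm list"; if "vm-ralph" is still in the returned list after the guest has halted, Xen still considers the domain to be present.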

Because this is supposed to become a production system very soon, I would 
appreciate help very much. More information (like xm dmesg) is available on 
request... ;-PP

--Ralph

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users