
Re: [Xen-devel] Re: Xen-unstable save error



On 06/21/2010 03:24 PM, Keir Fraser wrote:
> On 21/06/2010 14:18, "Michal Novotny" <minovotn@xxxxxxxxxx> wrote:
>
>> So there's no xs_suspend_evtchn_port (or anything xs_*) function being
>> exported by /usr/lib64/libxenctrl.so (which is the symlink to
>> /usr/lib64/libxenctrl.so.4.0.0), therefore:
>>
>> #objdump -x /usr/lib64/libxenctrl.so.4.0.0 | grep xs_
>> #objdump -x /xen-unstable.hg/tools/libxc/libxenctrl.so.4.0.0  | grep xs_
>
> Libxenstore, libxenstore, libxen**STORE**.
>
> #nm /my/path/to/libxenstore.so.3.0.0 | grep xs_sus
> 000000000000346a T xs_suspend_evtchn_port
>
>   -- Keir


Oh, sorry about that, and thanks for catching my mistake; I overlooked it. Nevertheless, it's still not working:

[2010-06-21 17:27:39 4305] DEBUG (XendCheckpoint:126) [xc_save]: /usr/lib64/xen/bin/xc_save 56 1 0 0 4
[2010-06-21 17:27:39 4305] INFO (XendCheckpoint:410) xc_save: failed to get the suspend evtchn port
[2010-06-21 17:27:39 4305] INFO (XendCheckpoint:410)
[2010-06-21 17:27:39 4305] DEBUG (XendCheckpoint:381) suspend
[2010-06-21 17:27:39 4305] DEBUG (XendCheckpoint:129) In saveInputHandler suspend
[2010-06-21 17:27:39 4305] DEBUG (XendCheckpoint:131) Suspending 1 ...
[2010-06-21 17:27:39 4305] DEBUG (XendDomainInfo:521) XendDomainInfo.shutdown(suspend)
[2010-06-21 17:27:39 4305] DEBUG (XendDomainInfo:1877) XendDomainInfo.handleShutdownWatch
[2010-06-21 17:27:39 4305] INFO (XendDomainInfo:538) HVM save:remote shutdown dom 1!
[2010-06-21 17:27:39 4305] INFO (XendCheckpoint:137) Domain 1 suspended.
[2010-06-21 17:27:39 4305] INFO (XendDomainInfo:2074) Domain has shutdown: name=migrating-rhel5-32fv-stubdom id=1 reason=suspend.
[2010-06-21 17:27:40 4305] INFO (image:538) signalDeviceModel:restore dm state to running
[2010-06-21 17:27:40 4305] DEBUG (XendCheckpoint:146) Written done
[2010-06-21 17:27:46 4305] DEBUG (XendDomainInfo:3067) XendDomainInfo.destroy: domid=1
[2010-06-21 17:27:46 4305] DEBUG (XendDomainInfo:2397) Destroying device model
[2010-06-21 17:27:47 4305] INFO (image:615) migrating-rhel5-32fv-stubdom device model terminated


# ls -al rhel5-32fv.sav
-rwxr-xr-x 1 root root 54657427 Jun 21 17:27 rhel5-32fv.sav

# xm restore rhel5-32fv.sav
Error: /usr/lib64/xen/bin/xc_restore 4 2 2 3 1 1 1 0 failed
Usage: xm restore <CheckpointFile> [-p]

Restore a domain from a saved state.
  -p, --paused                   Do not unpause domain after restoring it

# tail /var/log/xen/xend.log
[2010-06-21 17:29:21 4305] INFO (image:822) Need to create platform device.[domid:2]
[2010-06-21 17:29:21 4305] DEBUG (XendCheckpoint:273) restore:shadow=0x9, _static_max=0x40000000, _static_min=0x0,
[2010-06-21 17:29:21 4305] DEBUG (XendCheckpoint:292) [xc_restore]: /usr/lib64/xen/bin/xc_restore 4 2 2 3 1 1 1 0
[2010-06-21 17:29:22 4305] INFO (XendCheckpoint:410) xc: error: Error when reading batch size (0 = Success): Internal error
[2010-06-21 17:29:22 4305] INFO (XendCheckpoint:410) xc: error: error when buffering batch, finishing (0 = Success): Internal error
[2010-06-21 17:29:22 4305] INFO (XendCheckpoint:410) xc: error: error zeroing magic pages (22 = Invalid argument): Internal error
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:3067) XendDomainInfo.destroy: domid=2
[2010-06-21 17:29:22 4305] ERROR (XendDomainInfo:3081) XendDomainInfo.destroy: domain destruction failed.
Traceback (most recent call last):
File "usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 3074, in destroy
    xc.domain_pause(self.domid)
Error: (3, 'No such process')
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:2402) No device model
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:2404) Releasing devices
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:2410) Removing vif/0
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:1272) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:2410) Removing vbd/768
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:1272) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/768
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:2410) Removing vbd/2048
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:1272) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/2048
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:2410) Removing vfb/0
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:1272) XendDomainInfo.destroyDevice: deviceClass = vfb, device = vfb/0
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:2410) Removing console/0
[2010-06-21 17:29:22 4305] DEBUG (XendDomainInfo:1272) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
[2010-06-21 17:29:22 4305] ERROR (XendCheckpoint:344) /usr/lib64/xen/bin/xc_restore 4 2 2 3 1 1 1 0 failed
Traceback (most recent call last):
File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 296, in restore
    forkHelper(cmd, fd, handler.handler, True)
File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 398, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib64/xen/bin/xc_restore 4 2 2 3 1 1 1 0 failed
[2010-06-21 17:29:22 4305] ERROR (XendDomain:1182) Restore failed
Traceback (most recent call last):
File "usr/lib64/python2.4/site-packages/xen/xend/XendDomain.py", line 1166, in domain_restore_fd
    dominfo = XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 345, in restore
    raise exn
XendError: /usr/lib64/xen/bin/xc_restore 4 2 2 3 1 1 1 0 failed

So now it seems to be linked against the correct library, but it still can't get the suspend port:
>> xc_save: failed to get the suspend evtchn port

This is being called from xc_save.c (snippet from line 192):
...
        port = xs_suspend_evtchn_port(si.domid);

        if (port < 0)
            warnx("failed to get the suspend evtchn port\n");
        else {
            ... suspend...
        }
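
For illustration, the lookup behind that call can be sketched roughly as below. This is not the libxenstore implementation: the helper names suspend_evtchn_path and parse_evtchn_port are hypothetical, and the only detail taken from this thread is the xenstore path /local/domain/<domid>/device/suspend/event-channel. The sketch assumes the node holds a decimal port number and that a missing node is reported as a negative value, which is what the "port < 0" check above expects.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format the xenstore path for a domain's suspend
 * event channel. Returns 0 on success, -1 if buf is too small. */
static int suspend_evtchn_path(char *buf, size_t len, int domid)
{
    int n = snprintf(buf, len,
                     "/local/domain/%d/device/suspend/event-channel", domid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: convert the string read from that node into a port.
 * A NULL or empty value models a missing node and yields -1, as does
 * anything that is not a plain non-negative decimal number. */
static int parse_evtchn_port(const char *value)
{
    char *end;
    long port;

    if (value == NULL || *value == '\0')
        return -1;
    errno = 0;
    port = strtol(value, &end, 10);
    if (errno != 0 || *end != '\0' || port < 0 || port > INT_MAX)
        return -1;
    return (int)port;
}
```

With a guest that never writes the node, the value passed to parse_evtchn_port would be NULL, so the caller would see -1 and take the warnx() branch.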

I had a look at the code in xenstore/xs.c and saw that it reads the value at:

/local/domain/%d/device/suspend/event-channel

but when I tried to read it using:

xenstore-ls /local/domain/3/device/suspendupstream

where 3 is my domid, I saw nothing. Listing the parent node shows just:

# xenstore-ls /local/domain/3/device
vfb = ""
 0 = ""
  state = "1"
  backend-id = "0"
  backend = "/local/domain/0/backend/vfb/3/0"
vbd = ""
 768 = ""
  backend-id = "0"
  virtual-device = "768"
  device-type = "disk"
  state = "1"
  backend = "/local/domain/0/backend/vbd/3/768"
 2048 = ""
  backend-id = "0"
  virtual-device = "2048"
  device-type = "disk"
  state = "1"
  backend = "/local/domain/0/backend/vbd/3/2048"
vif = ""
 0 = ""
  state = "1"
  backend-id = "0"
  backend = "/local/domain/0/backend/vif/3/0"
console = ""
 0 = ""
  state = "1"
  backend-id = "0"
  backend = "/local/domain/0/backend/console/3/0"
#
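
The same check can be done mechanically on a captured listing. The following is only a sketch: has_top_level_node is a hypothetical helper, and it assumes the xenstore-ls text layout shown above, where top-level nodes start in column 0 and children are indented.

```c
#include <string.h>

/* Return 1 if `listing` (xenstore-ls style text) contains an unindented
 * line starting with "<name> = ", otherwise 0. Indented child entries are
 * ignored because they begin with a space. */
static int has_top_level_node(const char *listing, const char *name)
{
    size_t nlen = strlen(name);
    const char *line = listing;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, name, nlen) == 0 &&
            strncmp(line + nlen, " = ", 3) == 0)
            return 1;
        line = strchr(line, '\n');   /* find the end of this line */
        if (line != NULL)
            line++;                  /* step past the newline */
    }
    return 0;
}
```

Run against the listing above, this would report vfb, vbd, vif and console as present and suspend as absent.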

My guest is a RHEL-5 i386 guest, and it seems that the suspend port is missing. AFAIK, you started using SUSPEND_CANCEL some time ago, which requires a modified kernel.

Could that be the issue, or how does the SUSPEND_CANCEL functionality work here?

Thanks,
Michal

--
Michal Novotny<minovotn@xxxxxxxxxx>, RHCE
Virtualization Team (xen userspace), Red Hat

