
Re: [Xen-devel] Error in XendCheckpoint: failed to flush file




Hi Keir,

here are some of the symptoms I get.

----------------

on x86-32 with changeset 14142 (this is on a blade) after a fresh 'hg clone' and build:

In the xm-test suite, for example, the 'restore' test cases fail:

make -C tests/restore check-TESTS

REASON: Domain still running after save!
FAIL: 01_restore_basic_pos.test
PASS: 02_restore_badparm_neg.test
PASS: 03_restore_badfilename_neg.test

REASON: Failed to create domain
FAIL: 04_restore_withdevices_pos.test


Similar errors occur in the 'save' test cases:

REASON: Domain still running after save!
FAIL: 01_save_basic_pos.test
PASS: 02_save_badparm_neg.test
PASS: 03_save_bogusfile_neg.test


I also see this in 'xm dmesg':

(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).
(XEN) platform_hypercall.c:142: Domain 0 says that IO-APIC REGSEL is good
(XEN) grant_table.c:286:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:251:d0 Bad ref (2097664).
(XEN) grant_table.c:286:d0 Bad flags (0) or dom (0). (expected dom 0)

When I reboot that blade with the 'reboot' command, it does not actually reboot but hangs after domain-0 has shut down completely. I do not see this problem on other machines, though.

------------

on x86-64 (this is also a blade) after a fresh 'hg clone' and build:
Intel Xeon 3.2 GHz
2 physical processors, each with hyperthreading -> 4 logical processors
domain-0 has dom0_mem=10240000


The 'save' tests just crashed that machine (twice). :-/

I'll post a migration test, 02_migrate_localhost_loop, that exposes the following errors inside the guest on x86-64 (only!). To see these messages I set the 'debugMe' variable in xm-test/lib/XmTestLib/Console.py (line 68) to 'True'.

@%@%> XENBUS error -12 while reading message
XENBUS error -12 while reading message
XENBUS unexpected type [1325400064], expected [4]
XENBUS error -12 while reading message
XENBUS error -12 while reading message
[...]

XENBUS error -12 while reading message
XENBUS: Unable to read cpu state
XENBUS: Unable to read cpu state

When building the sources with 'make -j 16' that blade's VNC output freezes at some point. Pinging it still works, but ssh'ing into it does not respond within reasonable time. Building the sources with non-parallel 'make' works fine.

  Stefan

xen-devel-bounces@xxxxxxxxxxxxxxxxxxx wrote on 02/28/2007 02:04:22 AM:

> I'm not sure the two are related. Fsync, lseek(), fadvise() will all fail if
> the fd maps to a socket. The failure is harmless and the error return code
> is ignored. The error to xend.log is overly noisy and needs cleaning up but
> unfortunately the suspend/resume problems probably lie elsewhere. What
> failure symptoms do you see?
>
>  -- Keir
>
> On 28/2/07 04:46, "Stefan Berger" <stefanb@xxxxxxxxxx> wrote:
>
> > I get these errors pretty often lately. This is on an x86-32 machine with
> > changeset 14142. Does anyone else see this? Local migration and
> > suspend/resume fail quite frequently.
> >
> > [2007-02-27 23:39:56 20114] DEBUG (XendCheckpoint:236)
> > [xc_restore]: /usr/lib/xen/bin/xc_restore 23 262 18432 1 2 0 0 0
> > [2007-02-27 23:39:56 20114] INFO (XendCheckpoint:343) xc_linux_restore
> > start: max_pfn = 4800
> > [2007-02-27 23:39:56 20114] INFO (XendCheckpoint:343) Reloading memory
> > pages: 0%
> > [2007-02-27 23:39:56 20114] INFO (XendCheckpoint:343) Saving memory
> > pages: iter 1  37%ERROR Internal error: Failed to flush file: Invalid
> > argument (22 = Invalid argument)
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
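
For reference, a minimal sketch of the point Keir makes in the quoted text above: fsync() and lseek() fail when the file descriptor refers to a socket. This is not Xen code; the helper names probe_fd and probe_socket are made up for illustration, and the script only probes a local socket pair. On Linux I would expect fsync to report errno 22 (Invalid argument), which would match the "Failed to flush file" line in the log, but the exact errno may differ on other platforms.

```python
import errno
import os
import socket


def probe_fd(fd):
    """Try fsync and lseek on fd; return {call name: errno, or None on success}."""
    results = {}
    calls = (
        ("fsync", lambda: os.fsync(fd)),
        ("lseek", lambda: os.lseek(fd, 0, os.SEEK_CUR)),
    )
    for name, call in calls:
        try:
            call()
            results[name] = None
        except OSError as exc:
            results[name] = exc.errno
    return results


def probe_socket():
    """Run the probes against one end of a connected local socket pair."""
    a, b = socket.socketpair()
    try:
        return probe_fd(a.fileno())
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    for name, err in probe_socket().items():
        if err is None:
            print(f"{name}: ok")
        else:
            # errorcode maps the number to its symbolic name, e.g. EINVAL
            symbol = errno.errorcode.get(err, "?")
            print(f"{name}: {err} ({symbol}) = {os.strerror(err)}")
```

Running the same probe_fd on a descriptor for a regular file should report no error for either call, which is why the message only shows up when the save or restore stream goes over a socket, as in local migration.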