xen-devel
Re: [Xen-devel] Live Migration Error
Hi Ian,
I got a fresh code image this morning. Live migration works fine, even
after un-tweaking the timer back to its default value. I have tested,
not necessarily thoroughly, but I haven't run into trouble yet. I guess
this closes this chapter.
For whatever it may be worth, I have some comments regarding the
"previous" (Friday May 13) xfrd version:
- Even though increasing the timeout allowed live migration to complete
successfully, this was not always the case; there was actually a 50%
chance of success.
- On all successful migrations, the number of skipped pages after the
last iteration and before domain suspend was always zero:
Saving memory pages: iter 3 0%
3: sent 0, skipped 0,
3: sent 0, skipped 0, [DEBUG] Conn_sxpr>
(AndresNfsDomain 8)[DEBUG] Conn_sxpr< err=0
[1116255361.997192] SUSPEND flags 00020004 shinfo 00000beb eip c01068fe esi 0002de60
- On all failed migrations, there was a nonzero number of said skipped
pages (sometimes 12, sometimes 4).
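The check described in the two points above could be automated with a small script. What follows is only a sketch, assuming the log keeps the "N: sent X, skipped Y" and "SUSPEND" line formats shown in the excerpt; the function name last_skipped_before_suspend is made up for illustration and is not part of xfrd or xend.

```python
import re

# Matches progress lines such as "3: sent 0, skipped 0,"
# (format assumed from the log excerpt in this message).
ITER_LINE = re.compile(r"^\s*(\d+): sent (\d+), skipped (\d+)")


def last_skipped_before_suspend(log_text):
    """Return the skipped-page count from the last iteration line seen
    before the first SUSPEND line (or before the end of the log if there
    is no SUSPEND line). Returns None if no iteration line was found."""
    last = None
    for line in log_text.splitlines():
        if "SUSPEND" in line:
            break
        m = ITER_LINE.match(line)
        if m:
            last = int(m.group(3))
    return last
```

In the runs I observed, a result of 0 went together with a successful migration and a nonzero result with a failed one; whether that holds in general is not something this sketch can tell.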
Hope this somehow helps.
Keep up the excellent work
Andres
Ian Pratt wrote:
Teemu saves the day!!!
I actually set the timeout to 100 for no particular reason
(originally it was 10; 20 didn't work either). Thanks, Ian, for
your suggestion as well.
I'd be really surprised if increasing the timeout actually made a difference.
Are you sure you're not just using the shadow mode fix that was checked in a
couple of hours ago?
Best,
Ian
Cheers!!
Andres
At 02:45 PM 5/13/2005, Teemu Koponen wrote:
On May 13, 2005, at 20:07, Andres Lagar Cavilla wrote:
Andres,
I am trying to do a live migration within the same physical host, i.e. xm
migrate --live 'whatever' localhost. It fails with 'Error: errors:
suspend, failed, Callback timed out'.
It seems like the transfer of memory pages works until the point when the
domain needs to be suspended to do the final transfer. Funny thing is
it used to work before, gloriously, and I haven't made any
software/hardware changes. At some point an xm save command failed with
a timeout, and from there on live migration fails with this message.
Non-live migration works perfectly, also between different physical
hosts. save/restore also works flawlessly.
I had similar timeout errors previously, when I was using somewhat slower
servers. I overcame the problem by slightly increasing the timeout
value in controller.py. It seemed to provide a remedy.
Teemu
--
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel