Re: [Xen-devel] reliable live migration of large and busy guests


  • To: Olaf Hering <olaf@xxxxxxxxx>, <xen-devel@xxxxxxxxxxxxx>
  • From: Keir Fraser <keir.xen@xxxxxxxxx>
  • Date: Tue, 06 Nov 2012 20:45:57 +0000
  • Delivery-date: Tue, 06 Nov 2012 20:46:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: Ac28X7iuJoEeR0oq00CwgKZ5PLR1MQ==
  • Thread-topic: [Xen-devel] reliable live migration of large and busy guests

On 06/11/2012 20:28, "Olaf Hering" <olaf@xxxxxxxxx> wrote:

> We got a customer report about long-lasting and then failing live
> migration of busy guests.
> 
> The guest has 64G memory, is busy with its set of applications and as a
> result there will always be dirty pages to transfer. While some of this
> can be solved with faster network connection, the underlying issue is
> that tools/libxc/xc_domain_save.c:xc_domain_save will suspend a domain
> after a given number of iterations to transfer the remaining dirty
> pages. From what I understand this pausing of the guest (I don't know how
> long it is actually paused) is causing issues within the guest: the
> applications start to fail (again, no details).
> 
> Their suggestion is to add some knob to the overall live migration
> process to avoid the suspend. If the guest could not be transferred with
> the parameters passed to xc_domain_save(), abort the migration and leave
> it running on the old host.
> 
> 
> My questions are:
> Has such an issue been seen elsewhere?

It's known that if you have a workload that is dirtying lots of pages
quickly, the final stop-and-copy phase will necessarily be large. A VM that
is busy dirtying lots of pages can dirty pages much quicker than they can be
transferred over the LAN.
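
As a rough illustration of why this happens, here is a toy model of the
pre-copy rounds. It is only a sketch and not the logic in xc_domain_save:
the function name, the constant dirty rate, and the page and rate figures
are all assumptions made for the example. Each round sends whatever is
currently dirty, and the guest keeps dirtying pages while that round is on
the wire.

```python
def precopy_rounds(total_pages, dirty_rate, link_rate, max_rounds):
    """Return the number of dirty pages left after each pre-copy round.

    Toy model (hypothetical, not the libxc algorithm):
      total_pages -- guest memory size in pages
      dirty_rate  -- pages the guest dirties per second (assumed constant)
      link_rate   -- pages the network can transfer per second
    """
    remaining = total_pages  # the first round sends all of guest memory
    history = []
    for _ in range(max_rounds):
        seconds = remaining / link_rate  # time needed to send this round
        # Pages dirtied while sending; can never exceed guest memory.
        remaining = min(total_pages, int(dirty_rate * seconds))
        history.append(remaining)
        if remaining == 0:
            break
    return history


# The guest dirties pages at half the link rate: the dirty set halves
# every round.
print(precopy_rounds(1000, 50, 100, 3))
# The guest dirties pages faster than the link can carry them: no progress.
print(precopy_rounds(1000, 200, 100, 3))
```

In this model the dirty set shrinks only when dirty_rate is below
link_rate. Otherwise every round ends with as much left to send as it
started with, and whatever remains has to go in the stop-and-copy phase.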

> Should 'xm migrate --live' and 'xl migrate' get something like a
> --no-suspend option?

Well, it is not really possible to avoid the suspend altogether: there is
always going to be some minimal 'dirty working set'. But we could provide
parameters to require the dirty working set to be smaller than X pages
within Y rounds of dirty page copying, and abort the migration otherwise.
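
A minimal sketch of what such a check might look like, assuming the
per-round dirty page counts are available to the caller. The function and
parameter names are hypothetical; nothing like this exists in
xc_domain_save or the xm/xl tools as described here.

```python
def migration_decision(dirty_per_round, max_dirty_pages, max_rounds):
    """Decide whether to do the final suspend or abort the migration.

    Hypothetical policy:
      dirty_per_round -- dirty page count observed after each copy round
      max_dirty_pages -- X: the dirty set must drop below this to suspend
      max_rounds      -- Y: number of rounds allowed to get there
    Returns ("suspend", round_number) or ("abort", rounds_used).
    """
    allowed = dirty_per_round[:max_rounds]
    for round_no, dirty in enumerate(allowed, start=1):
        if dirty < max_dirty_pages:
            # Small enough: pause the guest and send the remainder.
            return ("suspend", round_no)
    # Never converged within the allowed rounds: leave the guest
    # running on the old host.
    return ("abort", len(allowed))


print(migration_decision([500, 250, 125], 200, 5))    # ('suspend', 3)
print(migration_decision([1000, 1000, 1000], 200, 3))  # ('abort', 3)
```

With a policy like this the guest is only ever paused for a bounded amount
of remaining data, and a guest that never converges is left untouched
rather than suspended for a long final copy.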

 -- Keir


