
Re: [Xen-devel] BUG: failed to save x86 HVM guest with 1TB ram



On 07/09/15 09:09, wangxin (U) wrote:
Hi,

I'm trying to hibernate an x86 HVM guest with 1TB of RAM,
   [1.VM config]
   builder = "hvm"
   name = "suse12_sp3"
   memory = 1048576
   vcpus = 16
   boot = "c"
   disk = [ '/mnt/sda10/vm/SELS_ide_disk.img,raw,xvda,rw' ]
   device_model_version = "qemu-xen"
   vnc = 1
   vnclisten = '9.51.3.174'
   vncdisplay = 0

but I get the following error messages (see below) from xc:
   [2.VM saving] xl save -p suse12_sp3 suse12_sp3.save
   Saving to suse12_sp3.save new xl format (info 0x1/0x0/1309)
   xc: error: Cannot save this big a guest: Internal error
   libxl: error: libxl_dom.c:1875:libxl__xc_domain_save_done: saving domain: \
   domain did not respond to suspend request: Argument list too long
   libxl: error: libxl_dom.c:2032:remus_teardown_done: Remus: failed to \
   teardown device for guest with domid 3, rc -8
   Failed to save domain, resuming domain
   xc: error: Dom 3 not suspended: (shutdown 0, reason 255): Internal error
   libxl: error: libxl.c:508:libxl__domain_resume: xc_domain_resume failed \
   for domain 3: Invalid argument

The error comes from the function xc_domain_save() in xc_domain_save.c:
     /* Get the size of the P2M table */
     dinfo->p2m_size = xc_domain_maximum_gpfn(xch, dom) + 1;

     if ( dinfo->p2m_size > ~XEN_DOMCTL_PFINFO_LTAB_MASK )
     {
         errno = E2BIG;
         ERROR("Cannot save this big a guest");
         goto out;
     }

It may be that 1TB of RAM plus the PCI-hole space pushes the maximum guest PFN beyond the limit.
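
A back-of-the-envelope check (my assumptions: 4KiB pages, and the limit being
~XEN_DOMCTL_PFINFO_LTAB_MASK, i.e. 0x0fffffff frames) suggests 1TB of RAM alone
already fills the whole range, before the PCI hole relocates anything higher:

    /* Illustrative arithmetic only, not code from the tree: why a 1TB HVM
     * guest trips the E2BIG check, assuming 4KiB pages and
     * XEN_DOMCTL_PFINFO_LTAB_MASK == (0xfU << 28). */
    #include <stdio.h>

    int main(void)
    {
        unsigned long long ram_bytes = 1ULL << 40;      /* 1TB of guest RAM  */
        unsigned long long frames    = ram_bytes >> 12; /* 4KiB frames: 2^28 */
        unsigned long long limit     = ~(0xfU << 28);   /* 0x0fffffff frames */

        /* 2^28 frames already exceeds the 2^28 - 1 limit, and the PCI hole
         * pushes the highest populated GPFN higher still. */
        printf("frames = %#llx, limit = %#llx, too big = %d\n",
               frames, limit, frames > limit);
        return 0;
    }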

If I want to save a VM with 1TB of RAM or more, what should I do? Has anyone
tried this before, and is there a configuration I can refer to?

This is clearly not from Xen 4.6, but the same issue will be present.

The check serves a dual purpose. In the legacy case, it is there to stop the PFN type information clobbering the upper bits of the PFN itself on 32-bit toolstacks; any PFN above 2^28 would collide with the type bits. This has been mitigated somewhat in migration v2: PFNs are strictly 64-bit values, still using the upper 4 bits for type information, which leaves 60 bits for the PFN itself.
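
For reference, a minimal sketch of that 4-bit type / 60-bit PFN split (the macro
and helper names below are illustrative, not the actual stream-format definitions):

    #include <stdint.h>

    /* Illustrative encoding: top 4 bits = page type, low 60 bits = PFN. */
    #define PFN_TYPE_SHIFT  60
    #define PFN_VALUE_MASK  ((UINT64_C(1) << PFN_TYPE_SHIFT) - 1)

    static inline uint64_t encode_pfn(uint64_t pfn, unsigned int type)
    {
        return (pfn & PFN_VALUE_MASK) | ((uint64_t)type << PFN_TYPE_SHIFT);
    }

    static inline uint64_t pfn_of(uint64_t field)
    {
        return field & PFN_VALUE_MASK;
    }

    static inline unsigned int type_of(uint64_t field)
    {
        return (unsigned int)(field >> PFN_TYPE_SHIFT);
    }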

The second purpose is simply as a limit on toolstack resources. Migration requires allocating structures which scale linearly with the size of the VM, the biggest of which would be ~1GB for the p2m. Add to this >1GB for the m2p, and suddenly a 32-bit toolstack process is looking scarce on RAM.
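
Rough numbers behind that (my assumptions: a 32-bit toolstack with 4-byte
xen_pfn_t entries, 4KiB pages, and ~1TB of guest and host RAM):

    #include <stdio.h>

    int main(void)
    {
        unsigned long long frames = (1ULL << 40) >> 12; /* 2^28 frames in 1TB */
        unsigned long long p2m = frames * 4;            /* guest p2m entries  */
        unsigned long long m2p = frames * 4;            /* host m2p entries   */

        /* ~1024 MiB each - most of a 32-bit process's address space. */
        printf("p2m ~= %llu MiB, m2p ~= %llu MiB\n", p2m >> 20, m2p >> 20);
        return 0;
    }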

During the development of migration v2, I didn't spend any time considering whether, or by how much, it was sensible to lift the restriction, so the check was imported wholesale from the legacy code.

For now, I am going to say that it simply doesn't work. Simply upping the limit is only a stopgap measure; an HVM guest can still mess this up by playing physmap games and mapping a page of RAM at a really high (guest) physical address. Long term, we need hypervisor support for getting a compressed view of the guest physical address space, so that toolstack-side resources are proportional to the amount of RAM given to the guest, not to how big a guest decides to make its physmap.
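
To illustrate the physmap point, a guest-side sketch (assuming a 64-bit Linux
guest kernel and its usual Xen interface headers; illustrative only, and subject
to whatever limits the hypervisor enforces): a single XENMEM_add_to_physmap call
relocating one frame to a very high GFN is enough to inflate the value that
xc_domain_maximum_gpfn() reports, without the guest using any extra RAM.

    /* Guest-kernel sketch: move one existing frame to a very high GFN via
     * XENMEM_add_to_physmap / XENMAPSPACE_gmfn. */
    #include <xen/interface/xen.h>
    #include <xen/interface/memory.h>
    #include <asm/xen/hypercall.h>

    static int move_frame_high(unsigned long gfn)
    {
        struct xen_add_to_physmap xatp = {
            .domid = DOMID_SELF,
            .space = XENMAPSPACE_gmfn,
            .idx   = gfn,            /* existing frame to relocate       */
            .gpfn  = 1UL << 36,      /* new, deliberately huge, location */
        };

        return HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
    }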

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

