
Re: domU reboot claim failed


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Jason Andryuk <jason.andryuk@xxxxxxx>
  • Date: Thu, 11 Sep 2025 17:20:03 -0400
  • Delivery-date: Thu, 11 Sep 2025 21:20:32 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thanks, everyone.

On 2025-09-10 17:57, Andrew Cooper wrote:
On 10/09/2025 7:58 pm, Jason Andryuk wrote:
Hi,

We're running Android as a guest and it's running the Compatibility
Test Suite.  During the CTS, the Android domU is rebooted multiple times.

In the middle of the CTS, we've seen a reboot fail.  xl -vvv shows:
domainbuilder: detail: Could not allocate memory for HVM guest as we
cannot claim memory!
xc: error: panic: xg_dom_boot.c:119: xc_dom_boot_mem_init: can't
allocate low memory for domain: Out of memory
libxl: error: libxl_dom.c:581:libxl__build_dom: xc_dom_boot_mem_init
failed: Cannot allocate memory
domainbuilder: detail: xc_dom_release: called

So the claim failed.  The system has enough memory, since we're just
rebooting the same VM.  As a workaround, I added a sleep(1) + retry,
which works.
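
For reference, the workaround is essentially the following.  This is only
a minimal sketch for illustration, not the real patch: claim_with_retry()
and the retry count are made up, and it assumes the usual
xc_domain_claim_pages() prototype from xenctrl.h.

#include <unistd.h>
#include <xenctrl.h>

/* Illustrative only: retry the claim with a short delay, on the theory
 * that the previous instance's memory is still queued for scrubbing. */
static int claim_with_retry(xc_interface *xch, uint32_t domid,
                            unsigned long nr_pages)
{
    int rc = -1;

    for ( int attempt = 0; attempt < 5; attempt++ )
    {
        rc = xc_domain_claim_pages(xch, domid, nr_pages);
        if ( rc == 0 )
            return 0;

        sleep(1);   /* let the background scrubbing catch up */
    }

    return rc;
}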

The curious part is the memory allocation.  For d2 to d5, we have:
domainbuilder: detail: range: start=0x0 end=0xf0000000
domainbuilder: detail: range: start=0x100000000 end=0x1af000000
xc: detail: PHYSICAL MEMORY ALLOCATION:
xc: detail:   4KB PAGES: 0x0000000000000000
xc: detail:   2MB PAGES: 0x00000000000006f8
xc: detail:   1GB PAGES: 0x0000000000000003

But when we have to retry the claim for d6, there are no 1GB pages used:
domainbuilder: detail: range: start=0x0 end=0xf0000000
domainbuilder: detail: range: start=0x100000000 end=0x1af000000
domainbuilder: detail: HVM claim failed! attempt 0
xc: detail: PHYSICAL MEMORY ALLOCATION:
xc: detail:   4KB PAGES: 0x0000000000002800
xc: detail:   2MB PAGES: 0x0000000000000ce4
xc: detail:   1GB PAGES: 0x0000000000000000

But subsequent reboots for d7 and d8 go back to using 1GB pages.
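
For what it's worth, both breakdowns add up to the same total (unless my
arithmetic is off), so it's only the page sizes that change, not the
amount of memory:

d2-d5: 0x6f8 * 0x200 (2MB) + 0x3 * 0x40000 (1GB) = 0xdf000 + 0xc0000  = 0x19f000 pages
d6:    0x2800 (4KB)        + 0xce4 * 0x200 (2MB) = 0x2800  + 0x19c800 = 0x19f000 pages

which matches the two ranges above: 0xf0000 + 0xaf000 = 0x19f000 4KB pages.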

Does the change in memory allocation stick out to anyone?

Unfortunately, I don't have insight into what the failing test is doing.

Xen doesn't seem set up to track the claim across reboot.  Retrying
the claim works in our scenario since we have a controlled configuration.

This looks to me like a known phenomenon.  Ages back, a change was made
in how Xen scrubs memory, from being synchronous in domain_kill(), to
being asynchronous in the idle loop.

The consequence being that, on an idle system, you can shutdown and
reboot the domain faster, but on a busy system you end up trying to
allocate the new domain while memory from the old domain is still dirty.

It is a classic example of a false optimisation, which looks great on an
idle system only because the idle CPUs are swallowing the work.

This impacts the ability to find a 1G aligned block of free memory to
allocate a superpage with, and by the sounds of it, claims (which
predate this behaviour change) aren't aware of the "to be scrubbed"
queue and fail instead.

Claims check total_avail_pages and outstanding_claims. It looks like free_heap_pages() sets PGC_need_scrub and then increments total_avail_pages. But then why isn't that accounting far enough along for the claim to succeed?
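
To make that concrete, the model I have in my head is roughly the
following - heavily simplified for discussion, not the actual
page_alloc.c code:

#include <stdbool.h>

/* Simplified model of the claim accounting, not Xen's real code. */
static long total_avail_pages;   /* incremented in free_heap_pages(), even
                                  * for pages still marked PGC_need_scrub */
static long outstanding_claims;  /* sum of every domain's staked claim */

static bool can_stake_claim(unsigned long request)
{
    /* Memory already claimed by other domains is off the table. */
    long avail = total_avail_pages - outstanding_claims;

    return avail >= 0 && request <= (unsigned long)avail;
}

If that picture is right, dirty-but-free pages should already be counted
as available by the time the claim is staked.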

Also, free_heap_pages() looks like it's trying to merge chunks - I thought that would handle larger allocations. Are they not truly usable until they've been scrubbed, which would explain the lack of 1GB pages?

Clearly I need to learn more here.

I thought OpenXT had a revert of this.  IIRC it was considered a
material regression in being able to know when a domain has gone away.

OpenXT wants to scrub the memory ASAP so there is no remnant data. There is a patch for that.

Thanks,
Jason



 

