
Re: domU reboot claim failed


  • To: Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Jason Andryuk <jason.andryuk@xxxxxxx>
  • From: Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>
  • Date: Thu, 11 Sep 2025 17:46:08 +0200
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 11 Sep 2025 15:46:23 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu Sep 11, 2025 at 9:55 AM CEST, Jan Beulich wrote:
> On 10.09.2025 23:57, Andrew Cooper wrote:
>> On 10/09/2025 7:58 pm, Jason Andryuk wrote:
>>> Hi,
>>>
>>> We're running Android as a guest and it's running the Compatibility
>>> Test Suite.  During the CTS, the Android domU is rebooted multiple times.
>>>
>>> In the middle of the CTS, we've seen reboot fail.  xl -vvv shows:
>>> domainbuilder: detail: Could not allocate memory for HVM guest as we
>>> cannot claim memory!
>>> xc: error: panic: xg_dom_boot.c:119: xc_dom_boot_mem_init: can't
>>> allocate low memory for domain: Out of memory
>>> libxl: error: libxl_dom.c:581:libxl__build_dom: xc_dom_boot_mem_init
>>> failed: Cannot allocate memory
>>> domainbuilder: detail: xc_dom_release: called
>>>
>>> So the claim failed.  The system has enough memory since we're just
>>> rebooting the same VM.  As a workaround, I added sleep(1) + retry,
>>> which works.
>>>
>>> The curious part is the memory allocation.  For d2 to d5, we have:
>>> domainbuilder: detail: range: start=0x0 end=0xf0000000
>>> domainbuilder: detail: range: start=0x100000000 end=0x1af000000
>>> xc: detail: PHYSICAL MEMORY ALLOCATION:
>>> xc: detail:   4KB PAGES: 0x0000000000000000
>>> xc: detail:   2MB PAGES: 0x00000000000006f8
>>> xc: detail:   1GB PAGES: 0x0000000000000003
>>>
>>> But when we have to retry the claim for d6, there are no 1GB pages used:
>>> domainbuilder: detail: range: start=0x0 end=0xf0000000
>>> domainbuilder: detail: range: start=0x100000000 end=0x1af000000
>>> domainbuilder: detail: HVM claim failed! attempt 0
>>> xc: detail: PHYSICAL MEMORY ALLOCATION:
>>> xc: detail:   4KB PAGES: 0x0000000000002800
>>> xc: detail:   2MB PAGES: 0x0000000000000ce4
>>> xc: detail:   1GB PAGES: 0x0000000000000000
>>>
>>> But subsequent reboots for d7 and d8 go back to using 1GB pages.
>>>
>>> Does the change in memory allocation stick out to anyone?
>>>
>>> Unfortunately, I don't have insight into what the failing test is doing.
>>>
>>> Xen doesn't seem set up to track the claim across reboot.  Retrying
>>> the claim works in our scenario since we have a controlled configuration.
>> 
>> This looks to me like a known phenomenon.  Ages back, a change was made
>> in how Xen scrubs memory, from being synchronous in domain_kill(), to
>> being asynchronous in the idle loop.
>> 
>> The consequence being that, on an idle system, you can shut down and
>> reboot the domain faster, but on a busy system you end up trying to
>> allocate the new domain while memory from the old domain is still dirty.
>> 
>> It is a classic example of a false optimisation, which looks great on an
>> idle system only because the idle CPUs are swallowing the work.
>
> I wouldn't call this a "false optimization", but rather one ...
>
>> This impacts the ability to find a 1G aligned block of free memory to
>> allocate a superpage with, and by the sounds of it, claims (which
>> predate this behaviour change) aren't aware of the "to be scrubbed"
>> queue and fail instead.
>
> ... which isn't sufficiently integrated with the rest of the allocator.
>
> Jan

That'd depend on the threat model. At the very least there ought to be a
Kconfig knob to control it. You can't really tell a customer "your data is
gone from our systems" unless it really is gone. I'm guessing part of the
rationale was speeding up the obnoxiously slow destroydomain, since it hogs
a dom0 vCPU until it's done, and that can take many minutes for large domains.
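To make that concrete, here's a rough sketch of the shape such a knob could
take. CONFIG_SYNC_SCRUB, scrub_pages_sync() and free_dirty_pages() are
made-up names for illustration, not existing Xen symbols:

#include <stdbool.h>

/* Hypothetical sketch -- none of these are real Xen symbols. */
struct page_info;
void scrub_pages_sync(struct page_info *pg, unsigned int order);
void free_dirty_pages(struct page_info *pg, unsigned int order,
                      bool need_scrub);

static void relinquish_pages(struct page_info *pg, unsigned int order)
{
#ifdef CONFIG_SYNC_SCRUB
    /*
     * Scrub before the pages reach the free lists, so that by the time
     * domain destruction completes the data really is gone.  Slower
     * teardown, stronger guarantee.
     */
    scrub_pages_sync(pg, order);
    free_dirty_pages(pg, order, false);
#else
    /*
     * Today's behaviour: queue the pages dirty and let the idle loop
     * scrub them in the background.  Faster teardown, but a new
     * claim/allocation can race with the scrubber, as in this thread.
     */
    free_dirty_pages(pg, order, true);
#endif
}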

IOW: It's a nice optimisation, but there are multiple use cases for
specifically not wanting something like that.
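
On the workaround side, the sleep(1) + retry Jason mentions amounts to
something like the below. A sketch around libxc's xc_domain_claim_pages();
the loop and claim_with_retry() are mine for illustration, not the actual
patch:

#include <unistd.h>     /* sleep() */
#include <xenctrl.h>    /* xc_interface, xc_domain_claim_pages() */

/*
 * If the claim fails because freed-but-dirty memory is still queued for
 * scrubbing, back off briefly and retry, giving the idle loop a chance
 * to catch up.
 */
static int claim_with_retry(xc_interface *xch, uint32_t domid,
                            unsigned long nr_pages, unsigned int attempts)
{
    int rc = -1;

    while ( attempts-- )
    {
        rc = xc_domain_claim_pages(xch, domid, nr_pages);
        if ( rc == 0 )
            return 0;   /* claim succeeded */

        sleep(1);       /* wait for background scrubbing to progress */
    }

    return rc;          /* still failing; report the error */
}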

Cheers,
Alejandro