
Re: [PATCH] x86/pod: Do not fragment PoD memory allocations


  • To: Elliott Mitchell <ehem+xen@xxxxxxx>
  • From: George Dunlap <George.Dunlap@xxxxxxxxxx>
  • Date: Thu, 28 Jan 2021 22:56:28 +0000
  • Accept-language: en-US
  • Cc: Jan Beulich <jbeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, "open list:X86" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Delivery-date: Thu, 28 Jan 2021 22:56:36 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH] x86/pod: Do not fragment PoD memory allocations


> On Jan 28, 2021, at 10:42 PM, George Dunlap <george.dunlap@xxxxxxxxxx> wrote:
> 
> 
> 
>> On Jan 28, 2021, at 6:26 PM, Elliott Mitchell <ehem+xen@xxxxxxx> wrote:
>> 
>> On Thu, Jan 28, 2021 at 11:19:41AM +0100, Jan Beulich wrote:
>>> On 27.01.2021 23:28, Elliott Mitchell wrote:
>>>> On Wed, Jan 27, 2021 at 09:03:32PM +0000, Andrew Cooper wrote:
>>>>> So.  What *should* happen is that if QEMU/OVMF dirties more memory than
>>>>> exists in the PoD cache, the domain gets terminated.
>>>>> 
>>>>> Irrespective, Xen/dom0 dying isn't an expected consequence of any normal
>>>>> action like this.
>>>>> 
>>>>> Do you have a serial log of the crash?  If not, can you set up a crash
>>>>> kernel environment to capture the logs, or alternatively reproduce the
>>>>> issue on a different box which does have serial?
>>>> 
>>>> No, I don't.  I'm set up to debug ARM failures, not x86 ones.
>>> 
>>> Then alternatively can you at least give conditions that need to
>>> be met to observe the problem, for someone to repro and then
>>> debug? (The less complex the better, of course.)
>> 
>> Multiple prior messages have included what I believed to be the minimal
>> case to reproduce.  Presently I believe the minimal constraints are:
>> maxmem >= host memory, memory < free Xen memory, and type HVM.  A minimal
>> kr45hme.cfg file:
>> 
>> type = "hvm"
>> memory = 1024
>> maxmem = 1073741824
>> 
>> I suspect maxmem > free Xen memory may be sufficient.  The instances I
>> can be certain of have had maxmem = 7 * total host memory.
> 
> Can you include your Xen version and dom0 command-line?
> 
> For me, domain creation fails with an error like this:
> 
> root@immortal:/images# xl create c6-01.cfg maxmem=1073741824
> Parsing config from c6-01.cfg
> xc: error: panic: xc_dom_boot.c:120: xc_dom_boot_mem_init: can't allocate low 
> memory for domain: Out of memory
> libxl: error: libxl_dom.c:593:libxl__build_dom: xc_dom_boot_mem_init failed: 
> Cannot allocate memory
> libxl: error: libxl_create.c:1576:domcreate_rebuild_done: Domain 9:cannot 
> (re-)build domain: -3
> libxl: error: libxl_domain.c:1182:libxl__destroy_domid: Domain 9:Non-existant 
> domain
> libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain 9:Unable to 
> destroy guest
> libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain 9:Destruction of 
> domain failed
> 
> This is on staging-4.14 from a month or two ago (i.e., what I happened to 
> have on a random test box), and `dom0_mem=1024M,max:1024M` in my 
> command-line.  That rune will give dom0 only 1GiB of RAM, but also prevent it 
> from auto-ballooning down to free up memory for the guest.
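
(For anyone trying to reproduce this: a sketch of where that dom0_mem rune 
goes, assuming a Debian-style GRUB2 setup; the exact file and variable name 
can differ by distro:)

# /etc/default/grub (hypothetical excerpt); grub-mkconfig's 20_linux_xen
# helper appends this to Xen's boot command line
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=1024M,max:1024M"
# then regenerate grub.cfg (update-grub on Debian/Ubuntu) and reboot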

Hmm, but with that line removed, I get this:

root@immortal:/images# xl create c6-01.cfg maxmem=1073741824
Parsing config from c6-01.cfg
libxl: error: libxl_mem.c:279:libxl_set_memory_target: New target 0 for dom0 is 
below the minimum threshold
failed to free memory for the domain

Maybe the issue you’re facing is that the “minimum threshold” safety catch 
either isn’t triggering, or is set low enough that your dom0 is OOMing while 
trying to free enough memory for your VM?
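
(If it is dom0 ballooning, one way to take that out of the picture while 
debugging is to stop xl from shrinking dom0 at all; a sketch, assuming the 
default /etc/xen/xl.conf location and a dom0_mem= boot parameter as above:)

# /etc/xen/xl.conf (hypothetical excerpt): never balloon dom0 down to
# satisfy a guest build; if Xen's free memory is short, creation then
# fails up front instead of squeezing dom0
autoballoon="off"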

That 1TiB of empty address space isn’t actually free, even for Xen: you still 
have to allocate p2m memory for the domain to hold all of those PoD entries.
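
(Back-of-the-envelope, as a rough upper bound assuming the p2m is filled with 
4KiB PoD entries; 2MiB PoD superpage entries would shrink this considerably:)

# A 4KiB p2m page holds 512 8-byte entries, so one leaf page covers 2MiB of
# guest address space.  Leaf pages alone for a 1TiB physmap come to roughly:
echo $(( (1 << 40) / (2 * 1024 * 1024) * 4096 / (1 << 30) ))GiB    # -> 2GiB
# plus a comparatively tiny amount for the intermediate p2m levels.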

 -George

 

