
Re: Increasing domain memory beyond initial maxmem



On Thu, Mar 31, 2022 at 08:41:19AM +0200, Juergen Gross wrote:
> On 31.03.22 05:51, Marek Marczykowski-Górecki wrote:
> > Hi,
> > 
> > I'm trying to make use of CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y to increase
> > domain memory beyond the initial maxmem, but I hit a few issues.
> > 
> > A little context: domains in Qubes OS start with rather little memory
> > (400MB by default) but with maxmem set higher (4GB by default). Then
> > there is the qmemman daemon, which adjusts balloon targets for
> > domains based on (among other things) demand reported by the domains
> > themselves. There is also a little swap, to mitigate qmemman latency
> > (a few hundred ms at worst). For PVH / HVM, initial memory < maxmem
> > makes use of PoD, which I'm trying to get rid of. But also, IIUC,
> > Linux wastes some memory on bookkeeping sized by maxmem, not by the
> > actually usable memory.
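> > 
> > (For context, qmemman adjusts those targets by writing the balloon
> > target node in xenstore directly, roughly:
> > 
> >     xenstore-write /local/domain/<domid>/memory/target <size-in-kB>
> > 
> > which the guest's balloon driver then follows; <domid> and the size
> > here are placeholders.)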
> > 
> > First issue: after using `xl mem-max`, `xl mem-set` still refuses to
> > increase memory beyond the initial maxmem. That's because xl mem-max
> > does not update the 'memory/static-max' xenstore node. This one is
> > easy to work around.
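> > 
> > (Concretely, the workaround is something like:
> > 
> >     xl mem-max <domain> 2048
> >     xenstore-write /local/domain/$(xl domid <domain>)/memory/static-max $((2048*1024))
> > 
> > before calling `xl mem-set`; <domain> and the 2048MB value are just
> > placeholders.)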
> > 
> > Then, the actual hotplug fails on the domU side with:
> > 
> > [   50.004734] xen-balloon: vmemmap alloc failure: order:9, 
> > mode:0x4cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL), 
> > nodemask=(null),cpuset=/,mems_allowed=0
> > [   50.004774] CPU: 1 PID: 34 Comm: xen-balloon Not tainted 
> > 5.16.15-1.37.fc32.qubes.x86_64 #1
> > [   50.004792] Call Trace:
> > [   50.004799]  <TASK>
> > [   50.004808]  dump_stack_lvl+0x48/0x5e
> > [   50.004821]  warn_alloc+0x162/0x190
> > [   50.004832]  ? __alloc_pages+0x1fa/0x230
> > [   50.004842]  vmemmap_alloc_block+0x11c/0x1c5
> > [   50.004856]  vmemmap_populate_hugepages+0x185/0x519
> > [   50.004868]  vmemmap_populate+0x9e/0x16c
> > [   50.004878]  __populate_section_memmap+0x6a/0xb1
> > [   50.004890]  section_activate+0x20a/0x278
> > [   50.004901]  sparse_add_section+0x70/0x160
> > [   50.004911]  __add_pages+0xc3/0x150
> > [   50.004921]  add_pages+0x12/0x60
> > [   50.004931]  add_memory_resource+0x12b/0x320
> > [   50.004943]  reserve_additional_memory+0x10c/0x150
> > [   50.004958]  balloon_thread+0x206/0x360
> > [   50.004968]  ? do_wait_intr_irq+0xa0/0xa0
> > [   50.004978]  ? decrease_reservation.constprop.0+0x2e0/0x2e0
> > [   50.004991]  kthread+0x16b/0x190
> > [   50.005001]  ? set_kthread_struct+0x40/0x40
> > [   50.005011]  ret_from_fork+0x22/0x30
> > [   50.005022]  </TASK>
> > 
> > Full dmesg: 
> > https://gist.github.com/marmarek/72dd1f9dbdd63cfe479c94a3f4392b45
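> > 
> > (The order:9 here means the balloon thread failed to find a 2MB
> > physically contiguous block for the new section's vmemmap; free
> > blocks per order can be checked with:
> > 
> >     cat /proc/buddyinfo
> > 
> > so presumably a 400MB guest is already too fragmented for that.)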
> > 
> > After the above, `free` reports the correct size (1GB in this case),
> > but that memory seems to be effectively unusable: "used" stays low,
> > and soon the OOM killer kicks in.
> > 
> > I know the initial 400MB is not much for a full Linux, with X11 etc. But
> > I wouldn't expect it to fail this way when _adding_ memory.
> > 
> > I've also tried with an initial 800MB. In that case I no longer get
> > the "alloc failure", but watching `free`, the extra memory still
> > doesn't seem to be used.
> > 
> > Any ideas?
> > 
> 
> I can't reproduce that.
> 
> I started a guest with 8GB of memory; in the guest I'm seeing:
> 
> # uname -a
> Linux linux-d1cy 5.17.0-rc5-default+ #406 SMP PREEMPT Mon Feb 21 09:31:12
> CET 2022 x86_64 x86_64 x86_64 GNU/Linux
> # free
>         total     used      free   shared  buff/cache   available
> Mem:  8178260    71628   8023300     8560       83332     8010196
> Swap: 2097132        0   2097132
> 
> Then I'm raising the memory for the guest in dom0:
> 
> # xl list
> Name                ID   Mem VCPUs      State   Time(s)
> Domain-0             0  2634     8     r-----    1016.5
> Xenstore             1    31     1     -b----       0.9
> sle15sp1             3  8190     6     -b----     184.6
> # xl mem-max 3 10000
> # xenstore-write /local/domain/3/memory/static-max 10240000
> # xl mem-set 3 10000
> # xl list
> Name                ID   Mem VCPUs      State   Time(s)
> Domain-0             0  2634     8     r-----    1018.5
> Xenstore             1    31     1     -b----       1.0
> sle15sp1             3 10000     6     -b----     186.7
> 
> In the guest I get now:
> 
> # free
>         total     used     free   shared  buff/cache   available
> Mem: 10031700   110904  9734172     8560      186624     9814344
> Swap: 2097132        0  2097132
> 
> And after using lots of memory via a ramdisk:
> 
> # free
>         total     used     free   shared  buff/cache   available
> Mem: 10031700   116660  1663840  7181776     8251200     2635372
> Swap: 2097132        0  2097132
> 
> You can see buff/cache is now larger than the initial total memory
> and free is lower than the added memory amount.
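
(I assume the ramdisk part was something along the lines of:

    mount -t tmpfs -o size=7g tmpfs /mnt
    dd if=/dev/zero of=/mnt/fill bs=1M count=7000

judging by the ~7GB in the "shared" column; the paths are just for
illustration.)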

Hmm, I see different behavior:

I'm starting with 800MB:

# uname -a
Linux personal 5.16.15-1.37.fc32.qubes.x86_64 #1 SMP PREEMPT Tue Mar 22 
12:59:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
# free -m
              total        used        free      shared  buff/cache   available
Mem:            740         209         278           2         252         415
Swap:          1023           0        1023

Then raising to ~2GB:

[root@dom0 ~]# xl list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  4082     6     r-----  184271.3
(...)
personal                                    21   800     2     -b----       4.8
[root@dom0 ~]# xl mem-max personal 2048
[root@dom0 ~]# xenstore-write /local/domain/$(xl domid 
personal)/memory/static-max $((2048*1024))
[root@dom0 ~]# xl mem-set personal 2000
[root@dom0 ~]# xenstore-ls -fp /local/domain/$(xl domid personal)/memory
/local/domain/21/memory/static-max = "2097152"   (n0,r21)
/local/domain/21/memory/target = "2048001"   (n0,r21)
/local/domain/21/memory/videoram = "-1"   (n0,r21)

And then observe inside domU:
[root@personal ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           1940         235        1452           2         252        1585
Swap:          1023           0        1023

So far so good. But when I try to actually use it (filling a ramdisk),
it doesn't work:

[root@personal ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           1940         196        1240         454         503        1206
Swap:          1023         472         551

As you can see, all the new memory is still in "free", and swap is used
instead.
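
(The online state of the hotplugged memory blocks can be inspected
with:

    grep . /sys/devices/system/memory/memory*/state

I can collect that output too, if it would help.)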


There is also /proc/meminfo (state before filling the ramdisk), in case
it gives some hints:
[root@personal ~]# cat /proc/meminfo
MemTotal:        1986800 kB
MemFree:         1487116 kB
MemAvailable:    1624060 kB
Buffers:           26236 kB
Cached:           207268 kB
SwapCached:            0 kB
Active:            74828 kB
Inactive:         258724 kB
Active(anon):       1008 kB
Inactive(anon):   101668 kB
Active(file):      73820 kB
Inactive(file):   157056 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       1048572 kB
SwapFree:        1048572 kB
Dirty:               216 kB
Writeback:             0 kB
AnonPages:        100184 kB
Mapped:           117472 kB
Shmem:              2628 kB
KReclaimable:      24960 kB
Slab:              52136 kB
SReclaimable:      24960 kB
SUnreclaim:        27176 kB
KernelStack:        3120 kB
PageTables:         4364 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     2041972 kB
Committed_AS:     825816 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       10064 kB
VmallocChunk:          0 kB
Percpu:             1240 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:       79872 kB
DirectMap2M:     1132544 kB
DirectMap1G:     1048576 kB


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
