
Re: [PATCH] x86/pod: Do not fragment PoD memory allocations



On 25.01.2021 18:46, Elliott Mitchell wrote:
> On Mon, Jan 25, 2021 at 10:56:25AM +0100, Jan Beulich wrote:
>> On 24.01.2021 05:47, Elliott Mitchell wrote:
>>>
>>> ---
>>> Changes in v2:
>>> - Include the obvious removal of the goto target.  Always realize you're
>>>   at the wrong place when you press "send".
>>
>> Please could you also label the submission then accordingly? I
>> got puzzled by two identically titled messages side by side,
>> until I noticed the difference.
> 
> Sorry about that.  Would you have preferred a third message mentioning
> this mistake?

No. But I'd have expected v2 to have its subject start with
"[PATCH v2] ...", making it clear that one can skip looking at
the one labeled just "[PATCH] ...".
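
(Fwiw, `git format-patch -v2` will produce exactly that subject
prefix.)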

>>> I'm not including a separate cover message since this is a single hunk.
>>> This really needs some checking in `xl`.  If one has a domain which
>>> sometimes gets started on different hosts and is sometimes modified with
>>> slightly differing settings, one can run into trouble.
>>>
>>> In this case the particular domain is most often used PV/PVH, but
>>> every so often it is used as a template for HVM.  Starting it
>>> HVM will trigger PoD mode.  If it is started on a machine with less
>>> memory than others, PoD may well exhaust all memory and then trigger a
>>> panic.
>>>
>>> `xl` should likely fail HVM domain creation when the maximum memory
>>> exceeds available memory (never mind total memory).
>>
>> I don't think so, no - it's the purpose of PoD to allow starting
>> a guest despite there not being enough memory available to
>> satisfy its "max", as such guests are expected to balloon down
>> immediately, rather than triggering an oom condition.
> 
> Even Qemu/OVMF is expected to handle ballooning for an *HVM* domain?

No idea how qemu comes into play here. Any preboot environment
aware of possibly running under Xen is of course expected to
tolerate running with maxmem > memory (or clearly document its
inability, in which case it may not be suitable for certain
use cases). For example, I don't see why a preboot environment
would need to touch all of the memory in a VM, except maybe
for the purpose of zeroing it (which PoD can deal with fine).
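
To illustrate why mere zeroing is harmless (just a rough, standalone
sketch, not the actual p2m-pod.c code - the function name and the
local PAGE_SIZE stand-in are made up here): reclaiming such a page
boils down to a cheap "is it still all zero" scan, e.g.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define PAGE_SIZE 4096 /* stand-in for Xen's PAGE_SIZE */

    /* A page whose contents are still entirely zero can be handed
     * back to the pool instead of staying allocated to the guest. */
    bool page_is_all_zero(const void *page)
    {
        const uint64_t *p = page;
        size_t i;

        for ( i = 0; i < PAGE_SIZE / sizeof(*p); i++ )
            if ( p[i] )
                return false;

        return true;
    }

A preboot environment which merely zeroes memory therefore doesn't
defeat PoD, unlike one writing non-zero patterns everywhere.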

>>> For example try a domain with the following settings:
>>>
>>> memory = 8192
>>> maxmem = 2147483648
>>>
>>> If type is PV or PVH, it will likely boot successfully.  Change type to
>>> HVM and unless your hardware budget is impressive, Xen will soon panic.
>>
>> Xen will panic? That would need fixing if so. Also I'd consider
>> an excessively high maxmem (compared to memory) a configuration
>> error. From experiments long, long ago I seem to
>> recall that a factor beyond 32 is almost never going to lead to
>> anything good, irrespective of guest type. (But as said, badness
>> here should be restricted to the guest; Xen itself should limp
>> on fine.)
> 
> I'll confess I haven't confirmed the panic is in Xen itself.  The
> problem is that when this gets triggered, by the time the situation
> is clear and I can get to the console the computer is already
> restarting, so no error message has been observed.

If the panic isn't in Xen itself, why would the computer be
restarting?

> This is most certainly a configuration error.  The problem is that
> there is only a very small delta between a perfectly valid
> configuration and one which reliably triggers a panic.
> 
> The memory:maxmem ratio isn't the problem.  My example had a maxmem of
> 2147483648 since that is enough to exceed the memory of sub-$100K
> computers.  The crucial features are maxmem >= machine memory,
> memory < free memory (thus potentially bootable PV/PVH) and type = "hvm".
> 
> When was the last time you tried running a Xen machine with near zero
> free memory?  Perhaps in the past Xen kept the promise of never
> panicking on memory exhaustion, but it feels like this hasn't held
> for some time.

Such bugs need fixing. Which first of all requires properly
pointing them out. A PoD guest misbehaving when there's not
enough memory to fill its pages (i.e. before it manages to
balloon down) is expected behavior. If you can't guarantee the
guest ballooning down quickly enough, don't configure it to
use PoD. A PoD guest causing a Xen crash, otoh, is very likely
even a security issue. Which needs to be treated as such: It
needs fixing, not avoiding by "curing" one of perhaps many
possible sources.
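
To spell out the "don't use PoD" alternative with your earlier
example: PoD only gets involved for an HVM guest created with
memory < maxmem, so a configuration which isn't meant to balloon
down would simply keep the two equal, e.g.

memory = 8192
maxmem = 8192

at the price of the guest then not being able to grow beyond that.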

As an aside - if the PoD code had proper 1GB page support,
would you then propose to only allocate in 1GB chunks? And if
there was a 512GB page feature in hardware, in 512GB chunks
(leaving aside the fact that scanning 512GB of memory for being
all zero would simply take too long with today's computers)?
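
(Back of the envelope, assuming something like 20 GB/s of sustained
sequential read bandwidth: confirming 512GB to be all zero would take
on the order of 512 / 20 ≈ 25s, and that for every single such
allocation.)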

Jan