On Mon, 2010-09-13 at 23:51 +0100, Jeremy Fitzhardinge wrote:
> On 09/13/2010 02:17 PM, Dan Magenheimer wrote:
> >> As a side-effect, it also works for dom0. If you set dom0_mem on the
> >> Xen command line, then nr_pages is limited to that value, but the
> >> kernel
> >> can still see the system's real E820 map, and therefore adds all the
> >> system's memory to its own balloon driver, potentially allowing dom0 to
> >> expand up to take all physical memory.
> >> However, this may caused bad side-effects if your system memory is much
> >> larger than your dom0_mem, especially if you use a 32-bit dom0. I may
> >> need to add a kernel command line option to limit the max initial
> >> balloon size to mitigate this...
> > I would call this dom0 functionality a bug. I think both Citrix
> > and Oracle use dom0_mem as a normal boot option for every
> > installation and, while I think both employ heuristics to choose
> > a larger dom0_mem for larger physical memory, I don't think it
> > grows large enough for, say, >256GB physical memory, to accommodate
> > the necessarily large number of page tables.
> > So, I'd vote for NOT allowing dom0 to balloon up to physical
> > memory if dom0_mem is specified, and possibly a kernel command
> > line option that allows it to grow beyond. Or, possibly, no
> > option and never allow dom0 memory to grow beyond dom0_mem
> > unless (possibly) it grows with hot-plug.
> Yes, its a bit of a problem. The trouble is that the kernel can't
> really distinguish the two cases; either way, it sees a Xen-supplied
> xen_start_info->nr_pages as the amount of initial memory available, and
> an E820 table referring to more RAM beyond that.
> I guess there are three options:
> 1. add a "xen_maxmem" (or something) kernel parameter to override
> space specified in the E820 table
> 2. ignore E820 if its a privileged domain
As it stands I don't think it is currently possible to boot any domain 0
kernel pre-ballooned other than by using the native mem= option.
I think the Right Thing to do would be for privileged domains to combine
the results of XENMEM_machine_memory_map (real e820) and
XENMEM_memory_map (pseudo-physical "e820") by clamping the result of
XENMEM_machine_memory_map at the maximum given in XENMEM_memory_map (or
taking some sort of union).
Then if we wanted to support pre-ballooning domain 0 via hypervisor only
interfaces for some reason in the future then we would need to add a new
option dom0_maxmem= or so which populated the result of
XENMEM_memory_map with the appropriate size. I think this would be
consistent with the behaviour for a non-privileged domain, dom0_mem= and
dom0_maxmem= would correspond to memory= and maxmem= in a domU
However, although I think that the Right Thing, I don't think having
domain 0 cut off its e820 at nr_pages unless overridden by mem= would be
a problem in practice and it certainly wins in terms of complexity of
reconciling XENMEM_memory_map and XENMEM_machine_memory_map.
Xen-devel mailing list