On 09/14/2010 02:07 AM, Ian Campbell wrote:
> On Mon, 2010-09-13 at 23:51 +0100, Jeremy Fitzhardinge wrote:
>> On 09/13/2010 02:17 PM, Dan Magenheimer wrote:
>>>> As a side-effect, it also works for dom0. If you set dom0_mem on the
>>>> Xen command line, then nr_pages is limited to that value, but the kernel
>>>> can still see the system's real E820 map, and therefore adds all the
>>>> system's memory to its own balloon driver, potentially allowing dom0 to
>>>> expand up to take all physical memory.
>>>> However, this may cause bad side-effects if your system memory is much
>>>> larger than your dom0_mem, especially if you use a 32-bit dom0. I may
>>>> need to add a kernel command line option to limit the max initial
>>>> balloon size to mitigate this...
>>> I would call this dom0 functionality a bug. I think both Citrix
>>> and Oracle use dom0_mem as a normal boot option for every
>>> installation and, while I think both employ heuristics to choose
>>> a larger dom0_mem for larger physical memory, I don't think it
>>> grows large enough for, say, >256GB physical memory, to accommodate
>>> the necessarily large number of page tables.
>>> So, I'd vote for NOT allowing dom0 to balloon up to physical
>>> memory if dom0_mem is specified, and possibly a kernel command
>>> line option that allows it to grow beyond. Or, possibly, no
>>> option and never allow dom0 memory to grow beyond dom0_mem
>>> unless (possibly) it grows with hot-plug.
>> Yes, it's a bit of a problem. The trouble is that the kernel can't
>> really distinguish the two cases; either way, it sees a Xen-supplied
>> xen_start_info->nr_pages as the amount of initial memory available, and
>> an E820 table referring to more RAM beyond that.
>> I guess there are three options:
>> 1. add a "xen_maxmem" (or something) kernel parameter to override
>> space specified in the E820 table
>> 2. ignore E820 if it's a privileged domain
> As it stands I don't think it is currently possible to boot any domain 0
> kernel pre-ballooned other than by using the native mem= option.
> I think the Right Thing to do would be for privileged domains to combine
> the results of XENMEM_machine_memory_map (real e820) and
> XENMEM_memory_map (pseudo-physical "e820") by clamping the result of
> XENMEM_machine_memory_map at the maximum given in XENMEM_memory_map (or
> taking some sort of union).
Does the dom0 domain builder bother to set a pseudo-phys E820?
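
To make the clamping idea above concrete, here is a minimal Python model of it
(not kernel code). The names E820Entry and clamp_machine_map are made up for
illustration; in a real kernel the two maps would come from the
XENMEM_machine_memory_map and XENMEM_memory_map hypercalls. It assumes that only
RAM regions are clamped and that reserved regions pass through unchanged, which
is one possible reading of "clamping" rather than a settled design.

```python
from typing import List, NamedTuple

E820_RAM = 1       # usable RAM (standard E820 type code)
E820_RESERVED = 2  # reserved by firmware (standard E820 type code)


class E820Entry(NamedTuple):  # hypothetical container, for illustration only
    addr: int   # start address in bytes
    size: int   # length in bytes
    type: int   # E820 type code


def clamp_machine_map(machine: List[E820Entry],
                      pseudo_phys: List[E820Entry]) -> List[E820Entry]:
    """Clamp RAM in the machine map at the top of the pseudo-physical map."""
    # Highest RAM address the domain was told about (0 if there is no RAM).
    limit = max((e.addr + e.size for e in pseudo_phys if e.type == E820_RAM),
                default=0)
    clamped = []
    for e in machine:
        if e.type != E820_RAM:
            clamped.append(e)   # assumption: non-RAM regions pass through
        elif e.addr >= limit:
            continue            # RAM entirely above the limit is dropped
        else:
            end = min(e.addr + e.size, limit)  # truncate a straddling region
            clamped.append(E820Entry(e.addr, end - e.addr, e.type))
    return clamped


GiB = 1 << 30
# Made-up example: 63 GiB of host RAM, domain limited to 2 GiB.
machine_map = [
    E820Entry(0, 3 * GiB, E820_RAM),
    E820Entry(3 * GiB, 1 * GiB, E820_RESERVED),
    E820Entry(4 * GiB, 60 * GiB, E820_RAM),
]
pseudo_phys_map = [E820Entry(0, 2 * GiB, E820_RAM)]
result = clamp_machine_map(machine_map, pseudo_phys_map)
```

With these made-up maps, the first RAM region is truncated to 2 GiB, the
reserved hole is kept, and the RAM above 4 GiB disappears from the result.
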
> However, although I think that is the Right Thing, I don't think having
> domain 0 cut off its e820 at nr_pages unless overridden by mem= would be
> a problem in practice and it certainly wins in terms of complexity of
> reconciling XENMEM_memory_map and XENMEM_machine_memory_map.
Indeed. I think adding a general 32x limit between base and max size will
prevent a completely unusable system, and then just suggest using mem=
to control that more precisely (esp for dom0).
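
As a rough sketch of the arithmetic only (Python, not kernel code): the maximum
size would be the smaller of what the E820 map describes and 32 times the
initial allocation, with mem= applied as a further cap. The function name, the
constant name and the exact way mem= combines with the ratio are assumptions
for illustration.

```python
from typing import Optional

MAX_BALLOON_RATIO = 32   # the "32x" factor; the constant name is hypothetical
PAGE_SIZE = 4096         # assuming 4 KiB pages


def max_domain_pages(nr_pages: int, e820_pages: int,
                     mem_limit_pages: Optional[int] = None) -> int:
    """Upper bound on domain size in pages under a base-to-max ratio limit."""
    # Never claim more than the E820 map describes, nor more than 32x base.
    limit = min(e820_pages, nr_pages * MAX_BALLOON_RATIO)
    # Assumption: an explicit mem= value acts as a further cap.
    if mem_limit_pages is not None:
        limit = min(limit, mem_limit_pages)
    return limit


GiB = 1 << 30
base = 1 * GiB // PAGE_SIZE  # e.g. dom0_mem=1G gives 262144 pages

big_host = max_domain_pages(base, 256 * GiB // PAGE_SIZE)    # ratio wins
small_host = max_domain_pages(base, 16 * GiB // PAGE_SIZE)   # E820 wins
with_mem = max_domain_pages(base, 256 * GiB // PAGE_SIZE,
                            mem_limit_pages=4 * GiB // PAGE_SIZE)  # mem= wins
```

So with a 1 GiB base on a 256 GiB host the cap works out to 32 GiB, on a 16 GiB
host it is just the 16 GiB the E820 map describes, and mem=4G would pull it
down to 4 GiB.
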
Xen-devel mailing list