
Re: [Xen-devel] initial ballooning amount on HVM+PoD



>>> On 17.01.14 at 16:54, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 01/17/2014 09:33 AM, Jan Beulich wrote:
>> While looking into Jürgen's issue with PoD setup causing soft lockups
>> in Dom0 I realized that what I did in linux-2.6.18-xen.hg's c/s
>> 989:a7781c0a3b9a ("xen/balloon: fix balloon driver accounting for
>> HVM-with-PoD case") just doesn't work - the BUG_ON() added there
>> triggers as soon as there's a reasonable amount of excess memory.
>> And that is despite me knowing that I spent a significant amount
>> of time testing that change - I must have tested something other
>> than what finally got checked in, or must have screwed up in some
>> other way.
>> Extremely embarrassing...
>>
>> In the course of finding a proper solution I soon stumbled across
>> upstream's c275a57f5e ("xen/balloon: Set balloon's initial state to
>> number of existing RAM pages"), and hence went ahead and
>> compared three different calculations for initial bs.current_pages:
>>
>> (a) upstream's (open coding get_num_physpages(), as I did this on
>>      an older kernel)
>> (b) plain old num_physpages (equaling the maximum RAM PFN)
>> (c) XENMEM_get_pod_target output (with the hypervisor altered
>>      to not refuse this for a domain doing it on itself)
>>
>> The fourth (original) method, using totalram_pages, was already
>> known to result in the driver not ballooning down enough, and
>> hence setting up the domain for an eventual crash when the PoD
>> cache runs empty.
>>
>> Interestingly, (a) too results in the driver not ballooning down
>> enough - there's a gap of exactly as many pages as are marked
>> reserved below the 1Mb boundary. Therefore the aforementioned
>> upstream commit is presumably broken.
>>
>> Short of a reliable (and ideally architecture independent) way of
>> knowing the necessary adjustment value, the next best solution
>> (not ballooning down too little, but also not ballooning down much
>> more than necessary) turns out to be using the minimum of (b)
>> and (c): When the domain only has memory below 4Gb, (b) is
>> more precise, whereas in the other cases (c) gets closest.
> 
> I am not sure I understand why (b) would be the right answer for 
> less-than-4G guests. The reason for the c275a57f5e patch was that
> max_pfn includes MMIO space (which is not RAM) and thus the driver
> will unnecessarily balloon down that much memory.

max_pfn/num_physpages isn't that far off for guests with less than
4Gb; the number calculated from the PoD data is a little worse.
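
To make the above concrete, the selection logic amounts to roughly
this (nothing more than a sketch; xen_pod_pages() is a placeholder
for whatever value we end up deriving from the XENMEM_get_pod_target
output, which is exactly the open question - see the second sketch
further down):

static unsigned long __init balloon_initial_pages(void)
{
	/* (b): plain old num_physpages, i.e. the maximum RAM PFN. */
	unsigned long pages = num_physpages;
	/* (c): PoD-derived estimate; 0 if the query isn't available. */
	unsigned long pod = xen_pod_pages();

	/*
	 * Take the smaller of the two, so that we never balloon down
	 * too little, while also not ballooning down much more than
	 * necessary.
	 */
	if (pod && pod < pages)
		pages = pod;

	return pages;
}

bs.current_pages would then get set from this instead of from
totalram_pages.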

>> The question now is: considering that (a) is broken (and hard to
>> fix) and that (b) presumably leads to too much ballooning down in
>> a large share of practical cases, shouldn't we open up
>> XENMEM_get_pod_target for domains to query on themselves?
>> Alternatively, can anyone see another way to calculate a
>> reasonably precise value?
> 
> I think a hypervisor query is a good thing, although I don't know whether 
> exposing PoD-specific data (count and entry_count) to the guest is 
> necessary. It's probably OK (or we can set these fields to zero for 
> non-privileged domains).

That's pointless then - if no useful data is provided through the
call to non-privileged domains, we may as well keep it erroring for
them.
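
For completeness, the kind of self-query I have in mind would look
roughly like the sketch below (again just a sketch: the structure
layout and the XENMEM_get_pod_target value are copied from Xen's
public memory.h, which the kernel's copy of the interface headers
doesn't carry, the header paths are as in current mainline, and it of
course assumes the hypervisor change making the call succeed for
DOMID_SELF):

#include <linux/types.h>		/* uint64_t */
#include <xen/interface/xen.h>		/* domid_t, DOMID_SELF */
#include <asm/xen/hypercall.h>		/* HYPERVISOR_memory_op() */

#define XENMEM_get_pod_target	17	/* from Xen's public memory.h */

struct xen_pod_target {
	uint64_t target_pages;		/* IN */
	uint64_t tot_pages;		/* OUT */
	uint64_t pod_cache_pages;	/* OUT */
	uint64_t pod_entries;		/* OUT */
	domid_t domid;			/* IN */
};

/*
 * Placeholder helper used in the earlier sketch; returns 0 when the
 * query isn't available, so the caller can fall back to num_physpages.
 */
static unsigned long __init xen_pod_pages(void)
{
	struct xen_pod_target pod = { .domid = DOMID_SELF };

	if (HYPERVISOR_memory_op(XENMEM_get_pod_target, &pod) != 0)
		return 0;

	/*
	 * How exactly to combine tot_pages / pod_entries /
	 * pod_cache_pages into the initial bs.current_pages is the
	 * open question; tot_pages alone is used here merely as an
	 * example.
	 */
	return pod.tot_pages;
}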

Jan


 

