This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


re: [Xen-devel] Xen balloon driver discuss

Thank you for your kind help.

In your last mail, you mentioned that the balloon driver will make pod.entry_count
equal to cache_size as soon as it starts to work at guest startup.
From my understanding, if we start a guest such as:

xm cr xxx.hvm maxmem=2048 memory=512

then we should set /local/domain/did/memory/target to 522240 ( (512M - 2M)
* 1024; the 2M is for VGA, from your other patch? )
to tell the balloon driver in the guest to inflate, right? And when the balloon
driver has ballooned the guest down to this target,
I think pod.entry_count will equal cache_size, right?

I did some experiments on this, but the results show otherwise.

Step 1.
xm cr xxx.hvm maxmem=2048 memory=512

at the very beginning, I printed out the domain's tot_pages, 132088;
pod.entry_count 523776, i.e. 2046M; pod.count 130560, i.e. 510M:

(XEN) tot_pages 132088 pod_entries 523776 pod_count 130560

Currently, /local/domain/did/memory/target will by default be written to

After the guest starts up, the balloon driver balloons; when it finishes, I can
see pod.entry_count reduced to 23552, pod.count 14063:

(XEN)     DomPage list too long to display
(XEN) Tot pages 132088  PoD entries=23552 cachesize=14063

Step 2.

In my understanding, /local/domain/did/memory/target should be at least 510
* 1024, and then pod.entry_count will equal cache_size.

I used 500, so I did: xm mem-set domain_id 500

Then I can see pod.entry_count reduced to 22338, pod.count 15921, still not equal:

(XEN) Memory pages belonging to domain 4:
(XEN)     DomPage list too long to display
(XEN) Tot pages 132088  PoD entries=22338 cachesize=15921

Step 3. 

Only after I did: xm mem-set domain_id 470
did pod.entry_count become equal to pod.count:
(XEN)     DomPage list too long to display
(XEN) Tot pages 130825  PoD entries=14677 cachesize=14677

Later, from the code, I learned that those two values are forced to be equal:

out_entry_check:
    /* If we've reduced our "liabilities" beyond our "assets", free some */
    if ( p2md->pod.entry_count < p2md->pod.count )
    {
        p2m_pod_set_cache_target(d, p2md->pod.entry_count);
    }

So, in conclusion, it looks like something is going wrong: the PoD entries
should equal cachesize (pod.count)
as soon as the balloon driver inflates to max - target, right?

Many thanks.

From: George Dunlap [mailto:George.Dunlap@xxxxxxxxxxxxx]
To: hotmaim
Cc: Chu Rui; xen-devel@xxxxxxxxxxxxxxxxxxx; Dan Magenheimer
Subject: Re: [Xen-devel] Xen balloon driver discuss

On 29/11/10 15:41, hotmaim wrote:
>          Appreciate the details; I have more understanding, but still
> have some confusions.
>     1. Is it necessary to balloon to max-target right at dom U startup?
> (for xm cr xxx.hvm maxmem=2048 memory=1024, max-target is
> 2048-1024.) Say, is it safe to balloon to let the guest have only 512M memory
> in total? Or 1536M (in this situation, I guess the PoD entries will
> also reduce, and an extra 512M of memory will be added to the PoD cache, right)?

I'm sorry, I can't figure out what you mean.  The tools will set
"target" to the value of "memory".  The balloon driver is supposed to
see how many total pages it has (2048M) and "inflate" the balloon until
the number of pages is at the target (1024M in your example above).

> 2. Suppose we have a Xen-wide PoD memory pool that is accessible to
> every guest domain: when a guest needs a page, it gets the page from the
> pool, and we can still use the
> balloon strategy to have the guest free pages back to the pool.  So if the
> amount of all domain
> memory in use is less than the host physical memory, it is safe.  And when no
> memory is available from the
> host, a domain needing new memory may pause, waiting for others to free some,
> or use swap memory; is it possible?

We already have a pool of free memory accessible to all the guest
domains: It's called the Xen free page list. :-)

One of the explicit purposes of PoD is to set aside a fixed amount of
memory for a guest, so that no other domains / processes can claim it.
The guest is guaranteed that memory and, as long as it has a working balloon
driver, shouldn't have any issues using it properly.  Sharing it with
other VMs would undermine this, and make it pretty much the same as the
Xen free page list.

I'm not an expert in tmem, but as I understand it, the whole point of
tmem is to use knowledge of the guest OS to be able to throw away
certain data.  You can't get guest-specific knowledge without modifying
the guest OS to have it tell Xen somehow.

It sounds like what you're advocating is *allocate*-on-demand (as
opposed to PoD, which allocates all the memory at the beginning but
*populates* the p2m table on demand): tell all the guests they have more
memory than is available total, assuming that only some of them are
going to try to use all of it; and allocating the memory as it's used.
This works well for processes, but operating systems are typically built
with the assumption that memory not used is memory completely wasted.
They therefore keep disk cache pages and unused memory pages around
"just in case", and I predict that any guest which has an active
workload will eventually use all the memory it's been told it has, even
if it's only actively using a small portion of it.  At that point, Xen
will be forced to try to guess which page is the least important to have
around and swap it out.

Alternately, the tools could slowly balloon down all of the guests as
the memory starts to run out; but then you have a situation where the
guest that gets the most memory is the one that touched it first, not
the one which actually needs it.

At any rate, PoD is meant to solve exactly one problem: booting
"ballooned".  At the moment it doesn't lend itself to other solutions.


> On 2010-11-29 19:19, George Dunlap <George.Dunlap@xxxxxxxxxxxxx> wrote:
>> On 29/11/10 10:55, tinnycloud wrote:
>>> So that is, if we run out of PoD cache before balloon works, Xen will
>>> crash domain(goto out_of_memory),
>> That's right; PoD is only meant to allow a guest to run from boot until
>> the balloon driver can load.  It's to allow a guest to "boot ballooned."
>>> and at this situation, domain U swap(dom U can’t use swap memory) is
>>> available , right?
>> I don't believe swap and PoD are integrated at the moment, no.
>>> And when balloon actually works, the pod cached will finally decrease to
>>> 0, and no longer be used any more, right?
>> Conceptually, yes.  What actually happens is that ballooning will reduce
>> it so that pod_entries==cache_size.  Entries will stay PoD until the
>> guest touches them.  It's likely that eventually the guest will touch
>> all the pages, at which point the PoD cache will be 0.
>>> could we use this method to implement a tmem like memory overcommit?
>> PoD does require guest knowledge -- it requires the balloon driver to be
>> loaded soon after boot so that the guest will limit its memory usage.
>> It also doesn't allow overcommit.  Memory in the PoD cache is already
>> allocated to the VM, and can't be used for something else.
>> You can't overcommit without either:
>> * The guest knowing that it might not get the memory back, and being OK
>> with that (tmem), or
>> * Swapping, which doesn't require PoD at all.
>> If you're thinking about scanning for zero pages and automatically
>> reclaiming them, for instance, you have to be able to deal with a
>> situation where the guest decides to use a page you've reclaimed but
>> you've already given your last free page to someone else, and there are
>> no more zero pages anywhere on the system.  That would mean either just
>> pausing the VM indefinitely, or choosing another guest page to swap out.
>> -George
>>> *From:* Chu Rui [mailto:ruichu@xxxxxxxxx]
>>> *To:* tinnycloud
>>> *Cc:* xen-devel@xxxxxxxxxxxxxxxxxxx; George.Dunlap@xxxxxxxxxxxxx;
>>> dan.magenheimer@xxxxxxxxxx
>>> *Subject:* Re: [Xen-devel] Xen balloon driver discuss
>>> I am also interested in tinnycloud's problem.
>>> It looks like the PoD cache has been used up, like this:
>>> if ( p2md->pod.count == 0 )
>>> goto out_of_memory;
>>> George, would you please take a look at this problem and, if possible,
>>> tell a little more about what the PoD cache means? Is it a memory pool
>>> for PoD allocation?

Xen-devel mailing list