xen-devel

Re: [Xen-devel] questions about ballooning

To: weiming <zephyr.zhao@xxxxxxxxx>
Subject: Re: [Xen-devel] questions about ballooning
From: Daniel Stodden <stodden@xxxxxxxxxx>
Date: Sun, 04 Nov 2007 18:04:10 +0100
Cc: Xen Developers <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Sun, 04 Nov 2007 09:05:00 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <add59a3f0711040734i657f2723k1f66e637ae823b1e@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Fakultät für Informatik I10, Technische Universität München
References: <add59a3f0711031831x71006b1cl190aacd57156133b@xxxxxxxxxxxxxx> <1194187887.15096.1.camel@xxxxxxxxxxxxxxxxxxxx> <add59a3f0711040734i657f2723k1f66e637ae823b1e@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Sun, 2007-11-04 at 10:34 -0500, weiming wrote:
> Hi Daniel,
> 
> I really appreciate your detailed explanation. You clarified some of
> my confusion. Before I posted my question here, I read the papers
> "Memory Resource Management in VMware ESX Server" and the "Art of
> Virtualization", read the Xen manual, checked the Xen wiki and
> searched the mailing list archive, but couldn't get a complete picture
> of the balloon.
> 
> 1) When a guest OS starts, how does it determine the amount of
> physical memory? i.e. which value determines the number of entries in
> mem_map? Is it the value specified in the configuration file?

one page of initial domain memory is dedicated to a 'start_info'
structure. you may grep the xen/include sources for the definition.

then see e.g. linux-2.6-xen-sparse/arch/i386/kernel/setup-xen.c

as i believe you already understood, there are two important
distinctions here:
- 'nr_pages': the size of the physical address range which the domain
  can use. that's basically the maximum memory, different
  from what the domain actually gets.
- 'reservation': the amount in nr_pages actually filled with machine
  memory.

nr_pages is in start info, as is the frame list corresponding to the
initial reservation set by the domain builder. the domain builder gets
nr_pages from the memory= field in the configuration file.
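
for reference, the relevant bits of start_info look roughly like this
(abridged and written down from memory -- see xen/include/public/xen.h
for the authoritative layout):

struct start_info {
    char magic[32];            /* "xen-<version>-<platform>"              */
    unsigned long nr_pages;    /* total pages allocated to this domain,   */
                               /* i.e. the physical range the guest sees  */
    unsigned long shared_info; /* machine address of the shared info page */
    uint32_t flags;
    /* ... console and xenstore channel fields elided ...                 */
    unsigned long pt_base;     /* virtual address of the boot page tables */
    unsigned long nr_pt_frames;
    unsigned long mfn_list;    /* virtual address of the page-frame list  */
                               /* backing the initial reservation         */
    /* ... module and command line fields elided ...                      */
};

iirc setup-xen.c picks up nr_pages from this structure to size the
physical map, which answers your mem_map question.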

not sure how this bootstraps with the balloon. e.g. i'm not sure
whether the whole initial memory is allocated and then returned again
only upon demand, or whether the initial reservation starts below full
memory and is only grown by the balloon. i believe the former is the
case. maybe someone else can comment (please).

> 2) What's the exact role that xm mem-max plays?  I can set it to be
> higher than the value in the configuration file.  I think that it just
> sets the "new_target" for the balloon via xenbus or /proc/xen/balloon,
> right? 

you can tell the domain its physical limit is 1G. that's e.g. what the
guest storage allocator then uses to initialize itself. but you can just
as well go afterwards and set a configurable limit below that hard
limit. it's then up to the balloon to wring the memory out of the guest
system.
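
the plumbing on the guest side is a xenstore watch on memory/target.
roughly, from memory (the real thing lives in
linux-2.6-xen-sparse/drivers/xen/balloon/balloon.c, so don't take this
verbatim):

/* fires whenever the tools rewrite memory/target in xenstore */
static void watch_target(struct xenbus_watch *watch,
                         const char **vec, unsigned int len)
{
    unsigned long long new_target;
    int err;

    err = xenbus_scanf(XBT_NIL, "memory", "target", "%llu", &new_target);
    if (err != 1)
        return;                     /* no target written yet */

    /* the target is in KiB; convert to pages and let the balloon
       work the reservation towards it (set_new_target() is the
       driver's internal helper) */
    set_new_target(new_target >> (PAGE_SHIFT - 10));
}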

why higher values get accepted i cannot comment on. maybe clipped
without further comment?

see, the kernel can free a lot of memory even when it is 'in use' by
swapping it to disk. that's one of the basic ideas of having a balloon
driver: do not build your own page replacement, but put pressure on
the existing guest memory management to do it for you. that is what the
call to alloc_page() in the balloon driver is essentially doing. otoh,
memory in use by the kernel cannot be swapped. that's why the pages
grabbed by the balloon itself remain safe: that memory must be locked.
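
condensed, the inflate path looks about like this (paraphrased from
balloon.c, details from memory, so treat it as a sketch; frame_list[]
is a file-scope scratch array in the driver):

/* inflate the balloon by nr_pages: grab pages from the guest allocator
   and return the backing machine frames to xen */
static int decrease_reservation(unsigned long nr_pages)
{
    struct xen_memory_reservation reservation = {
        .extent_order = 0,
        .domid        = DOMID_SELF
    };
    unsigned long i, pfn;
    struct page *page;

    for (i = 0; i < nr_pages; i++) {
        /* this allocation is the "pressure": the guest must reclaim
           or swap to satisfy it, just as for any other allocation */
        page = alloc_page(GFP_HIGHUSER);   /* GFP_BALLOON in the source */
        if (page == NULL)
            break;
        pfn = page_to_pfn(page);
        frame_list[i] = pfn_to_mfn(pfn);
        /* the page now belongs to the balloon: pinned, never swapped */
        set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
        balloon_append(page);
    }

    set_xen_guest_handle(reservation.extent_start, frame_list);
    reservation.nr_extents = i;
    HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
    return 0;
}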

but, again, i should admit that my own understanding gets a bit fuzzy
here, regarding which is which in the config file and within xm
parameters. you're right in that the communication is performed via
xenbus. i spent more time reading xen and kernel code than the
surrounding python source. maybe someone else can comment better or
(hopefully) correct me if i'm talking rubbish somewhere above. 

send me an update once you hit it. :)

> 3) Once some pages are "ballooned out", these pages will be utilized
> by other domains, so if we later try to restore the initial state, how
> does the VMM find available pages? 

the memory gets balanced by asking other domains to decrease their
reservation. you can wrench down the domain to the bare kernel when
you're a driver. the kernel gives you anything you ask for --
physically. or, rather, until the linux oom killer kicks in, a rather
undesirable side effect, as a recent thread on this list discussed.

> In increase_reservation(), 
> ...
> rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation)
> if (rc < nr_pages)
>  ...
> 
> In my understanding, hypervisor *tries* to find some free pages to
> return to the os. 

yes, this can fail. there's no fundamental (i.e. consistency) problem in
failing. the kernel will find that all memory is in use, as it will on a
native system if something grabbed all the memory. so, there's a balloon
driver saying "admittedly, i'm presently sitting on 80% of all memory,
but now it's mine. you modprobed me, so you trust me, now go look
somewhere else". a native driver would not even have been asked.
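
and the deflate path, including the short-return case you quote, goes
about like this (again a from-memory sketch, not the literal source):

static int increase_reservation(unsigned long nr_pages)
{
    struct xen_memory_reservation reservation = {
        .extent_order = 0,
        .domid        = DOMID_SELF
    };
    unsigned long i, pfn;
    struct page *page;
    long rc;

    set_xen_guest_handle(reservation.extent_start, frame_list);
    reservation.nr_extents = nr_pages;
    rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
    if (rc < 0)
        return rc;                  /* xen granted nothing at all */

    /* rc may well be smaller than nr_pages: xen only had that many
       free frames. the rest stays ballooned out and is retried later. */
    for (i = 0; i < rc; i++) {
        page = balloon_retrieve();  /* a page parked in the balloon */
        pfn  = page_to_pfn(page);
        set_phys_to_machine(pfn, frame_list[i]);
        __free_page(page);          /* hand it back to the guest allocator */
    }
    return 0;
}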

> 4)  In balloon.c, there are some functions whose calling sites I
> can't find. They are dealloc_pte_fn, alloc_empty_pages_and_pagevec,

no idea.

>  balloon_update_driver_allowance,

this one i can explain. there's memory apart from the balloon entering
and leaving the domU. that's 'I/O memory' which is moved between
frontend and backend drivers to transfer data. for both receiving and
sending, the domU is required to take memory from its own reservation.
so it hands these pages over to the backend driver domain and gets them
back only once the backend is finished with the transfer (i.e.
map/unmapping, similar to ballooning). the balloon driver accounts for
this memory, so frontends call this function to tell it about it.
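
the function itself is pure book-keeping; as far as i remember it is
essentially just this:

/* frontends lend 'delta' pages of their reservation to a backend for
   I/O and report it here, so the balloon's accounting stays honest */
void balloon_update_driver_allowance(long delta)
{
    unsigned long flags;

    spin_lock_irqsave(&balloon_lock, flags);
    driver_pages += delta;   /* positive when granting, negative on return */
    spin_unlock_irqrestore(&balloon_lock, flags);
}

netfront, for instance, calls it with a positive delta when it posts
receive pages to the backend and with a negative delta when it gets
them back.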

>  etc. Are they called back by the hypervisor?

they are certainly not immediate callback functions. control transfers
into the guest, if they are initiated by the hypervisor, are always done
via event channels. the only other path would be an iret. that means
there's no 'call' between xen and guests. those symbols you see must be
in use somewhere.
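
for illustration, hooking a handler to an event channel from a guest
driver looks roughly like this (signature as i remember it from the
2.6.18 sparse tree; the my_* names are made up):

/* the handler runs as a normal guest interrupt handler; there is no
   direct call from xen into guest code */
static irqreturn_t my_evtchn_interrupt(int irq, void *dev_id,
                                       struct pt_regs *regs)
{
    /* react to whatever the backend or xen signalled */
    return IRQ_HANDLED;
}

static int my_bind(unsigned int evtchn, void *dev)
{
    return bind_evtchn_to_irqhandler(evtchn, my_evtchn_interrupt,
                                     0, "my-device", dev);
}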

regards,
daniel

-- 
Daniel Stodden
LRR     -      Lehrstuhl für Rechnertechnik und Rechnerorganisation
Institut für Informatik der TU München             D-85748 Garching
http://www.lrr.in.tum.de/~stodden         mailto:stodden@xxxxxxxxxx
PGP Fingerprint: F5A4 1575 4C56 E26A 0B33  3D80 457E 82AE B0D8 735B



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel