[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [PATCH v2] [linux-3.0+ for xen] tmem: self-ballooning and frontswap-selfshrinking



Here is V2 of the Xen tmem selfballooning/frontswap-selfshrinking
patch for review.  Many thanks to the reviewers of the V1
for your excellent comments and feedback.

Major changes since V1 (most suggested by Ian Campbell):
- nearly all code moved from *balloon.c to xen-selfballoon.c
- accounting fixes found by Daniel Kiper
- alternate access to vm_committed_as; now no VM core changes
- macros for repetitive sysfs code
- rebased to 3.0-rc1
- passes checkpatch.pl :-}

This code complements the cleancache and frontswap patchsets to
optimize support for Xen Transcendent Memory.  The policy it
implements is rudimentary, so will almost certainly need to
improve over time, but it does work well enough today.

Thanks,
Dan

P.S. Anybody who actually uses this patch is encouraged to read
http://oss.oracle.com/projects/tmem/dist/documentation/internals/linuxpatch 
from which the following brief documentation was extracted,
which will probably be the git commit comment.

=

[PATCH] [linux-2.6.39.x for xen] tmem: self-ballooning and 
frontswap-selfshrinking

Selfballooning creates memory pressure by using the Xen balloon driver
to decrease and increase available kernel memory, tracking a target value
of "Committed_AS" (see /proc/meminfo).  Sysfs tunables are provided to
adjust the frequency at which memory size is adjusted, the rate at
which ballooning is used to approach the target, and to disable
selfballooning completely.  As of 100827, selfballooning runs in a
separate kernel thread, "selfballooning".

Frontswap shrinking accounts for the fact that pages swapped to disk
may sit on disk for a very long time, even if very rarely used or if
the kernel knows they will never be used again.  This is because
the kernel need not reclaim disk space because the disk space costs
very little and can be overwritten when necessary.  When the same
pages are in frontswap, however, they are taking up valuable RAM
real estate.  So when frontswap activity is otherwise stable and the
guest kernel is not under memory pressure, the frontswap shrinker
removes some pages from frontswap and returns them to kernel memory.
Sysfs tunables are provided to adjust the frequency of shrinking
opportunities and the shrinking rate, and to create more "inertia".

Signed-off-by: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>

Diffstat:
 drivers/xen/Kconfig           |   10 +
 drivers/xen/Makefile          |    1 
 drivers/xen/xen-balloon.c     |    2 
 drivers/xen/xen-selfballoon.c |  350 ++++++++++++++++++++++++++++++++++++++++++
 include/xen/balloon.h         |   10 +
 include/xen/tmem.h            |    4 
 6 files changed, 377 insertions(+)

diff -Napur -X linux-3.0-rc1/Documentation/dontdiff -x scripts/selinux/mdp/mdp 
linux-3.0-rc1-frontswap/drivers/xen/Kconfig 
linux-3.0-rc1-frontswap-selfballoon/drivers/xen/Kconfig
--- linux-3.0-rc1-frontswap/drivers/xen/Kconfig 2011-05-31 17:09:07.606910896 
-0600
+++ linux-3.0-rc1-frontswap-selfballoon/drivers/xen/Kconfig     2011-06-16 
11:39:14.235875542 -0600
@@ -9,6 +9,16 @@ config XEN_BALLOON
          the system to expand the domain's memory allocation, or alternatively
          return unneeded memory to the system.
 
+config XEN_SELFBALLOONING
+       bool "dynamically self-balloon kernel memory to target"
+       depends on XEN && XEN_BALLOON && CLEANCACHE
+       default y
+       help
+         Self-ballooning dynamically balloons available kernel memory driven
+         by the current usage of anonymous memory ("committed AS") and
+         controlled by various sysfs-settable parameters.  May be overridden
+         by the noselfballooning kernel boot parameter
+
 config XEN_SCRUB_PAGES
        bool "Scrub pages before returning them to system"
        depends on XEN_BALLOON
diff -Napur -X linux-3.0-rc1/Documentation/dontdiff -x scripts/selinux/mdp/mdp 
linux-3.0-rc1-frontswap/drivers/xen/Makefile 
linux-3.0-rc1-frontswap-selfballoon/drivers/xen/Makefile
--- linux-3.0-rc1-frontswap/drivers/xen/Makefile        2011-05-31 
17:09:57.006875306 -0600
+++ linux-3.0-rc1-frontswap-selfballoon/drivers/xen/Makefile    2011-06-16 
11:39:06.123852289 -0600
@@ -8,6 +8,7 @@ obj-$(CONFIG_BLOCK)                     += biomerge.o
 obj-$(CONFIG_HOTPLUG_CPU)              += cpu_hotplug.o
 obj-$(CONFIG_XEN_XENCOMM)              += xencomm.o
 obj-$(CONFIG_XEN_BALLOON)              += xen-balloon.o
+obj-$(CONFIG_XEN_SELFBALLOONING)       += xen-selfballoon.o
 obj-$(CONFIG_XEN_DEV_EVTCHN)           += xen-evtchn.o
 obj-$(CONFIG_XEN_GNTDEV)               += xen-gntdev.o
 obj-$(CONFIG_XEN_GRANT_DEV_ALLOC)      += xen-gntalloc.o
diff -Napur -X linux-3.0-rc1/Documentation/dontdiff -x scripts/selinux/mdp/mdp 
linux-3.0-rc1-frontswap/drivers/xen/xen-balloon.c 
linux-3.0-rc1-frontswap-selfballoon/drivers/xen/xen-balloon.c
--- linux-3.0-rc1-frontswap/drivers/xen/xen-balloon.c   2011-05-29 
18:43:36.000000000 -0600
+++ linux-3.0-rc1-frontswap-selfballoon/drivers/xen/xen-balloon.c       
2011-06-20 15:37:30.405798859 -0600
@@ -98,6 +98,8 @@ static int __init balloon_init(void)
 
        register_balloon(&balloon_sysdev);
 
+       register_xen_selfballooning(&balloon_sysdev);
+
        target_watch.callback = watch_target;
        xenstore_notifier.notifier_call = balloon_init_watcher;
 
diff -Napur -X linux-3.0-rc1/Documentation/dontdiff -x scripts/selinux/mdp/mdp 
linux-3.0-rc1-frontswap/drivers/xen/xen-selfballoon.c 
linux-3.0-rc1-frontswap-selfballoon/drivers/xen/xen-selfballoon.c
--- linux-3.0-rc1-frontswap/drivers/xen/xen-selfballoon.c       1969-12-31 
17:00:00.000000000 -0700
+++ linux-3.0-rc1-frontswap-selfballoon/drivers/xen/xen-selfballoon.c   
2011-06-20 15:43:08.613818583 -0600
@@ -0,0 +1,350 @@
+/******************************************************************************
+ * Xen selfballoon driver
+ *
+ * Copyright (c) 2009-2011, Dan Magenheimer, Oracle Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+
+#include <xen/balloon.h>
+
+#ifdef CONFIG_FRONTSWAP
+#include <linux/frontswap.h>
+#endif
+#include <xen/tmem.h>
+
+static int xen_selfballooning_enabled __read_mostly;
+static int use_selfballooning __read_mostly = 1;
+static unsigned int selfballoon_interval __read_mostly;
+static unsigned int selfballoon_downhysteresis __read_mostly;
+static unsigned int selfballoon_uphysteresis __read_mostly;
+
+static void selfballoon_process(struct work_struct *work);
+DECLARE_DELAYED_WORK(selfballoon_worker, selfballoon_process);
+
+#ifdef CONFIG_FRONTSWAP
+static bool frontswap_selfshrinking __read_mostly;
+static int use_frontswap_selfshrink __read_mostly = 1;
+static unsigned int frontswap_hysteresis __read_mostly;
+static unsigned int frontswap_inertia __read_mostly;
+
+#define frontswap_selfshrinking_enabled frontswap_selfshrinking
+
+static unsigned long frontswap_inertia_counter;
+
+static void frontswap_selfshrink(void)
+{
+       static unsigned long cur_frontswap_pages;
+       static unsigned long last_frontswap_pages;
+       static unsigned long tgt_frontswap_pages;
+
+       if (!frontswap_selfshrinking)
+               return;
+       if (!frontswap_hysteresis)
+               return;
+       last_frontswap_pages = cur_frontswap_pages;
+       cur_frontswap_pages = frontswap_curr_pages();
+       if (!cur_frontswap_pages ||
+                       (cur_frontswap_pages > last_frontswap_pages)) {
+               frontswap_inertia_counter = frontswap_inertia;
+               return;
+       }
+       if (frontswap_inertia_counter && --frontswap_inertia_counter)
+               return;
+       if (cur_frontswap_pages <= frontswap_hysteresis)
+               tgt_frontswap_pages = 0;
+       else
+               tgt_frontswap_pages = cur_frontswap_pages -
+                       (cur_frontswap_pages / frontswap_hysteresis);
+       frontswap_shrink(tgt_frontswap_pages);
+}
+
+static int __init no_frontswap_selfshrink_setup(char *s)
+{
+       use_frontswap_selfshrink = 0;
+       return 1;
+}
+
+__setup("noselfshrink", no_frontswap_selfshrink_setup);
+#else /* !CONFIG_FRONTSWAP */
+static inline void frontswap_selfshrink(void)
+{
+}
+#define frontswap_selfshrinking_enabled 0
+#endif
+
+static void selfballoon_process(struct work_struct *work)
+{
+       unsigned long cur_pages, goal_pages, tgt_pages;
+       int reset_timer = 0;
+
+       if (xen_selfballooning_enabled) {
+               tgt_pages = balloon_stats.current_pages;
+               cur_pages = balloon_stats.current_pages;
+               goal_pages = percpu_counter_read_positive(&vm_committed_as) +
+                       balloon_stats.current_pages - totalram_pages;
+               if (cur_pages > goal_pages)
+                       tgt_pages = cur_pages -
+                               ((cur_pages - goal_pages) /
+                                 selfballoon_downhysteresis);
+               else if (cur_pages < goal_pages)
+                       tgt_pages = cur_pages +
+                               ((goal_pages - cur_pages) /
+                                 selfballoon_uphysteresis);
+               balloon_set_new_target(tgt_pages);
+               reset_timer = 1;
+       }
+       if (frontswap_selfshrinking_enabled) {
+               frontswap_selfshrink();
+               reset_timer = 1;
+       }
+       if (reset_timer)
+               schedule_delayed_work(&selfballoon_worker,
+                       selfballoon_interval * HZ);
+}
+
+#ifdef CONFIG_SYSFS
+#define SELFBALLOON_SHOW(name, format, args...)                                
\
+       static ssize_t show_##name(struct sys_device *dev,      \
+                                          struct sysdev_attribute *attr, \
+                                          char *buf) \
+       { \
+               return sprintf(buf, format, ##args); \
+       }
+
+SELFBALLOON_SHOW(selfballooning, "%d\n", xen_selfballooning_enabled);
+
+static ssize_t store_selfballooning(struct sys_device *dev,
+                           struct sysdev_attribute *attr,
+                           const char *buf,
+                           size_t count)
+{
+       char *endchar;
+       int was_enabled = xen_selfballooning_enabled;
+
+       if (!capable(CAP_SYS_ADMIN))
+               return -EPERM;
+
+       xen_selfballooning_enabled = !!memparse(buf, &endchar);
+
+       if (!was_enabled && xen_selfballooning_enabled)
+               schedule_delayed_work(&selfballoon_worker,
+                       selfballoon_interval * HZ);
+
+       return count;
+}
+
+static SYSDEV_ATTR(selfballooning, S_IRUGO | S_IWUSR,
+                  show_selfballooning, store_selfballooning);
+
+SELFBALLOON_SHOW(selfballoon_interval, "%d\n", selfballoon_interval);
+
+static ssize_t store_selfballoon_interval(struct sys_device *dev,
+                                         struct sysdev_attribute *attr,
+                                         const char *buf,
+                                         size_t count)
+{
+       char *endchar;
+
+       if (!capable(CAP_SYS_ADMIN))
+               return -EPERM;
+       selfballoon_interval = memparse(buf, &endchar);
+       return count;
+}
+
+static SYSDEV_ATTR(selfballoon_interval, S_IRUGO | S_IWUSR,
+                  show_selfballoon_interval, store_selfballoon_interval);
+
+SELFBALLOON_SHOW(selfballoon_downhys, "%d\n", selfballoon_downhysteresis);
+
+static ssize_t store_selfballoon_downhys(struct sys_device *dev,
+                                        struct sysdev_attribute *attr,
+                                        const char *buf,
+                                        size_t count)
+{
+       char *endchar;
+
+       if (!capable(CAP_SYS_ADMIN))
+               return -EPERM;
+       selfballoon_downhysteresis = memparse(buf, &endchar);
+       return count;
+}
+
+static SYSDEV_ATTR(selfballoon_downhysteresis, S_IRUGO | S_IWUSR,
+                  show_selfballoon_downhys, store_selfballoon_downhys);
+
+
+SELFBALLOON_SHOW(selfballoon_uphys, "%d\n", selfballoon_uphysteresis);
+
+static ssize_t store_selfballoon_uphys(struct sys_device *dev,
+                                      struct sysdev_attribute *attr,
+                                      const char *buf,
+                                      size_t count)
+{
+       char *endchar;
+
+       if (!capable(CAP_SYS_ADMIN))
+               return -EPERM;
+       selfballoon_uphysteresis = memparse(buf, &endchar);
+       return count;
+}
+
+static SYSDEV_ATTR(selfballoon_uphysteresis, S_IRUGO | S_IWUSR,
+                  show_selfballoon_uphys, store_selfballoon_uphys);
+
+#ifdef CONFIG_FRONTSWAP
+SELFBALLOON_SHOW(frontswap_selfshrinking, "%d\n", frontswap_selfshrinking);
+
+static ssize_t store_frontswap_selfshrinking(struct sys_device *dev,
+                                            struct sysdev_attribute *attr,
+                                            const char *buf,
+                                            size_t count)
+{
+       char *endchar;
+       int was_enabled = frontswap_selfshrinking;
+
+       if (!capable(CAP_SYS_ADMIN))
+               return -EPERM;
+
+       frontswap_selfshrinking = !!memparse(buf, &endchar);
+
+       if (!was_enabled && !xen_selfballooning_enabled &&
+                                       frontswap_selfshrinking)
+               schedule_delayed_work(&selfballoon_worker,
+                       selfballoon_interval * HZ);
+
+       return count;
+}
+
+static SYSDEV_ATTR(frontswap_selfshrinking, S_IRUGO | S_IWUSR,
+                  show_frontswap_selfshrinking, store_frontswap_selfshrinking);
+
+SELFBALLOON_SHOW(frontswap_inertia, "%d\n", frontswap_inertia);
+
+static ssize_t store_frontswap_inertia(struct sys_device *dev,
+                                      struct sysdev_attribute *attr,
+                                      const char *buf,
+                                      size_t count)
+{
+       char *endchar;
+
+       if (!capable(CAP_SYS_ADMIN))
+               return -EPERM;
+       frontswap_inertia = memparse(buf, &endchar);
+       frontswap_inertia_counter = frontswap_inertia;
+       return count;
+}
+
+static SYSDEV_ATTR(frontswap_inertia, S_IRUGO | S_IWUSR,
+                  show_frontswap_inertia, store_frontswap_inertia);
+
+SELFBALLOON_SHOW(frontswap_hysteresis, "%d\n", frontswap_hysteresis);
+
+static ssize_t store_frontswap_hysteresis(struct sys_device *dev,
+                                         struct sysdev_attribute *attr,
+                                         const char *buf,
+                                         size_t count)
+{
+       char *endchar;
+
+       if (!capable(CAP_SYS_ADMIN))
+               return -EPERM;
+       frontswap_hysteresis = memparse(buf, &endchar);
+       return count;
+}
+
+static SYSDEV_ATTR(frontswap_hysteresis, S_IRUGO | S_IWUSR,
+                  show_frontswap_hysteresis, store_frontswap_hysteresis);
+
+#endif /* CONFIG_FRONTSWAP */
+
+static struct attribute *selfballoon_attrs[] = {
+       &attr_selfballooning.attr,
+       &attr_selfballoon_interval.attr,
+       &attr_selfballoon_downhysteresis.attr,
+       &attr_selfballoon_uphysteresis.attr,
+#ifdef CONFIG_FRONTSWAP
+       &attr_frontswap_selfshrinking.attr,
+       &attr_frontswap_hysteresis.attr,
+       &attr_frontswap_inertia.attr,
+#endif
+};
+
+static struct attribute_group selfballoon_group = {
+       .name = "selfballoon",
+       .attrs = selfballoon_attrs
+};
+#endif
+
+int register_xen_selfballooning(struct sys_device *sysdev)
+{
+       int error = -1;
+
+#ifdef CONFIG_SYSFS
+       error = sysfs_create_group(&sysdev->kobj, &selfballoon_group);
+#endif
+       return error;
+}
+EXPORT_SYMBOL(register_xen_selfballooning);
+
+static int __init noselfballooning_setup(char *s)
+{
+       use_selfballooning = 0;
+       return 1;
+}
+
+__setup("noselfballooning", noselfballooning_setup);
+
+static int __init xen_selfballoon_init(void)
+{
+       if (!xen_domain())
+               return -ENODEV;
+
+       pr_info("xen/balloon: Initializing Xen selfballooning driver.\n");
+#ifdef CONFIG_FRONTSWAP
+       pr_info("xen/balloon: Initializing frontswap selfshrinking driver.\n");
+#endif
+
+       xen_selfballooning_enabled = tmem_enabled && use_selfballooning;
+       selfballoon_interval = 5;
+       selfballoon_downhysteresis = 8;
+       selfballoon_uphysteresis = 1;
+#ifdef CONFIG_FRONTSWAP
+       frontswap_selfshrinking = use_frontswap_selfshrink && frontswap_enabled;
+       frontswap_hysteresis = 20;
+       frontswap_inertia = 3;
+#endif
+       schedule_delayed_work(&selfballoon_worker, selfballoon_interval * HZ);
+
+       return 0;
+}
+
+subsys_initcall(xen_selfballoon_init);
+
+MODULE_LICENSE("GPL");
diff -Napur -X linux-3.0-rc1/Documentation/dontdiff -x scripts/selinux/mdp/mdp 
linux-3.0-rc1-frontswap/include/xen/balloon.h 
linux-3.0-rc1-frontswap-selfballoon/include/xen/balloon.h
--- linux-3.0-rc1-frontswap/include/xen/balloon.h       2011-05-29 
18:43:36.000000000 -0600
+++ linux-3.0-rc1-frontswap-selfballoon/include/xen/balloon.h   2011-06-20 
14:58:24.975176230 -0600
@@ -23,3 +23,13 @@ void balloon_set_new_target(unsigned lon
 
 int alloc_xenballooned_pages(int nr_pages, struct page** pages);
 void free_xenballooned_pages(int nr_pages, struct page** pages);
+
+struct sys_device;
+#ifdef CONFIG_XEN_SELFBALLOONING
+extern int register_xen_selfballooning(struct sys_device *sysdev);
+#else
+static inline int register_xen_selfballooning(struct sys_device *sysdev)
+{
+       return -1;
+}
+#endif
diff -Napur -X linux-3.0-rc1/Documentation/dontdiff -x scripts/selinux/mdp/mdp 
linux-3.0-rc1-frontswap/include/xen/tmem.h 
linux-3.0-rc1-frontswap-selfballoon/include/xen/tmem.h
--- linux-3.0-rc1-frontswap/include/xen/tmem.h  1969-12-31 17:00:00.000000000 
-0700
+++ linux-3.0-rc1-frontswap-selfballoon/include/xen/tmem.h      2011-06-16 
10:15:44.514850956 -0600
@@ -0,0 +1,4 @@
+#ifndef _XEN_TMEM_H
+#define _XEN_TMEM_H
+extern int tmem_enabled;
+#endif /* _XEN_TMEM_H */

Attachment: frontswap-selfballoon-linux-3.0-rc1-110620.patch
Description: Binary data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.