
Re: [Xen-devel] [PATCH RESEND v5 6/6] xen/arm: Implement toolstack for xl restore/save and migrate



On Fri, 2013-11-08 at 16:50 +0900, Jaeyong Yoo wrote:
> From: Alexey Sokolov <sokolov.a@xxxxxxxxxxx>
> 
> Implement for xl restore/save (which are also used for migrate) operation in 
> xc_arm_migrate.c and make it compilable.
> The overall process of save is the following:
> 1) save guest parameters (i.e., memory map, console and store pfn, etc)
> 2) save memory (if it is live, perform dirty-page tracing)
> 3) save hvm states (i.e., gic, timer, vcpu etc)
> 
> Signed-off-by: Alexey Sokolov <sokolov.a@xxxxxxxxxxx>
> ---
>  config/arm32.mk              |   1 +
>  tools/libxc/Makefile         |   6 +-
>  tools/libxc/xc_arm_migrate.c | 712 +++++++++++++++++++++++++++++++++++++++++++
>  tools/libxc/xc_dom_arm.c     |   4 +-
>  tools/misc/Makefile          |   4 +-
>  5 files changed, 723 insertions(+), 4 deletions(-)
>  create mode 100644 tools/libxc/xc_arm_migrate.c
> 
> diff --git a/config/arm32.mk b/config/arm32.mk
> index aa79d22..01374c9 100644
> --- a/config/arm32.mk
> +++ b/config/arm32.mk
> @@ -1,6 +1,7 @@
>  CONFIG_ARM := y
>  CONFIG_ARM_32 := y
>  CONFIG_ARM_$(XEN_OS) := y
> +CONFIG_MIGRATE := y
>  
>  CONFIG_XEN_INSTALL_SUFFIX :=
>  
> diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
> index 4c64c15..05dfef4 100644
> --- a/tools/libxc/Makefile
> +++ b/tools/libxc/Makefile
> @@ -42,8 +42,13 @@ CTRL_SRCS-$(CONFIG_MiniOS) += xc_minios.c
>  GUEST_SRCS-y :=
>  GUEST_SRCS-y += xg_private.c xc_suspend.c
>  ifeq ($(CONFIG_MIGRATE),y)
> +ifeq ($(CONFIG_X86),y)
>  GUEST_SRCS-y += xc_domain_restore.c xc_domain_save.c
>  GUEST_SRCS-y += xc_offline_page.c xc_compression.c
> +endif
> +ifeq ($(CONFIG_ARM),y)
> +GUEST_SRCS-y += xc_arm_migrate.c

I know you are just following the example above but I think this can be
GUEST_SRCS-$(CONFIG_ARM) += xc_arm...

> +endif
>  else
>  GUEST_SRCS-y += xc_nomigrate.c
>  endif
> @@ -63,7 +68,6 @@ $(patsubst %.c,%.opic,$(ELF_SRCS-y)): CFLAGS += -Wno-pointer-sign
>  GUEST_SRCS-y                 += xc_dom_core.c xc_dom_boot.c
>  GUEST_SRCS-y                 += xc_dom_elfloader.c
>  GUEST_SRCS-$(CONFIG_X86)     += xc_dom_bzimageloader.c
> -GUEST_SRCS-$(CONFIG_X86)     += xc_dom_decompress_lz4.c

I don't think this was intentional, was it?

>  GUEST_SRCS-$(CONFIG_ARM)     += xc_dom_armzimageloader.c
>  GUEST_SRCS-y                 += xc_dom_binloader.c
>  GUEST_SRCS-y                 += xc_dom_compat_linux.c
> diff --git a/tools/libxc/xc_arm_migrate.c b/tools/libxc/xc_arm_migrate.c
> new file mode 100644
> index 0000000..461e339
> --- /dev/null
> +++ b/tools/libxc/xc_arm_migrate.c
> @@ -0,0 +1,712 @@

Is this implementing the exact protocol as described in
tools/libxc/xg_save_restore.h or is it a variant? Are there any docs of
the specifics of the ARM protocol?

We will eventually need to make a statement about the stability of the
protocol, i.e. on x86 we support X->X+1 migrations across Xen versions. I
think we'd need to make similar guarantees on ARM before we would remove
the "tech preview" label from the migration feature.

So the docs are useful so we can review the intended protocol for
forward compatibility problems etc. We needn't necessarily implement the
x86 one from xg_save_restore.h.

In particular, it would be nice if the protocol and each of the "chunks"
in it were explicitly versioned. For example, the code assumes that
the HVM context implicitly follows the last iteration -- this caused
untold pain on x86 when Remus was added...
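
To illustrate what explicit versioning could look like, here is a minimal
sketch. Every name in it (chunk_hdr, the CHUNK_* types, the 12-byte
little-endian layout) is an assumption made up for illustration; it is not
the format in xg_save_restore.h nor anything libxc provides today. The point
is only that each chunk announces its type, version and length, so the
receiver never has to infer what comes next from its position in the stream.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical chunk types; not part of any existing Xen stream format. */
enum chunk_type {
    CHUNK_GUEST_PARAMS = 1,
    CHUNK_MEMORY_ITER  = 2,
    CHUNK_HVM_CONTEXT  = 3,
    CHUNK_END          = 4,
};

#define CHUNK_HDR_SIZE 12 /* three 32-bit little-endian fields */

struct chunk_hdr {
    uint32_t type;    /* one of enum chunk_type */
    uint32_t version; /* version of this chunk's payload layout, >= 1 */
    uint32_t length;  /* payload bytes following the header */
};

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = v & 0xff;
    p[1] = (v >> 8) & 0xff;
    p[2] = (v >> 16) & 0xff;
    p[3] = (v >> 24) & 0xff;
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Serialise a header into a fixed-size, endian-independent buffer. */
static void chunk_hdr_encode(const struct chunk_hdr *h,
                             uint8_t out[CHUNK_HDR_SIZE])
{
    assert(h && out);
    put_le32(out + 0, h->type);
    put_le32(out + 4, h->version);
    put_le32(out + 8, h->length);
}

/*
 * Parse a header. Returns 0 on success, -1 if the type is unknown or the
 * version is zero or newer than the receiver supports (max_version).
 */
static int chunk_hdr_decode(const uint8_t in[CHUNK_HDR_SIZE],
                            uint32_t max_version, struct chunk_hdr *h)
{
    assert(in && h);
    h->type    = get_le32(in + 0);
    h->version = get_le32(in + 4);
    h->length  = get_le32(in + 8);

    if ( h->type < CHUNK_GUEST_PARAMS || h->type > CHUNK_END )
        return -1;
    if ( h->version == 0 || h->version > max_version )
        return -1;
    return 0;
}
```

With something along these lines the restore side can reject a stream it
does not understand up front, and an HVM context chunk is recognised by its
header rather than by following the last memory iteration.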

> +/******************************************************************************
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation;
> + * version 2.1 of the License.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 USA
> + *
> + * Copyright (c) 2013, Samsung Electronics
> + */
> +
> +#include <inttypes.h>
> +#include <errno.h>
> +#include <xenctrl.h>
> +#include <xenguest.h>
> +
> +#include <unistd.h>
> +#include <xc_private.h>
> +#include <xc_dom.h>
> +#include "xc_bitops.h"
> +#include "xg_private.h"
> +
> +/* Guest RAM base */
> +#define GUEST_RAM_BASE 0x80000000
> +/*
> + *  XXX: Use correct definition for RAM base when the following patch
> + *  xen: arm: 64-bit guest support and domU FDT autogeneration
> + *  will be upstreamed.
> + */
> +
> +#define DEF_MAX_ITERS          29 /* limit us to 30 times round loop   */
> +#define DEF_MAX_FACTOR         3  /* never send more than 3x p2m_size  */
> +#define DEF_MIN_DIRTY_PER_ITER 50 /* dirty page count to define last iter */
> +#define DEF_PROGRESS_RATE      50 /* progress bar update rate */
> +
> +/* Enable this macro for debug only: "static" migration instead of live */
> +/*
> +#define DISABLE_LIVE_MIGRATION
> +*/

I don't think this is needed; the caller can be hacked if necessary.

> +
> +/* Enable this macro for debug only: additional debug info */
> +/*
> +#define ARM_MIGRATE_VERBOSE
> +*/

Likewise.

> +/* ============ Memory ============= */
> +static int save_memory(xc_interface *xch, int io_fd, uint32_t dom,
> +                       struct save_callbacks *callbacks,
> +                       uint32_t max_iters, uint32_t max_factor,
> +                       guest_params_t *params)
> +{
> +    int live =  !!(params->flags & XCFLAGS_LIVE);
> +    int debug =  !!(params->flags & XCFLAGS_DEBUG);
> +    xen_pfn_t i;
> +    char reportbuf[80];
> +    int iter = 0;
> +    int last_iter = !live;
> +    int total_dirty_pages_num = 0;
> +    int dirty_pages_on_prev_iter_num = 0;
> +    int count = 0;
> +    char *page = 0;
> +    xen_pfn_t *busy_pages = 0;
> +    int busy_pages_count = 0;
> +    int busy_pages_max = 256;
> +
> +    DECLARE_HYPERCALL_BUFFER(unsigned long, to_send);
> +
> +    xen_pfn_t start = params->start_gpfn;
> +    const xen_pfn_t end = params->max_gpfn;
> +    const xen_pfn_t mem_size = end - start;
> +
> +    if ( debug )
> +    {
> +        IPRINTF("(save mem) start=%llx end=%llx!\n", start, end);
> +    }

FYI, you don't need the {}s for cases like this.

Isn't if ( debug ) IPRINTF(...) the equivalent of DPRINTF?

> +
> +    if ( live )
> +    {
> +        if ( xc_shadow_control(xch, dom, XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY,
> +                    NULL, 0, NULL, 0, NULL) < 0 )
> +        {
> +            ERROR("Couldn't enable log-dirty mode !\n");
> +            return -1;
> +        }
> +
> +        max_iters  = max_iters  ? : DEF_MAX_ITERS;
> +        max_factor = max_factor ? : DEF_MAX_FACTOR;
> +
> +        if ( debug )
> +            IPRINTF("Log-dirty mode enabled, max_iters=%d, max_factor=%d!\n",
> +                    max_iters, max_factor);
> +    }
> +
> +    to_send = xc_hypercall_buffer_alloc_pages(xch, to_send,
> +                                              NRPAGES(bitmap_size(mem_size)));
> +    if ( !to_send )
> +    {
> +        ERROR("Couldn't allocate to_send array!\n");
> +        return -1;
> +    }
> +
> +    /* send all pages on first iter */
> +    memset(to_send, 0xff, bitmap_size(mem_size));
> +
> +    for ( ; ; )
> +    {
> +        int dirty_pages_on_current_iter_num = 0;
> +        int frc;
> +        iter++;
> +
> +        snprintf(reportbuf, sizeof(reportbuf),
> +                 "Saving memory: iter %d (last sent %u)",
> +                 iter, dirty_pages_on_prev_iter_num);
> +
> +        xc_report_progress_start(xch, reportbuf, mem_size);
> +
> +        if ( (iter > 1 &&
> +              dirty_pages_on_prev_iter_num < DEF_MIN_DIRTY_PER_ITER) ||
> +             (iter == max_iters) ||
> +             (total_dirty_pages_num >= mem_size*max_factor) )
> +        {
> +            if ( debug )
> +                IPRINTF("Last iteration");
> +            last_iter = 1;
> +        }
> +
> +        if ( last_iter )
> +        {
> +            if ( suspend_and_state(callbacks->suspend, callbacks->data,
> +                                   xch, dom) )
> +            {
> +                ERROR("Domain appears not to have suspended");
> +                return -1;
> +            }
> +        }
> +        if ( live && iter > 1 )
> +        {
> +            frc = xc_shadow_control(xch, dom, XEN_DOMCTL_SHADOW_OP_CLEAN,
> +                                    HYPERCALL_BUFFER(to_send), mem_size,
> +                                                     NULL, 0, NULL);
> +            if ( frc != mem_size )
> +            {
> +                ERROR("Error peeking shadow bitmap");
> +                xc_hypercall_buffer_free_pages(xch, to_send,
> +                                               NRPAGES(bitmap_size(mem_size)));
> +                return -1;
> +            }
> +        }
> +
> +        busy_pages = malloc(sizeof(xen_pfn_t) * busy_pages_max);
> +
> +        for ( i = start; i < end; ++i )
> +        {
> +            if ( test_bit(i - start, to_send) )
> +            {
> +                page = xc_map_foreign_range(xch, dom, PAGE_SIZE, PROT_READ, i);

On x86 we try to do this in batches to reduce the overheads. I suppose
that could be a future enhancement.
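
A rough sketch of the batching idea, for illustration only: walk the to_send
bitmap and gather the dirty pfns into an array, so that one mapping call
covers many pages instead of one call per page. The helper names
(bit_is_set, next_batch) and the use of plain uint64_t for pfns are
assumptions of this sketch, not libxc code; the real thing would presumably
hand each batch to a bulk mapping call such as xc_map_foreign_bulk(), which
I have not checked on ARM.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Test bit nr in a bitmap stored as an array of unsigned long. */
static int bit_is_set(const unsigned long *bitmap, uint64_t nr)
{
    return (int)((bitmap[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL);
}

/*
 * Collect up to batch_max dirty pfns, scanning bits [*cursor, nr_bits).
 * Each set bit b is stored as start_pfn + b. *cursor is advanced past the
 * bits examined, so repeated calls resume where the previous one stopped.
 * Returns the number of pfns stored; 0 means the bitmap is exhausted.
 */
static size_t next_batch(const unsigned long *bitmap, uint64_t nr_bits,
                         uint64_t start_pfn, uint64_t *cursor,
                         uint64_t *pfns, size_t batch_max)
{
    size_t n = 0;

    assert(bitmap && cursor && pfns);

    while ( *cursor < nr_bits && n < batch_max )
    {
        if ( bit_is_set(bitmap, *cursor) )
            pfns[n++] = start_pfn + *cursor;
        (*cursor)++;
    }
    return n;
}
```

The save loop would then call next_batch() until it returns 0, mapping and
writing out each batch in one go.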

> 
> +                if ( !page )
> +                {
> +                    /* This page is mapped elsewhere, should be resent later */

What does this ("busy") mean? When does this happen?

[...]
> +
> +static int restore_guest_params(xc_interface *xch, int io_fd,
> +                                uint32_t dom, guest_params_t *params)
> +{
> [...]

> +    if ( xc_domain_setmaxmem(xch, dom, maxmemkb) )
> +    {
> +        ERROR("Can't set memory map");
> +        return -1;
> +    }
> +
> +    /* Set max. number of vcpus as max_vcpu_id + 1 */
> +    if ( xc_domain_max_vcpus(xch, dom, params->max_vcpu_id + 1) )

Does the higher-level toolstack not take care of vcpus and maxmem? I
thought so. I think this is how it should be.

> +    {
> +        ERROR("Can't set max vcpu number for domain");
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +[...]

> diff --git a/tools/misc/Makefile b/tools/misc/Makefile
> index 17aeda5..0824100 100644
> --- a/tools/misc/Makefile
> +++ b/tools/misc/Makefile
> @@ -11,7 +11,7 @@ HDRS     = $(wildcard *.h)
>  
>  TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof xenwatchdogd xencov
>  TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvmcrash xen-lowmemd xen-mfndump
> -TARGETS-$(CONFIG_MIGRATE) += xen-hptool
> +TARGETS-$(CONFIG_X86) += xen-hptool
>  TARGETS := $(TARGETS-y)
>  
>  SUBDIRS := $(SUBDIRS-y)
> @@ -23,7 +23,7 @@ INSTALL_BIN := $(INSTALL_BIN-y)
>  INSTALL_SBIN-y := xen-bugtool xen-python-path xenperf xenpm xen-tmem-list-parse gtraceview \
>       gtracestat xenlockprof xenwatchdogd xen-ringwatch xencov
>  INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvmcrash xen-lowmemd xen-mfndump
> -INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool
> +INSTALL_SBIN-$(CONFIG_X86) += xen-hptool
>  INSTALL_SBIN := $(INSTALL_SBIN-y)
>  
>  INSTALL_PRIVBIN-y := xenpvnetboot

You could resend these last two separately and they could probably go
straight in.

Ian.



