[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 4/5] Xen/MCE: Abort live migration when vMCE occur



On Tue, 2012-09-18 at 14:16 +0100, Liu, Jinsong wrote:
> Xen/MCE: Abort live migration when vMCE occur
> 
> This patch monitor the critical area of live migration (from vMCE point of 
> view,
> the copypages stage of migration is the critical area while other areas are 
> not).
> 
> If a vMCE occur at the critical area of live migration, abort and try 
> migration later.

Can you elaborate a little on why it is necessary to abort and try
again?

> Signed-off-by: Liu, Jinsong <jinsong.liu@xxxxxxxxx>
> 
> diff -r f843ac6f93c9 tools/libxc/xc_domain.c
> --- a/tools/libxc/xc_domain.c Wed Sep 19 01:21:18 2012 +0800
> +++ b/tools/libxc/xc_domain.c Wed Sep 19 03:31:30 2012 +0800
> @@ -283,6 +283,37 @@
>      return ret;
>  }
>  
> +/* Start vmce monitor */
> +int xc_domain_vmce_monitor_strat(xc_interface *xch,

strat?

> +                                 uint32_t domid)
> +{
> +    int ret;
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_vmce_monitor_start;
> +    domctl.domain = (domid_t)domid;
> +    ret = do_domctl(xch, &domctl);
> +
> +    return ret ? -1 : 0;
> +}
> +
> +/* End vmce monitor */
> +int xc_domain_vmce_monitor_end(xc_interface *xch,
> +                               uint32_t domid,
> +                               signed char *vmce_while_monitor)
> +{
> +    int ret;
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_vmce_monitor_end;
> +    domctl.domain = (domid_t)domid;
> +    ret = do_domctl(xch, &domctl);
> +    if ( !ret )
> +        *vmce_while_monitor = domctl.u.vmce_monitor.vmce_while_monitor;

Any reason this is a char rather than an int?

> +    return ret ? -1 : 0;
> +}
> +
>  /* get info from hvm guest for save */
>  int xc_domain_hvm_getcontext(xc_interface *xch,
>                               uint32_t domid,
> [...]
> diff -r f843ac6f93c9 tools/libxc/xenctrl.h
> --- a/tools/libxc/xenctrl.h   Wed Sep 19 01:21:18 2012 +0800
> +++ b/tools/libxc/xenctrl.h   Wed Sep 19 03:31:30 2012 +0800
> @@ -571,6 +571,26 @@
>                            xc_domaininfo_t *info);
>  
>  /**
> + * This function start monitor vmce event.
> + * @parm xch a handle to an open hypervisor interface
> + * @parm domid the domain id monitored
> + * @return 0 on success, -1 on failure
> + */
> +int xc_domain_vmce_monitor_strat(xc_interface *xch,
> +                                 uint32_t domid);
> +
> +/**
> + * This function end monitor vmce event
> + * @parm xch a handle to an open hypervisor interface
> + * @parm domid the domain id monitored
> + * @parm vmce_while_migrate a pointer return whether vMCE occur when migrate 

This function isn't actually specific to migration (even if that happens
to be the only user currently), it just tracks whether a vMCE occurs
while monitoring was in progress AFAICT.

> + * @return 0 on success, -1 on failure
> + */
> +int xc_domain_vmce_monitor_end(xc_interface *xch,
> +                               uint32_t domid,
> +                               signed char *vmce_while_monitor);
> +
> +/**
>   * This function returns information about the context of a hvm domain
>   * @parm xch a handle to an open hypervisor interface
>   * @parm domid the domain to get information from
> diff -r f843ac6f93c9 xen/include/asm-x86/domain.h
> --- a/xen/include/asm-x86/domain.h    Wed Sep 19 01:21:18 2012 +0800
> +++ b/xen/include/asm-x86/domain.h    Wed Sep 19 03:31:30 2012 +0800
> @@ -279,6 +279,11 @@
>      bool_t has_32bit_shinfo;
>      /* Domain cannot handle spurious page faults? */
>      bool_t suppress_spurious_page_faults;
> +    /* Monitoring guest memory copy of migration
> +     * = 0 - not monitoring
> +     * > 0 - monitoring
> +     * < 0 - vMCE occurred while monitoring */
> +    s8 vmce_monitor;
>  
>      /* Continuable domain_relinquish_resources(). */
>      enum {
> diff -r f843ac6f93c9 xen/include/public/domctl.h
> --- a/xen/include/public/domctl.h     Wed Sep 19 01:21:18 2012 +0800
> +++ b/xen/include/public/domctl.h     Wed Sep 19 03:31:30 2012 +0800
> @@ -828,6 +828,12 @@
>  typedef struct xen_domctl_set_access_required 
> xen_domctl_set_access_required_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_access_required_t);
>  
> +struct xen_domctl_vmce_monitor {
> +    signed char vmce_while_monitor;

You leak the semantics of the internal flag into this variable which
makes it rather clumsy to use (e.g. you have to check for <0). This
should just be a bool I think.

Calling vmce_monitor_end without a preceding monitor start should be an
error (-EINVAL?) and this value would be undefined in that case.

Do you actually need struct xen_domctl_vmce_monitor could the flag not
be part of the return value of XEN_DOMCTL_vmce_monitor_end? e.g. -ERRNO
on error, 0 if no vmce, 1 if vmce occurred?

Also calling vmce_monitor_start while monitoring is already in progress
should result in -EBUSY, otherwise multiple agents who try to monitor
will get unexpected/inconsistent results.

> +};
> +typedef struct xen_domctl_vmce_monitor xen_domctl_vmce_monitor_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_vmce_monitor_t);
> +
>  struct xen_domctl {
>      uint32_t cmd;
>  #define XEN_DOMCTL_createdomain                   1
> @@ -893,6 +899,8 @@
>  #define XEN_DOMCTL_set_access_required           64
>  #define XEN_DOMCTL_audit_p2m                     65
>  #define XEN_DOMCTL_set_virq_handler              66
> +#define XEN_DOMCTL_vmce_monitor_start            67
> +#define XEN_DOMCTL_vmce_monitor_end              68
>  #define XEN_DOMCTL_gdbsx_guestmemio            1000
>  #define XEN_DOMCTL_gdbsx_pausevcpu             1001
>  #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
> @@ -947,6 +955,7 @@
>          struct xen_domctl_set_access_required access_required;
>          struct xen_domctl_audit_p2m         audit_p2m;
>          struct xen_domctl_set_virq_handler  set_virq_handler;
> +        struct xen_domctl_vmce_monitor      vmce_monitor;
>          struct xen_domctl_gdbsx_memio       gdbsx_guest_memio;
>          struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu;
>          struct xen_domctl_gdbsx_domstatus   gdbsx_domstatus;



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.