[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC][v2][PATCH 01/14] tools: introduce some new parameters to set rdm policy



On Fri, May 22, 2015 at 05:35:01PM +0800, Tiejun Chen wrote:
> This patch introduces user configurable parameters to specify RDM
> resource and according policies,
> 
> Global RDM parameter:
>     rdm = [ 'type=none/host, reserve=strict/relaxed' ]
> Per-device RDM parameter:
>     pci = [ 'sbdf, rdm_reserve=strict/relaxed' ]
> 
> Global RDM parameter, "type", allows user to specify reserved regions
> explicitly, e.g. using 'host' to include all reserved regions reported
> on this platform which is good to handle hotplug scenario. In the future
> this parameter may be further extended to allow specifying random regions,
> e.g. even those belonging to another platform as a preparation for live
> migration with passthrough devices. Instead, 'none' means we have nothing
> to do all reserved regions and ignore all policies, so guest work as before.
> 
> 'strict/relaxed' policy decides how to handle conflict when reserving RDM
> regions in pfn space. If conflict exists, 'strict' means an immediate error
> so VM will be killed, while 'relaxed' allows moving forward with a warning
> message thrown out.
> 
> Default per-device RDM policy is 'strict', while default global RDM policy
> is 'relaxed'. When both policies are specified on a given region, 'strict' is
> always preferred.
> 
> Signed-off-by: Tiejun Chen <tiejun.chen@xxxxxxxxx>
> ---
>  docs/man/xl.cfg.pod.5        | 57 +++++++++++++++++++++++++++
>  docs/misc/vtd.txt            | 24 ++++++++++++
>  tools/libxl/libxl_create.c   | 13 +++++++
>  tools/libxl/libxl_internal.h |  2 +
>  tools/libxl/libxl_pci.c      |  2 +
>  tools/libxl/libxl_types.idl  | 18 +++++++++
>  tools/libxl/libxlu_pci.c     | 92 
> ++++++++++++++++++++++++++++++++++++++++++++
>  tools/libxl/libxlutil.h      |  4 ++
>  tools/libxl/xl_cmdimpl.c     | 10 +++++
>  9 files changed, 222 insertions(+)
> 
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index 8e4154f..12c34c4 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -645,6 +645,49 @@ assigned slave device.
>  
>  =back
>  
> +=item B<rdm= "RDM_RESERVE_STRING" >

Stray space after before and after "RDM_RESERVE_STRING". 

> +
> +(HVM/x86 only) Specifies the information about Reserved Device Memory (RDM),
> +which is necessary to enable robust device passthrough usage. One example of

Delete "usage".

> +RDM is reported through ACPI Reserved Memory Region Reporting (RMRR)
> +structure on x86 platform.
> +
> +B<RDM_RESERVE_STRING> has the form C<[KEY=VALUE,KEY=VALUE,...> where:
> +
> +=over 4
> +
> +=item B<KEY=VALUE>
> +
> +Possible B<KEY>s are:
> +
> +=over 4
> +
> +=item B<type="STRING">
> +
> +Currently we just have two types:
> +
> +"host" means all reserved device memory on this platform should be reserved
> +in this VM's pfn space. This global RDM parameter allows user to specify

PFN is Xen internal terminology. Do you mean "guest address space"? Note
that the reader is system administrators who might not know / want to
know Xen internals.

> +reserved regions explicitly. And using "host" to include all reserved regions
> +reported on this platform which is good to handle hotplug scenario. In the
> +future this parameter may be further extended to allow specifying random
> +regions, e.g. even those belonging to another platform as a preparation

Extending how? What's your envisaged syntax for those random regions?
Should you want to reserve more, an array is more useful. Could you
provide some examples?

> +for live migration with passthrough devices.
> +
> +"none" means we have nothing to do all reserved regions and ignore all 
> policies,
> +so guest work as before.
> +
> +=over 4
> +
> +=item B<reserve="STRING">
> +
> +Conflict may be detected when reserving reserved device memory in gfn space.

GFN is a Xen internal terminology. Maybe you should use "guest address
space"?

Nonetheless the terminology throughout this document should be
consistent.

> +"strict" means an unsolved conflict leads to immediate VM crash, while
> +"relaxed" allows VM moving forward with a warning message thrown out. 
> "relaxed"
> +is default.
> +
> +Note this may be overrided by another sub item, rdm_reserve, in pci device.
> +

"overridden by rdm_reserve option in PCI device configuration".

>  =item B<pci=[ "PCI_SPEC_STRING", "PCI_SPEC_STRING", ... ]>
>  
>  Specifies the host PCI devices to passthrough to this guest. Each 
> B<PCI_SPEC_STRING>
> @@ -707,6 +750,20 @@ dom0 without confirmation.  Please use with care.
>  D0-D3hot power management states for the PCI device. False (0) by
>  default.
>  
> +=item B<rdm_reserv="STRING">
> +
> +(HVM/x86 only) Specifies the information about Reserved Device Memory (RDM),
> +which is necessary to enable robust device passthrough usage. One example of

Delete "usage".

> +RDM is reported through ACPI Reserved Memory Region Reporting (RMRR)
> +structure on x86 platform.
> +
> +Conflict may be detected when reserving reserved device memory in gfn space.
> +"strict" means an unsolved conflict leads to immediate VM crash, while
> +"relaxed" allows VM moving forward with a warning message thrown out. 
> "strict"
> +is default.
> +

Actually these two paragraphs are the same as before. You can just point
readers to previous sections instead of copying them here.

> +Note this would override global B<rdm> option.
> +
>  =back
>  
>  =back
> diff --git a/docs/misc/vtd.txt b/docs/misc/vtd.txt
> index 9af0e99..7d63c47 100644
> --- a/docs/misc/vtd.txt
> +++ b/docs/misc/vtd.txt
> @@ -111,6 +111,30 @@ in the config file:
>  To override for a specific device:
>       pci = [ '01:00.0,msitranslate=0', '03:00.0' ]
>  
> +RDM, 'reserved device memory', for PCI Device Passthrough
> +---------------------------------------------------------
> +
> +There are some devices the BIOS controls, for e.g. USB devices to perform
> +PS2 emulation. The regions of memory used for these devices are marked
> +reserved in the e820 map. When we turn on DMA translation, DMA to those
> +regions will fail. Hence BIOS uses RMRR to specify these regions along with
> +devices that need to access these regions. OS is expected to setup
> +identity mappings for these regions for these devices to access these 
> regions.
> +
> +While creating a VM we should reserve them in advance, and avoid any 
> conflicts.
> +So we introduce user configurable parameters to specify RDM resource and
> +according policies,
> +
> +To enable this globally, add "rdm" in the config file:
> +
> +    rdm = "type=host, reserve=relaxed"   (default policy is "relaxed")
> +
> +Or just for a specific device:
> +
> +    pci = [ '01:00.0,rdm_reserve=relaxed', '03:00.0,rdm_reserve=strict' ]
> +
> +For all the options available to RDM, see xl.cfg(5).
> +
>  
>  Caveat on Conventional PCI Device Passthrough
>  ---------------------------------------------
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index f0da7dc..d649ead 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -100,6 +100,12 @@ static int sched_params_valid(libxl__gc *gc,
>      return 1;
>  }
>  
> +void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
> +{
> +    b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_NONE;
> +    b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_RELAXED;

No, not like this. You set everything back to none and relaxed even if
it is set before this point.

It should be
    if (xxx == DEFAULT_SENTINEL_VALUE)
        xxx = THE_DEFAULT_YOU_WANT;

Have a look at libxl__device_nic_setdefault etc to get an idea
how it works. Don't hesitate to ask if I'm not clear enough.

> +}
> +
>  int libxl__domain_build_info_setdefault(libxl__gc *gc,
>                                          libxl_domain_build_info *b_info)
>  {
> @@ -410,6 +416,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>                     libxl_domain_type_to_string(b_info->type));
>          return ERROR_INVAL;
>      }
> +
> +    libxl__rdm_setdefault(gc, b_info);
>      return 0;
>  }
>  
> @@ -1439,6 +1447,11 @@ static void domcreate_attach_pci(libxl__egc *egc, 
> libxl__multidev *multidev,
>      }
>  
>      for (i = 0; i < d_config->num_pcidevs; i++) {
> +        /*
> +         * If the rdm global policy is 'force' we should override each 
> device.
> +         */

"strict" not "force"

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.