[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH V2 3/3] libxl/arm: Add handling of extended regions for DomU




On 17.09.21 01:35, Stefano Stabellini wrote:

Hi Stefano

On Fri, 10 Sep 2021, Oleksandr Tyshchenko wrote:
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>

The extended region (safe range) is a region of guest physical
address space which is unused and could be safely used to create
grant/foreign mappings instead of wasting real RAM pages from
the domain memory for establishing these mappings.

The extended regions are chosen at the domain creation time and
advertised to it via "reg" property under hypervisor node in
the guest device-tree. As region 0 is reserved for grant table
space (always present), the indexes for extended regions are 1...N.
If extended regions could not be allocated for some reason,
Xen doesn't fail and behaves as usual, so only inserts region 0.

Please note the following limitations:
- The extended region feature is only supported for 64-bit domain.
- The ACPI case is not covered.

***

The algorithm to choose extended regions for non-direct mapped
DomU is simpler in comparison with the algorithm for direct mapped
Dom0. As we have a lot of unused space above 4GB, provide single
1GB-aligned region from the second RAM bank taking into the account
the maximum supported guest address space size and the amount of
memory assigned to the guest. The maximum size of the region is 128GB.

Suggested-by: Julien Grall <jgrall@xxxxxxxxxx>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
---
Changes since RFC:
    - update patch description
    - drop uneeded "extended-region" DT property
    - clear reg array in finalise_ext_region() and add a TODO
---
  tools/libs/light/libxl_arm.c | 89 +++++++++++++++++++++++++++++++++++++++++++-
  1 file changed, 87 insertions(+), 2 deletions(-)

diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index e3140a6..8c1d9d7 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc *gc, void *fdt,
                                "xen,xen");
      if (res) return res;
- /* reg 0 is grant table space */
+    /*
+     * reg 0 is a placeholder for grant table space, reg 1 is a placeholder
+     * for the extended region.
+     */
      res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
GUEST_ROOT_SIZE_CELLS,
-                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
+                            2, 0, 0, 0, 0);
      if (res) return res;
/*
@@ -1069,6 +1072,86 @@ static void finalise_one_node(libxl__gc *gc, void *fdt, 
const char *uname,
      }
  }
+#define ALIGN_UP_TO_GB(x) (((x) + GB(1) - 1) & (~(GB(1) - 1)))
Why do we need to align it to 1GB when for Dom0 is aligned to 2MB? I
think it makes sense to have the same alignment requirement.
Here, unlike with Dom0, we can provide indeed a big region (the maximum size is 128GB), so I decided to use maximum block size for the alignment.
ok, I will use 2MB alignment to be consistent will Dom0.




+
+#define EXT_REGION_SIZE   GB(128)
The region needs to be described in xen/include/public/arch-arm.h like
GUEST_RAM1_BASE/SIZE.

ok, will do




+static void finalise_ext_region(libxl__gc *gc, struct xc_dom_image *dom)
+{
+    void *fdt = dom->devicetree_blob;
+    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) * 2];
+    be32 *cells = &regs[0];
+    uint64_t region_size = 0, region_base, bank1end_align, bank1end_max;
+    uint32_t gpaddr_bits;
+    libxl_physinfo info;
+    int offset, rc;
+
+    offset = fdt_path_offset(fdt, "/hypervisor");
+    assert(offset > 0);
+
+    if (strcmp(dom->guest_type, "xen-3.0-aarch64")) {
+        LOG(WARN, "The extended region is only supported for 64-bit guest");
+        goto out;
+    }
+
+    rc = libxl_get_physinfo(CTX, &info);
+    assert(!rc);
+
+    gpaddr_bits = info.gpaddr_bits;
+    assert(gpaddr_bits >= 32 && gpaddr_bits <= 48);
+
+    /*
+     * Try to allocate single 1GB-aligned extended region from the second RAM
+     * bank (above 4GB) taking into the account the maximum supported guest
+     * address space size and the amount of memory assigned to the guest.
+     * The maximum size of the region is 128GB.
+     */
+    bank1end_max = min(1ULL << gpaddr_bits, GUEST_RAM1_BASE + GUEST_RAM1_SIZE);
+    bank1end_align = GUEST_RAM1_BASE +
+        ALIGN_UP_TO_GB((uint64_t)dom->rambank_size[1] << XC_PAGE_SHIFT);
+
+    if (bank1end_max <= bank1end_align) {
+        LOG(WARN, "The extended region cannot be allocated, not enough space");
+        goto out;
+    }
+
+    if (bank1end_max - bank1end_align > EXT_REGION_SIZE) {
+        region_base = bank1end_max - EXT_REGION_SIZE;
+        region_size = EXT_REGION_SIZE;
+    } else {
+        region_base = bank1end_align;
+        region_size = bank1end_max - bank1end_align;
+    }
+
+out:
+    /*
+     * The first region for grant table space must be always present.
+     * If we managed to allocate the extended region then insert it as
+     * a second region.
+     * TODO If we failed to allocate the region, we end up inserting
+     * zero-sized region. This is because we don't know in advance when
+     * creating hypervisor node whether it would be possible to allocate
+     * a region, but we have to create a placeholder anyway. The Linux driver
+     * is able to deal with by checking the region size. We cannot choose
+     * a region when creating hypervisor node because the guest memory layout
+     * is not know at that moment (and dom->rambank_size[1] is empty).
+     * We need to find a way not to expose invalid regions.
+     */
This is not great -- it would be barely spec compliant.

Absolutely agree.


When make_hypervisor_node is called we know the max memory of the guest
as build_info.max_memkb should be populate, right?

Right. Just a small change to pass build_info to make_hypervisor_node() is needed.



If so, we could at least detect whether we can have an extended region
(if not caculate the exact start address) from make_hypervisor_node.

total_guest_memory = build_info.max_memkb * 1024;
rambank1_approx = total_guest_memory - GUEST_RAM0_SIZE;
extended_region_size = GUEST_RAM1_SIZE - rambank1_approx;

if (extended_region_size >= MIN_EXT_REGION_SIZE)
     allocate_ext_region
Good point! I will recheck that. I would prefer avoid spreading extended region handling (introduce finalise_ext_region())
and do everything from the make_hypervisor_node().




+    memset(regs, 0, sizeof(regs));
+    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
+              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
+    if (region_size > 0) {
I think we want to check against a minimum amount rather than 0. Maybe 64MB?

Sounds reasonable, will update.




+        LOG(DEBUG, "Extended region: %#"PRIx64"->%#"PRIx64"\n",
+            region_base, region_base + region_size);
+
+        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
+                  region_base, region_size);
+    }
+
+    rc = fdt_setprop_inplace(fdt, offset, "reg", regs, sizeof(regs));
+    assert(!rc);
+}
+
  int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
                                                 uint32_t domid,
                                                 libxl_domain_config *d_config,
@@ -1109,6 +1192,8 @@ int libxl__arch_domain_finalise_hw_description(libxl__gc 
*gc,
} + finalise_ext_region(gc, dom);
+
      for (i = 0; i < GUEST_RAM_BANKS; i++) {
          const uint64_t size = (uint64_t)dom->rambank_size[i] << XC_PAGE_SHIFT;
--
2.7.4


Thank you.


--
Regards,

Oleksandr Tyshchenko




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.