
Re: [RFC PATCH] xen/memory: Introduce a hypercall to provide unallocated space





On 04/08/2021 21:56, Oleksandr wrote:

Hi Julien, Stefano.

Hi Oleksandr,


On 02.08.21 22:12, Oleksandr wrote:
I have done some experiments with Xen and toolstack according to the discussion above. So, I re-used DTB to pass a safe range to the domain. For the range I borrowed some space from the second RAM bank.

-#define GUEST_RAM1_BASE   xen_mk_ullong(0x0200000000) /* 1016GB of RAM @ 8GB */
-#define GUEST_RAM1_SIZE   xen_mk_ullong(0xfe00000000)
+#define GUEST_RAM1_BASE   xen_mk_ullong(0x0200000000) /* 888GB of RAM @ 8GB */
+#define GUEST_RAM1_SIZE   xen_mk_ullong(0xDE00000000)
+

I am a bit split on reducing the amount of RAM. On the one hand, large guests are not unheard of on the server side (at least in the x86 world). On the other hand, I am not aware of anyone using Xen on Arm in such a setup.

So technically this will be a regression, but it may be OK.

Regarding the range, this will be a problem as Xen configures the number of IPA bits based on the PA bits. The lowest possible address space size on 64-bit is 4GB.

From my understanding, this is because the number of IPA bits supported is constrained by the PA bits. So the position and the size of the region would need to depend on the P2M configuration.

For simplicity, this could be the last few X bytes of the supported address space.

For 32-bit domains, we also need to make sure the address is usable with short-descriptor page tables in the domain (not too long ago Debian was shipping the kernel with them rather than LPAE). I haven't yet checked what the limit is here.

+#define GUEST_SAFE_RANGE_BASE   xen_mk_ullong(0xDE00000000) /* 128GB */
+#define GUEST_SAFE_RANGE_SIZE   xen_mk_ullong(0x0200000000)

While the possible new DT bindings have not been agreed yet, I re-used the existing "reg" property under the hypervisor node to pass the safe range as a second region, https://elixir.bootlin.com/linux/v5.14-rc4/source/Documentation/devicetree/bindings/arm/xen.txt#L10:

So a single region works for a guest today, but for dom0 we will need multiple regions because it may be difficult to find enough contiguous space for a single region.

That said, as dom0 is mapped 1:1 (including some guest mapping), there is also the question of where to allocate the safe region. For the grant table, we so far re-use the Xen address space because it is assumed that space will always be bigger than the grant table.

I am not sure yet where we could allocate the safe regions. Stefano, do you have any ideas?




--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -735,9 +735,11 @@ static int make_hypervisor_node(libxl__gc *gc, void *fdt,
                                "xen,xen");
      if (res) return res;

-    /* reg 0 is grant table space */
+    /* reg 0 is grant table space, reg 1 is safe range */
     res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
-                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
+                            2,
+                            GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE,
+                            GUEST_SAFE_RANGE_BASE, GUEST_SAFE_RANGE_SIZE);
      if (res) return res;

      /*


/* Resulting hypervisor node */

  hypervisor {
                 interrupts = <0x01 0x0f 0xf08>;
                 interrupt-parent = <0xfde8>;
                 compatible = "xen,xen-4.16\0xen,xen";
                reg = <0x00 0x38000000 0x00 0x1000000 0xde 0x00 0x02 0x00>;
  };


I did nearly the same for Xen itself to insert a range for Dom0. The Linux side change is just to retrieve the range from the DTB instead of issuing a hypercall.

Sorry, I might be missing some important bits here, but from what I wrote about the purpose of "reg", it seems it could be suitable for us, so why not use it? Why do we need yet another binding? I noticed that Linux on Arm doesn't use it at all; it is probably used by other OSes, I don't know.

Linux used the range until 4.7. This was dropped by commit 3cf4095d7446efde28b48c26050b9db6f0bcb004 so the same code can be used by ACPI and DT. However, looking at this now, I think this was a bad decision because it means we are shattering superpages.

So ideally we should switch back the region to use the safe address space once this is in place.


Now, I am wondering, would it be possible to update/clarify the current purpose of "reg" and use it to pass a safe unallocated space for any Xen specific mappings (grant, foreign, whatever) instead of just for the grant table region? If that is not allowed for any reason (compatibility PoV, etc.), would it be possible to extend the property by passing an extra range separately, similar to what I described above?

I think it should be fine to re-use the same region so long as the size of the first bank is at least the size of the original region.

I also think we should be able to add extra regions as OSes are unlikely to enforce that the "reg" contains a single region.

That said, we need to be careful about new guests as the region may be quite small on older Xen. So we would need some heuristic to decide whether to steal some RAM or use the safe space.

Another possibility would be to add a new compatible in the DT that indicates the region is "big" enough.

Cheers,

--
Julien Grall



 

