
Re: Limitations for Running Xen on KVM Arm64


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: "haseeb.ashraf@xxxxxxxxxxx" <haseeb.ashraf@xxxxxxxxxxx>
  • Date: Thu, 30 Oct 2025 13:41:53 +0000
  • Accept-language: en-US
  • Cc: "julien@xxxxxxx" <julien@xxxxxxx>, "Volodymyr_Babchuk@xxxxxxxx" <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Thu, 30 Oct 2025 13:42:07 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: Limitations for Running Xen on KVM Arm64

Adding @julien@xxxxxxx and replying to the questions he asked on #XenDevel:matrix.org.

> Can you add some details on why the implementation cannot be optimized in KVM? I am asking because I have never seen such an issue when running Xen on QEMU (without nested virt enabled).
AFAIK, when Xen runs on QEMU without virtualization, the instructions are emulated in QEMU, whereas with KVM the instructions should ideally run directly on hardware except in some special cases (those trapped by FGT/CGT). This is one such case: KVM maintains shadow page tables for each VM, so it traps these instructions and emulates them with a callback such as handle_vmalls12e1is(). As implemented, this callback has to iterate over the whole address space and clean up the page tables, which is a costly operation. Regardless, this should still be optimized in Xen, as invalidating a selective range would be much better than invalidating the whole 48-bit address space.
> Some details about your platform and use case would be helpful. I am interested to know whether you are using all the features for nested virt.
I am using AWS G4. My use case is to run Xen as a guest hypervisor. Yes, most of the features are enabled, except VHE and those that are disabled by KVM.

Regards,
Haseeb Ashraf

From: Ashraf, Haseeb (DI SW EDA HAV SLS EPS RTOS LIN)
Sent: Thursday, October 30, 2025 11:12 AM
To: xen-devel@xxxxxxxxxxxxxxxxxxxx <xen-devel@xxxxxxxxxxxxxxxxxxxx>
Subject: Limitations for Running Xen on KVM Arm64
 
Hello Xen development community,

I wanted to discuss the limitations that I have faced while running Xen on KVM on Arm64 machines. I hope I am using the right mailing list.

The biggest limitation is the costly emulation of the instruction tlbi vmalls12e1is in KVM. The cost grows exponentially with the number of IPA bits that KVM exposes to the VM hosting Xen. If I reduce the IPA size to 40 bits in KVM, the issue is barely observable, but with an IPA size of 48 bits the emulation is 256x more costly. Xen uses this instruction very frequently, and because KVM traps and emulates it, performance is much worse than on bare-metal hardware. With a 48-bit IPA, it can take up to 200 minutes to create a domU with just 128M of RAM. I have identified two places in Xen that are problematic w.r.t. the usage of this instruction, and I am hoping to either reduce how often it is issued or use a more targeted TLBI instruction instead of invalidating all stage-1 and stage-2 translations.
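The 256x figure follows from the difference in IPA bits. As a rough illustration, assuming the emulation cost is linear in the size of the IPA space that has to be walked and assuming a 4 KiB granule (both are assumptions made for this sketch; the helper names are hypothetical), the arithmetic can be written as:

```c
#include <stdint.h>

/*
 * Number of pages in an IPA space of `ipa_bits` bits, for a granule of
 * 2^page_shift bytes (page_shift = 12 assumes a 4 KiB granule).
 */
uint64_t ipa_pages(unsigned int ipa_bits, unsigned int page_shift)
{
    return UINT64_C(1) << (ipa_bits - page_shift);
}

/*
 * Relative cost of walking the whole IPA space for two IPA sizes,
 * ASSUMING the cost is linear in the size of the space.
 * Requires big_bits >= small_bits.
 */
uint64_t ipa_cost_ratio(unsigned int big_bits, unsigned int small_bits)
{
    return UINT64_C(1) << (big_bits - small_bits);
}
```

With these helpers, ipa_cost_ratio(48, 40) evaluates to 2^8 = 256, and a 48-bit IPA space covers 2^36 4 KiB pages compared with 2^28 for a 40-bit one.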

diff --git a/xen/arch/arm/mmu/p2m.c b/xen/arch/arm/mmu/p2m.c
index 7642dbc7c5..e96ff92314 100644
--- a/xen/arch/arm/mmu/p2m.c
+++ b/xen/arch/arm/mmu/p2m.c
@@ -1103,7 +1103,8 @@ static int __p2m_set_entry(struct p2m_domain *p2m,

    if ( removing_mapping )
        /* Flush can be deferred if the entry is removed */
-        p2m->need_flush |= !!lpae_is_valid(orig_pte);
+        //p2m->need_flush |= !!lpae_is_valid(orig_pte);
+        p2m->need_flush |= false;
    else
    {
        lpae_t pte = mfn_to_p2m_entry(smfn, t, a);

; switch to current VMID
tlbi rvae1, guest_vaddr ; first invalidate stage-1 TLB by guest VA for current VMID
tlbi ripas2e1, guest_paddr ; then invalidate stage-2 TLB by IPA range for current VMID
dsb ish
isb
; switch back the VMID
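A range-based invalidation like the one above needs to know which guest frames actually changed. The following is only a sketch of one possible bookkeeping scheme, not Xen code: struct flush_range, its helpers, and the max_pages threshold are all hypothetical names and values introduced for illustration. It accumulates the smallest frame range covering all removed mappings and reports whether a ranged flush looks preferable to a full one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker for frames that still need a TLB invalidation. */
struct flush_range {
    uint64_t start_gfn; /* first guest frame needing invalidation */
    uint64_t end_gfn;   /* one past the last guest frame */
    bool pending;       /* true once at least one frame was recorded */
};

void flush_range_init(struct flush_range *r)
{
    r->start_gfn = 0;
    r->end_gfn = 0;
    r->pending = false;
}

/* Record that `nr` frames starting at `gfn` were modified. */
void flush_range_add(struct flush_range *r, uint64_t gfn, uint64_t nr)
{
    uint64_t end = gfn + nr;

    if (nr == 0)
        return;

    if (!r->pending) {
        r->start_gfn = gfn;
        r->end_gfn = end;
        r->pending = true;
        return;
    }

    /* Grow the range to the smallest span covering old and new frames. */
    if (gfn < r->start_gfn)
        r->start_gfn = gfn;
    if (end > r->end_gfn)
        r->end_gfn = end;
}

/* Number of frames a ranged invalidation would have to cover. */
uint64_t flush_range_pages(const struct flush_range *r)
{
    return r->pending ? r->end_gfn - r->start_gfn : 0;
}

/*
 * Prefer a ranged flush only while the span stays below `max_pages`;
 * beyond that a full flush may be the cheaper choice.
 */
bool flush_range_use_ranged(const struct flush_range *r, uint64_t max_pages)
{
    return r->pending && flush_range_pages(r) <= max_pages;
}
```

A single merged range keeps the state small, at the price of over-invalidating when the modified frames are far apart; the threshold check is what falls back to a full flush in that case.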

diff --git a/xen/arch/arm/mmu/p2m.c b/xen/arch/arm/mmu/p2m.c
index 7642dbc7c5..e96ff92314 100644
--- a/xen/arch/arm/mmu/p2m.c
+++ b/xen/arch/arm/mmu/p2m.c
@@ -247,7 +247,7 @@ void p2m_restore_state(struct vcpu *n)
      * when running multiple vCPU of the same domain on a single pCPU.
      */
     if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id )
-        flush_guest_tlb_local();
+        ; // flush_guest_tlb_local();
 
     *last_vcpu_ran = n->vcpu_id;
 } 

Thanks & Regards,
Haseeb Ashraf

 

