
[PATCH v3 1/1] xen: delay xen_hvm_init_time_ops() to xen_hvm_smp_prepare_boot_cpu()


  • To: xen-devel@xxxxxxxxxxxxxxxxxxxx, x86@xxxxxxxxxx
  • From: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
  • Date: Thu, 24 Feb 2022 13:50:49 -0800
  • Cc: linux-kernel@xxxxxxxxxxxxxxx, boris.ostrovsky@xxxxxxxxxx, jgross@xxxxxxxx, sstabellini@xxxxxxxxxx, tglx@xxxxxxxxxxxxx, mingo@xxxxxxxxxx, bp@xxxxxxxxx, dave.hansen@xxxxxxxxxxxxxxx, hpa@xxxxxxxxx, joe.jin@xxxxxxxxxx
  • Delivery-date: Thu, 24 Feb 2022 21:51:54 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

sched_clock() can be used very early since commit 857baa87b642
("sched/clock: Enable sched clock early"). In addition, with commit
38669ba205d1 ("x86/xen/time: Output xen sched_clock time from 0"), the
kdump kernel in a Xen HVM guest may panic at a very early stage when
accessing &__this_cpu_read(xen_vcpu)->time, as shown below:

setup_arch()
 -> init_hypervisor_platform()
     -> x86_init.hyper.init_platform = xen_hvm_guest_init()
         -> xen_hvm_init_time_ops()
             -> xen_clocksource_read()
                 -> src = &__this_cpu_read(xen_vcpu)->time;

This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
embedded inside 'shared_info' during the early stage, until
xen_vcpu_setup() is used to allocate/relocate the 'vcpu_info' for the boot
cpu at an arbitrary address.

However, when a Xen HVM guest panics on vcpu >= 32, the kdump kernel boots
on that vcpu. Since xen_vcpu_info_reset(0) sets
per_cpu(xen_vcpu, cpu) = NULL when vcpu >= 32, xen_clocksource_read() on
vcpu >= 32 would panic.

This patch always delays xen_hvm_init_time_ops() until
xen_hvm_smp_prepare_boot_cpu(), after the 'vcpu_info' for the boot vcpu
has been registered.
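
For illustration only, the ordering problem can be modelled in user space
roughly as below. This is a simplified sketch, not the kernel code: every
model_* name and MODEL_NR_VCPUS are hypothetical stand-ins, and only
MAX_VIRT_CPUS=32 and the NULL assignment for vcpu >= 32 are taken from the
description above. The model returns -1 where the real
xen_clocksource_read() would dereference a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VIRT_CPUS  32  /* 'vcpu_info' slots embedded in 'shared_info' */
#define MODEL_NR_VCPUS 64  /* hypothetical guest size, assumption of this model */

/* Hypothetical, heavily reduced stand-in for 'vcpu_info'. */
struct model_vcpu_info {
	uint64_t system_time;
};

/* Stand-in for 'shared_info': only MAX_VIRT_CPUS embedded slots exist. */
static struct model_vcpu_info model_shared_info[MAX_VIRT_CPUS];

/* Stand-in for 'vcpu_info' registered later at an arbitrary address. */
static struct model_vcpu_info model_registered[MODEL_NR_VCPUS];

/* Stand-in for the boot cpu's per_cpu(xen_vcpu, cpu) pointer. */
static struct model_vcpu_info *model_xen_vcpu;

/* Early stage: only vcpu ids below MAX_VIRT_CPUS have an embedded slot. */
static void model_vcpu_info_reset(unsigned int vcpu_id)
{
	if (vcpu_id < MAX_VIRT_CPUS)
		model_xen_vcpu = &model_shared_info[vcpu_id];
	else
		model_xen_vcpu = NULL;
}

/* Later stage: a 'vcpu_info' is registered for any vcpu id. */
static void model_vcpu_setup(unsigned int vcpu_id)
{
	assert(vcpu_id < MODEL_NR_VCPUS);
	model_xen_vcpu = &model_registered[vcpu_id];
}

/*
 * Returns 0 and stores the time on success, or -1 in the case where
 * the real kernel would dereference a NULL xen_vcpu and panic.
 */
static int model_clocksource_read(uint64_t *now)
{
	if (model_xen_vcpu == NULL)
		return -1;
	*now = model_xen_vcpu->system_time;
	return 0;
}
```

In this model, calling model_vcpu_info_reset(33) followed by
model_clocksource_read() hits the failing case, while calling
model_vcpu_setup(33) first makes the read safe. That is the ordering this
patch enforces for the time ops.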

This issue can be reproduced on purpose with the command below on the
guest side when kdump/kexec is enabled:

"taskset -c 33 echo c > /proc/sysrq-trigger"

Unfortunately, 'soft_reset' (kexec) does not work with the mainline Xen
version, so I can test this patch only with an HVM guest on an old Xen
hypervisor where 'soft_reset' works. The bugfix for PVM is not
implemented due to the lack of a testing environment.

Cc: Joe Jin <joe.jin@xxxxxxxxxx>
Signed-off-by: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
---
Changed since v1:
  - Add commit message to explain why xen_hvm_init_time_ops() is delayed
    for any vcpus. (Suggested by Boris Ostrovsky)
  - Add a comment in xen_hvm_smp_prepare_boot_cpu() referencing the related
    code in xen_hvm_guest_init(). (suggested by Juergen Gross)
Changed since v2:
  - Delay for all VCPUs. (Suggested by Boris Ostrovsky)
  - Add commit message to explain why PVM is not supported by this patch
  - Test if kexec/kdump works with mainline xen (HVM and PVM)

 arch/x86/xen/enlighten_hvm.c |  1 -
 arch/x86/xen/smp_hvm.c       | 17 +++++++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 517a9d8d8f94..53f306ec1d3b 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -216,7 +216,6 @@ static void __init xen_hvm_guest_init(void)
        WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_hvm, xen_cpu_dead_hvm));
        xen_unplug_emulated_devices();
        x86_init.irqs.intr_init = xen_init_IRQ;
-       xen_hvm_init_time_ops();
        xen_hvm_init_mmu_ops();
 
 #ifdef CONFIG_KEXEC_CORE
diff --git a/arch/x86/xen/smp_hvm.c b/arch/x86/xen/smp_hvm.c
index 6ff3c887e0b9..9a5efc1a1633 100644
--- a/arch/x86/xen/smp_hvm.c
+++ b/arch/x86/xen/smp_hvm.c
@@ -19,6 +19,23 @@ static void __init xen_hvm_smp_prepare_boot_cpu(void)
         */
        xen_vcpu_setup(0);
 
+       /*
+        * xen_hvm_init_time_ops() used to be called at very early stage
+        * by xen_hvm_guest_init(). While only MAX_VIRT_CPUS 'vcpu_info'
+        * are embedded inside 'shared_info', the VM would use them until
+        * xen_vcpu_setup() is used to allocate/relocate them at arbitrary
+        * address.
+        *
+        * However, when Xen HVM guest boots on vcpu >= MAX_VIRT_CPUS
+        * (e.g., kexec kernel), per_cpu(xen_vcpu, cpu) is NULL at early
+        * stage. To access per_cpu(xen_vcpu, cpu) via
+        * xen_clocksource_read() would panic the kernel.
+        *
+        * Therefore we always delay xen_hvm_init_time_ops() to
+        * xen_hvm_smp_prepare_boot_cpu() to avoid the panic.
+        */
+       xen_hvm_init_time_ops();
+
        /*
         * The alternative logic (which patches the unlock/lock) runs before
         * the smp bootup up code is activated. Hence we need to set this up
-- 
2.17.1




 

