
Re: [PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again.


  • To: Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>, "Longpeng (Mike, Cloud Infrastructure Service Product Dept.)" <longpeng2@xxxxxxxxxx>
  • From: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
  • Date: Wed, 24 Nov 2021 21:17:34 -0500
  • Cc: linux-kernel@xxxxxxxxxxxxxxx, "Gonglei (Arei)" <arei.gonglei@xxxxxxxxxx>, x86@xxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Valentin Schneider <valentin.schneider@xxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>
  • Delivery-date: Thu, 25 Nov 2021 02:20:19 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 11/24/21 5:54 PM, Thomas Gleixner wrote:
On Mon, Nov 22 2021 at 16:47, Sebastian Andrzej Siewior wrote:
From: "Longpeng(Mike)" <longpeng2@xxxxxxxxxx>

A CPU will not show up in a virtualized environment which includes an
Enclave. The VM splits its resources into a primary VM and an Enclave
VM. While the Enclave is active, the hypervisor will ignore all requests
to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
The kernel will wait up to ten seconds for the CPU to show up
(do_boot_cpu()) and then roll the hotplug state back to
CPUHP_OFFLINE, leaving the CPU state in CPU_UP_PREPARE. The CPU state is
set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.

After the Enclave VM terminates, the primary VM can bring up the CPU
again.

Allow the CPU to be brought up if it is in the CPU_UP_PREPARE state.

[bigeasy: Rewrite commit description.]

Signed-off-by: Longpeng(Mike) <longpeng2@xxxxxxxxxx>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Link: https://lore.kernel.org/r/20210901051143.2752-1-longpeng2@xxxxxxxxxx
---

For XEN: this changes the behaviour as it allows cpu_initialize_context()
to be invoked again should it have failed earlier. I *think* this is okay
and would allow bringing up the CPU again should the memory allocation in
cpu_initialize_context() fail.
Any comment from XEN folks?


If memory allocation in cpu_initialize_context() fails we will not be able to
bring up the VCPU because the xen_cpu_initialized_map bit at the top of that
routine will already have been set. We will BUG in xen_pv_cpu_up() on the second
(presumably successful) attempt because nothing for that VCPU would be
initialized. This can in principle be fixed by moving the allocation to the top
of the routine and freeing the context if the bit in the bitmap is already set.
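
A minimal user-space sketch of that reordering follows. It is not the actual Xen code: initialized_map, vcpu_ready, ctxt_alloc(), fail_next_alloc and both init_context_*() functions are hypothetical stand-ins for xen_cpu_initialized_map, the hypervisor-side VCPU state, the context allocation and cpu_initialize_context(). It only illustrates why the order of the bitmap test and the allocation matters.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NR_VCPUS 8

/* Hypothetical stand-ins, not kernel symbols. */
static bool initialized_map[NR_VCPUS]; /* models xen_cpu_initialized_map */
static bool vcpu_ready[NR_VCPUS];      /* models "VCPU context was set up" */
static bool fail_next_alloc;           /* test knob: force one failure */

static void *ctxt_alloc(void)
{
    if (fail_next_alloc) {
        fail_next_alloc = false;
        return NULL;
    }
    return calloc(1, 64);
}

/* Problematic order: the bit is set before the allocation is attempted. */
static int init_context_bit_first(int cpu)
{
    if (initialized_map[cpu])
        return 0;               /* assumed to be initialized already */
    initialized_map[cpu] = true;

    void *ctxt = ctxt_alloc();
    if (!ctxt)
        return -1;              /* bit stays set, nothing was set up */

    vcpu_ready[cpu] = true;     /* models handing ctxt to the hypervisor */
    free(ctxt);
    return 0;
}

/* Suggested order: allocate first, free if the bit is already set. */
static int init_context_alloc_first(int cpu)
{
    void *ctxt = ctxt_alloc();
    if (!ctxt)
        return -1;              /* bit untouched, a retry can succeed */

    if (initialized_map[cpu]) {
        free(ctxt);             /* already initialized, drop the context */
        return 0;
    }
    initialized_map[cpu] = true;

    vcpu_ready[cpu] = true;
    free(ctxt);
    return 0;
}
```

In this model, a failed first call to init_context_bit_first() leaves the bit set, so a retry returns success while vcpu_ready stays false, which corresponds to the BUG case above. With init_context_alloc_first() a failed call leaves the bit clear and the retry performs the setup.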


Having said that, allocation really should not fail: for PV guests we first
bring the maximum number of VCPUs up and then offline them down to however many
need to run. I think if allocation fails during boot we are going to have a
really bad day anyway.



-boris




 

