
[RFC] x86: Alternative AP bringup protocol (i.e SMP on SEV-ES)


  • To: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: "Teddy Astie" <teddy.astie@xxxxxxxxxx>
  • Date: Thu, 30 Oct 2025 17:43:45 +0000
  • Delivery-date: Thu, 30 Oct 2025 17:44:04 +0000
  • Feedback-id: 30504962:30504962.20251030:md
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hello,

Implementing SMP support for SEV-ES guests quickly gets quite 
complicated.

As a PVH guest under Xen, we have 2 ways of initializing a vCPU (aside 
from the BSP):
- VCPUOP_initialise
- INIT-SIPI with vLAPIC

The first one is a hypercall that takes a vcpu_hvm_context, in which we 
provide the initial state for the vCPU. The second one works by starting 
the vCPU at a given CS (taken from the SIPI vector).

This works, but under SEV-ES for instance (*), the vCPU state (VMSA) 
must be measured and then stays encrypted, which means that we can't set 
the vCPU state once the VM is started. This prevents both methods from 
working.

IOW, the state of all vCPUs must be known before the guest actually 
starts running (and ideally be defined as a part of the boot ABI in 
order to be able to reconstruct the VMSA for remote attestation).

The GHCB specification provides a way to deal with this ([1] SEV-ES GHCB 
standardization, 4.3 SMP Booting).
It is mostly based on an "AP Jump Table" address that the guest can 
query (and that in-guest UEFI firmware can also modify) through a GHCB 
operation to the hypervisor.
This AP Jump Table is the address of an IP:CS combination that will be 
used to initialize the vCPU (e.g. as a part of a long jump instruction 
that the vCPU is initially pointing to).

But it is UEFI-firmware-centric and still relies on hypervisor-specific 
behaviors. It also relies on the hypervisor to give back a proper 
"AP Jump Table" address (originally provided by the guest UEFI 
firmware), which could be tampered with (defeating some of the security 
aspects of SEV-ES).
Another issue is that the CPU initially starts in real mode, which 
complicates the placement of such an AP Jump Table.
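
To illustrate the real-mode placement constraint, here is a minimal 
sketch. It assumes the reset vector layout as I read it in the GHCB 
specification (a 16-bit IP followed by a 16-bit CS at the start of the 
AP Jump Table page); the struct and helper names are hypothetical and 
not taken from any existing code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the first four bytes of the AP Jump Table page:
 * 16-bit IP, then 16-bit CS (names are illustrative only). */
struct ap_reset_vector {
    uint16_t ip;
    uint16_t cs;
};

/* Real-mode address translation: linear = CS * 16 + IP. */
static uint32_t real_mode_linear(struct ap_reset_vector v)
{
    return ((uint32_t)v.cs << 4) + v.ip;
}

/* Hypothetical helper: encode a 16-byte-granular CS:IP pair for
 * 'target'. Returns -1 for targets at or above 1 MiB, which this
 * sketch treats as unreachable from real mode. */
static int encode_reset_vector(uint32_t target, struct ap_reset_vector *out)
{
    assert(out != 0);
    if (target > 0xFFFFFu)
        return -1;
    out->cs = (uint16_t)(target >> 4);
    out->ip = (uint16_t)(target & 0xFu);
    return 0;
}
```

Anything the AP jumps to through such a vector has to live in low 
memory, which is the restriction the proposal below tries to avoid.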

Here is an idea for an alternative that is functionally similar to the 
SEV-ES specification but more flexible and somewhat simpler to implement:

Introduce a new special page, the "Alternative AP bring-up page", which 
contains a header (similar to vcpu_hvm_x86_64) and some vCPU 
initialization logic that sets up the control registers, EFER, GPRs, 
..., and then long jumps to a guest-provided CS:EIP.
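
As a rough sketch of what such a header could look like (every name, 
field, and offset here is hypothetical and only loosely inspired by 
vcpu_hvm_x86_64; nothing below is an existing or proposed ABI):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AP_BRINGUP_PAGE_SIZE 4096u   /* assumption: a single 4 KiB page */

/* Hypothetical header at the start of the bring-up page. */
struct ap_bringup_header {
    uint64_t cr0, cr3, cr4, efer;   /* loaded by the in-page stub */
    uint64_t rsp, rdi, rsi;         /* a few GPRs handed to the target */
    uint32_t eip;                   /* far jump target offset ... */
    uint16_t cs;                    /* ... and code segment selector */
    uint16_t pad;                   /* keeps the size a multiple of 8 */
};

/* Pin the layout down so that guest and toolstack agree on it. */
_Static_assert(sizeof(struct ap_bringup_header) == 64, "unexpected padding");
_Static_assert(offsetof(struct ap_bringup_header, eip) == 56, "eip offset");

/* Offset where the initialization stub could start: the header size
 * rounded up to a 16-byte boundary. */
static size_t ap_bringup_stub_offset(void)
{
    size_t off = (sizeof(struct ap_bringup_header) + 15u) & ~(size_t)15u;
    assert(off < AP_BRINGUP_PAGE_SIZE);
    return off;
}
```

Fixing the offsets matters here because the page contents would end up 
in the launch measurement, so the layout has to be reproducible.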

All non-BSP vCPUs start at the entry point with a CPU state similar to 
the one defined in the direct boot ABI. All vCPUs are initially stopped.

In order to initialize a vCPU, the guest:
- sets an appropriate vCPU state in the bring-up page
- calls VCPUOP_up on this vCPU
- waits for the vCPU initialization to complete
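
The three steps above could be sketched as follows. This is only an 
illustration: the slot structure, the status field used to detect 
completion, and stub_vcpu_up() are assumptions. In a real guest the 
second step would be the VCPUOP_up hypercall; here a stub simulates 
the AP so that the sketch runs on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical completion states written into the bring-up page. */
enum ap_status { AP_IDLE = 0, AP_STARTING = 1, AP_ONLINE = 2 };

/* Hypothetical subset of the bring-up page used by this sketch. */
struct ap_bringup_slot {
    uint64_t rip;
    uint64_t rsp;
    volatile uint32_t status;
};

/* Stand-in for the VCPUOP_up hypercall. It pretends the AP ran the
 * in-page stub and reported back by setting the status field. */
static int stub_vcpu_up(struct ap_bringup_slot *slot, unsigned int vcpu)
{
    (void)vcpu;
    slot->status = AP_ONLINE;
    return 0;
}

static int ap_start(struct ap_bringup_slot *slot, unsigned int vcpu,
                    uint64_t entry, uint64_t stack, unsigned int max_spins)
{
    int rc;

    assert(slot != 0);

    /* 1. Set an appropriate vCPU state in the bring-up page. */
    slot->rip = entry;
    slot->rsp = stack;
    slot->status = AP_STARTING;

    /* 2. Call VCPUOP_up on this vCPU (stubbed here). */
    rc = stub_vcpu_up(slot, vcpu);
    if (rc)
        return rc;

    /* 3. Wait for the vCPU to finish initializing, with a bound.
     * Real code would also need proper memory barriers. */
    while (slot->status != AP_ONLINE) {
        if (max_spins-- == 0)
            return -1;
    }
    return 0;
}
```

With a single shared page, this sketch implies that vCPUs are brought 
up one at a time, since each start overwrites the previous state.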

This is similar to what the GHCB specification proposes, with some 
differences:
- the vCPU starts in protected mode (instead of real mode), which avoids 
some of the AP Jump Table placement restrictions, as we can now put our 
special page along with the other ones (xenstore, pv console, ...)
- we avoid potentially complicated initialization trampoline chains
- we can start the vCPU directly in long mode (from the guest PoV) if 
appropriate EFER and control register values are provided
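
For the last point, "appropriate" values can be checked with a small 
predicate. The bit positions below are the architectural ones (CR0.PE, 
CR0.PG, CR4.PAE, EFER.LME); the function name is hypothetical, and the 
sketch deliberately ignores the other requirements (valid page tables 
in CR3, a 64-bit code segment for CS).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_PE    (1ull << 0)    /* protected mode enable */
#define X86_CR0_PG    (1ull << 31)   /* paging enable */
#define X86_CR4_PAE   (1ull << 5)    /* physical address extension */
#define X86_EFER_LME  (1ull << 8)    /* long mode enable */

/* True if the register values are sufficient for long mode to become
 * active once the stub has loaded them. */
static bool starts_in_long_mode(uint64_t cr0, uint64_t cr4, uint64_t efer)
{
    return (cr0 & X86_CR0_PE) && (cr0 & X86_CR0_PG) &&
           (cr4 & X86_CR4_PAE) && (efer & X86_EFER_LME);
}
```

A guest that only sets CR0.PE would instead stay in 32-bit protected 
mode, i.e. the state the vCPU already starts in under this proposal.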

And the "AP Jump Table" protocol can still be implemented on top of this 
proposal, given that the guest UEFI firmware supports it.

Given that this expands some aspects of the "direct boot ABI", I would 
like to gather some feedback on the idea.

Thanks

[1] 
https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/specifications/56421.pdf

* regarding SEV-SNP, a different (simpler) method for vCPU 
initialization is supported as the guest can directly provide a usable 
VMSA with the entire encrypted state of the vCPU through


--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




 

