[Xen-devel] [Draft B] Boot ABI for HVM guests without a device-model


The discussion in [1] led to an agreement on the pieces missing from PVH
(or HVM without a device-model) in order to progress with its
implementation.

One of the missing pieces is a new boot ABI that replaces the PV boot
ABI. The aim of this new boot ABI is to remove the limitations of the
PV boot ABI that are no longer present when using auto-translated
guests. The new boot protocol should allow using the same entry point
for both 32bit and 64bit guests, and let the guest choose its bitness
at run time without the domain builder knowing in advance.


[1] http://lists.xen.org/archives/html/xen-devel/2015-06/msg00258.html

HVM direct boot ABI

Since the Xen entry point into the kernel can be different from the
native entry point, ELFNOTES are used in order to tell the domain
builder how to load and jump into the kernel entry point. At least the
following ELFNOTES are required in order to use this boot ABI:

ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,              .asciz, "FreeBSD")
ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,           .asciz, "xen-3.0")
ELFNOTE(Xen, XEN_ELFNOTE_PHYS32_ENTRY,          .quad,  xen_start32)
ELFNOTE(Xen, XEN_ELFNOTE_FEATURES,              .asciz, [...]
                                                       [...] XENFEAT_writable_page_tables) | \
                                                       (1 << XENFEAT_auto_translated_physmap) | \
                                                       (1 << XENFEAT_supervisor_mode_kernel) | \
                                                       (1 << [...]
ELFNOTE(Xen, XEN_ELFNOTE_LOADER,                .asciz, "generic")

The first three notes contain information about the guest kernel and
the Xen hypercall ABI version. The following notes are of special
interest:

 * XEN_ELFNOTE_PHYS32_ENTRY: the 32bit physical entry point into the kernel.
 * XEN_ELFNOTE_FEATURES: features required by the guest kernel in order
   to run.

The presence of the XEN_ELFNOTE_PHYS32_ENTRY note indicates that the
kernel supports the boot ABI described in this document.

The domain builder will load the kernel into the guest memory space and
jump into the entry point defined at XEN_ELFNOTE_PHYS32_ENTRY with the
following machine state:

 * ebx: contains the physical memory address where the loader has placed
   the boot start info structure.

 * cr0: bit 0 (PE) will be set. All the other writeable bits are cleared.

 * cr4: all bits are cleared.

 * cs: must be a 32-bit read/execute code segment with an offset of '0'
   and a limit of '0xFFFFFFFF'. The selector value is unspecified.

 * ds, es: must be a 32-bit read/write data segment with an offset of
   '0' and a limit of '0xFFFFFFFF'. The selector values are all unspecified.

 * tr: must be a 32-bit TSS (active) with a base of '0' and a limit of '0xFF'.

 * eflags: bit 17 (VM) must be cleared. Bit 9 (IF) must be cleared.
   Other bits are all unspecified.

All other processor registers and flag bits are unspecified. The OS is in
charge of setting up its own stack, GDT and IDT.
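
The register constraints listed above can be written down as a small
self-check. The following is a minimal sketch in C and is not part of the
proposed ABI: the entry_state structure and the entry_state_ok function
are hypothetical names, and how a kernel would capture these values at
its entry point is left out. Only the bits that the list fixes
explicitly are tested (cr0 PE, cr4, eflags VM and IF); the segment and
tr requirements, and the "other writeable bits" condition on cr0, are
not checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CR0_PE     (UINT32_C(1) << 0)   /* Protected mode enable.  */
#define EFLAGS_IF  (UINT32_C(1) << 9)   /* Interrupt enable flag.  */
#define EFLAGS_VM  (UINT32_C(1) << 17)  /* Virtual-8086 mode flag. */

/* Hypothetical snapshot of the registers captured at the entry point. */
struct entry_state {
    uint32_t ebx;     /* Physical address of the boot start info. */
    uint32_t cr0;
    uint32_t cr4;
    uint32_t eflags;
};

/* Returns true if the snapshot satisfies the checked constraints. */
static bool entry_state_ok(const struct entry_state *s)
{
    assert(s != NULL);

    if (!(s->cr0 & CR0_PE))
        return false;               /* cr0: bit 0 (PE) must be set.   */
    if (s->cr4 != 0)
        return false;               /* cr4: all bits must be cleared. */
    if (s->eflags & (EFLAGS_VM | EFLAGS_IF))
        return false;               /* eflags: VM and IF cleared.     */
    return true;
}
```

A kernel could call such a check very early in its entry path as a
debugging aid; it is not something the domain builder requires.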

The format of the structure passed in the %ebx register is the following:

struct hvm_start_info {
#define HVM_START_MAGIC_VALUE 0x336ec578
    uint32_t magic;             /* Contains the magic value 0x336ec578       */
                                /* ("xEn3" with the 0x80 bit of the "E" set).*/
    uint32_t flags;             /* SIF_xxx flags.                            */
    uint32_t cmdline_paddr;     /* Physical address of the command line.     */
    uint32_t nr_modules;        /* Number of modules passed to the kernel.   */
    uint32_t modlist_paddr;     /* Physical address of an array of           */
                                /* hvm_modlist_entry.                        */
};

struct hvm_modlist_entry {
    uint64_t paddr;             /* Physical address of the module.           */
    uint64_t size;              /* Size of the module in bytes.              */
};

This structure is guaranteed to always be placed in memory after the
loaded kernel and modules. There's no upper bound on the size of the
structure, so users should be aware that it might cross a page boundary.
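
As an illustration of how a guest might consume these structures, the
following is a minimal sketch in C that validates the magic value and
walks the module list. The helper name hvm_modules_total_size and its
"mem" parameter are hypothetical: "mem" stands in for guest physical
memory starting at address 0 so that the code can be exercised outside
a guest, whereas a real kernel at the entry point would dereference the
physical addresses directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HVM_START_MAGIC_VALUE 0x336ec578

struct hvm_start_info {
    uint32_t magic;
    uint32_t flags;
    uint32_t cmdline_paddr;
    uint32_t nr_modules;
    uint32_t modlist_paddr;
};

struct hvm_modlist_entry {
    uint64_t paddr;
    uint64_t size;
};

/*
 * Hypothetical helper: a physical address is used as an offset into
 * "mem". Returns 0 and stores the summed module size in *total, or -1
 * if the magic is wrong or a structure falls outside "mem".
 */
static int hvm_modules_total_size(const uint8_t *mem, size_t mem_size,
                                  uint32_t start_info_paddr,
                                  uint64_t *total)
{
    struct hvm_start_info si;
    struct hvm_modlist_entry mod;
    uint64_t sum = 0;
    uint32_t i;

    assert(mem != NULL && total != NULL);

    if (start_info_paddr > mem_size ||
        mem_size - start_info_paddr < sizeof(si))
        return -1;
    memcpy(&si, mem + start_info_paddr, sizeof(si));
    if (si.magic != HVM_START_MAGIC_VALUE)
        return -1;

    for (i = 0; i < si.nr_modules; i++) {
        uint64_t off = (uint64_t)si.modlist_paddr +
                       (uint64_t)i * sizeof(mod);

        if (off > mem_size || mem_size - off < sizeof(mod))
            return -1;
        memcpy(&mod, mem + (size_t)off, sizeof(mod));
        sum += mod.size;
    }

    *total = sum;
    return 0;
}
```

The structures are copied out with memcpy rather than accessed through
a cast pointer because, as noted above, the start info may sit at any
offset and may cross a page boundary.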

Note that the boot protocol resembles the multiboot1 specification.
This is done so that OSes with multiboot1 entry points can reuse those if
desired.

Other relevant information needed in order to boot a guest kernel
(console page address, xenstore event channel...) can be obtained
using HVMPARAMS, just like it's done on HVM guests.

The setup of the hypercall page is also performed in the same way
as HVM guests, using a wrmsr.

AP startup

AP startup is performed using hypercalls. The following VCPU operations
are used in order to bring up secondary vCPUs:

 * VCPUOP_initialise is used to set the initial state of the vCPU. The
   argument passed to the hypercall must be of the type vcpu_hvm_context
   (see public/hvm/hvm_vcpu.h for the layout of the structure). Note that
   this hypercall allows starting the vCPU in several modes (16/32/64bits),
   regardless of the mode the BSP is currently running on.

 * VCPUOP_up is used to launch the vCPU once the initial state has been
   set using VCPUOP_initialise.

 * VCPUOP_down is used to bring down a vCPU.

 * VCPUOP_is_up is used to scan the number of available vCPUs.
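
The order of the operations in the list above can be sketched as
follows. This is a minimal user-space mock in C, written only to show
the sequence of calls: mock_vcpu_op is a hypothetical stand-in for the
real hypercall, its return values and state tracking are assumptions of
this sketch rather than documented Xen behaviour, and the contents of
vcpu_hvm_context are deliberately left opaque. The VCPUOP_* command
numbers are assumed to match Xen's public vcpu.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Command numbers assumed to match xen/include/public/vcpu.h. */
#define VCPUOP_initialise 0
#define VCPUOP_up         1
#define VCPUOP_down       2
#define VCPUOP_is_up      3

#define MOCK_NR_VCPUS 4   /* Hypothetical number of vCPUs in the mock. */

struct vcpu_hvm_context;  /* Opaque here; see public/hvm/hvm_vcpu.h. */

static struct { bool initialised, up; } mock_vcpus[MOCK_NR_VCPUS] = {
    [0] = { true, true },  /* The BSP is already running. */
};

/* Hypothetical stand-in for the vcpu_op hypercall. */
static int mock_vcpu_op(int cmd, unsigned int vcpu, void *arg)
{
    if (vcpu >= MOCK_NR_VCPUS)
        return -1;                          /* No such vCPU. */

    switch (cmd) {
    case VCPUOP_initialise:
        if (arg == NULL || mock_vcpus[vcpu].initialised)
            return -1;
        mock_vcpus[vcpu].initialised = true;
        return 0;
    case VCPUOP_up:
        if (!mock_vcpus[vcpu].initialised)
            return -1;                      /* State not set yet. */
        mock_vcpus[vcpu].up = true;
        return 0;
    case VCPUOP_down:
        mock_vcpus[vcpu].up = false;
        return 0;
    case VCPUOP_is_up:
        return mock_vcpus[vcpu].up ? 1 : 0;
    default:
        return -1;
    }
}

/* Set the initial state of an AP, then launch it. */
static int start_ap(unsigned int vcpu, struct vcpu_hvm_context *ctx)
{
    int rc;

    assert(ctx != NULL);
    rc = mock_vcpu_op(VCPUOP_initialise, vcpu, ctx);
    if (rc != 0)
        return rc;
    return mock_vcpu_op(VCPUOP_up, vcpu, NULL);
}

/* Count vCPUs by probing with VCPUOP_is_up until the call fails. */
static unsigned int count_vcpus(void)
{
    unsigned int n = 0;

    while (mock_vcpu_op(VCPUOP_is_up, n, NULL) >= 0)
        n++;
    return n;
}
```

In a real guest the mock would be replaced by the actual hypercall
wrapper, and the context passed to VCPUOP_initialise would describe the
desired 16, 32 or 64bit starting state of the vCPU.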
