Xen Summit 2025 - Design Discussion Notes - Xen ABI
Hi all, it was requested I send the notes I took during the design discussion
on the ABI / APIs to the list.
Normally I keep this as personal notes, so there may be errors (esp if I did
not hear correctly), so please feel free to correct or expand. Details may be
missing where I am unaware of the history behind something.
-Alex Merritt
Design Discussion: Xen ABIs and APIs
- Chris on remote: Andrew has been working on a new ABI and wants to continue
- Andrew: put together a collection of documents to understand what we have
to work with, what we want to improve, before starting the work on any
design or iterations on interfaces we currently have
- link to document in the design session
‐ https://design-sessions.xenproject.org/uid/discussion/disc_3IEQbyaCTkqLf2fFzoze/view
- number of things we have been aware of for a while
- some attempts to address them on the list
- one problem: if you only try to fix one of them, it brings in discussion of
fixing many other items
- everyone has opinion on what the end result will look like
- existing designs only fix subsets, not the whole thing
- we want to address all the problems from the start, before deciding on a plan
to fix them
- enumerate the ABIs and APIs that currently exist
‐ problems not apparent if you just think about this
‐ many folks think this is just the hypercalls
‐ there is the enumeration information
‐ xen has many bugs - originally monorepo with xen, linux, qemu, BSDs,
bochs, ... with “make world” you got a system. All guests were
required to have event channel - no discovery exists because they all
had it
‐ grant table v2, migrate old version of xen to new, exercise new code
paths, then kernel crashed
‐ initial state of vcpus - many folks don’t think about them, but what
xen presents, we have bugs describing those via the hypercalls we use
‐ the hypercalls themselves -- 46? -- half of them specific for PV
guests
- x86 HVM / ARM HVM are only a small fraction of the total
hypercalls that exist
- the reason the hypercalls look like this now: Xen started with PV guests on
x86, where a VAS system made sense
‐ when HVM guests came along, we have hacks fitting PV guests into HVM
‐ Xen has to walk the page tables of the guest just to get the
information it needs, you cannot do that in encrypted VMs by design
‐ need to change the way we deal with pointers in the API
- evtchn send, pass pointer information on the stack
‐ get interrupt for someone else!
- look over all APIs and ABIs that exist because they have different problems in
different areas
- what XenServer cares most about right now: host UEFI secure boot
‐ new priv boundary that did not exist previously
‐ admin with root cannot (should not) violate security boundary, cannot
read/write arbitrary memory
‐ hypercalls: open /dev/xen/privcmd and pointers into user space
memory, nothing stops passing kernel pointer memory
- giant privilege escalation hole in UEFI secure boot
- root user space is not priv enough to execute arbitrary code
‐ all problems compound, thus we want to look at all of them before we
start figuring out what to do
- another example: being based on x86 originally, large hypercalls have a shift
by 12, assume 4k pages, problem with ARM wanting 64k page tables
‐ even the data layout wants to change
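To illustrate the page-size point above, here is a minimal Python sketch (not Xen code) of how a frame number computed with a 64k page size is misread by an interface that hard-codes a shift of 12. The function names and constants are made up for this example.

```python
# Illustrative only: these names and constants are hypothetical, not Xen's.
PAGE_SHIFT_4K = 12   # 4 KiB pages, the assumption baked into a shift by 12
PAGE_SHIFT_64K = 16  # 64 KiB pages


def addr_to_frame(addr: int, page_shift: int) -> int:
    """Convert a byte address to a frame number for the given page size."""
    return addr >> page_shift


def frame_to_addr(frame: int, page_shift: int) -> int:
    """Convert a frame number back to a byte address."""
    return frame << page_shift


addr = 0x12340000  # a 64 KiB aligned address
frame_64k = addr_to_frame(addr, PAGE_SHIFT_64K)

# Same shift on both sides: the address round-trips.
round_trip = frame_to_addr(frame_64k, PAGE_SHIFT_64K)

# Receiver assumes 4 KiB pages: the frame number decodes to another address.
misread = frame_to_addr(frame_64k, PAGE_SHIFT_4K)

print(hex(round_trip), hex(misread))
```

The frame number alone does not carry its page size, so both sides have to agree on the shift out of band. That is why a hard-coded 12 becomes part of the ABI.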
- if you change the version of Xen, you break the user space (library versions)
‐ was intentional choice early on, doesn’t scale
‐ get rid of unstable APIs -- killing xen
- security hotfix - recompile QEMU
‐ ABI rules say any change in hypervisor, thus rebuilding user space,
and QEMU -- anything that links against the xen packages!
- Bertrand: look at problem yesterday: how we create and configure a guest,
coherency to reach dom0less
‐ twice code to create a guest, duplicated code
‐ duplicate configuration format
‐ if we modify ABI between dom0 and Xen, need to look at having a
coherent format so we can reuse the same code
- Alex M: can we hide hypercalls via libraries?
‐ yes, but currently the versions force a break
‐ definitely an option forward
‐ still doesn’t solve the issue, because other libraries in other
languages won’t be shielded from unstable ABIs
- Jan: both knowing what to do and where we go is useful
‐ Andrew: have to have broad idea where to go....
- Jan: carrying out a hypercall is independent of the mechanism we define
‐ Andrew: still needs backwards compatibility
‐ Andrew: use higher op numbers
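One way to picture the "higher op numbers" idea is a dispatch table where existing ops keep their numbers and new-ABI ops are allocated from a separate, higher range. This is only a sketch under assumptions: the op numbers, the base of the new range, and the handler names are all hypothetical and were not stated in the session.

```python
import errno

# Hypothetical numbering, chosen only for illustration.
LEGACY_EVTCHN_OP = 32           # stands in for an existing op number
NEW_ABI_BASE = 0x1000           # assumed start of a range for new-ABI ops
NEW_EVTCHN_SEND = NEW_ABI_BASE  # first op allocated in the new range


def legacy_evtchn_op(args):
    """Placeholder for an existing handler, left untouched."""
    return ("legacy", args)


def new_evtchn_send(args):
    """Placeholder for a handler that follows the new calling convention."""
    return ("new", args)


HANDLERS = {
    LEGACY_EVTCHN_OP: legacy_evtchn_op,
    NEW_EVTCHN_SEND: new_evtchn_send,
}


def dispatch(op, args):
    """Route an op number to its handler; unknown ops return -ENOSYS."""
    handler = HANDLERS.get(op)
    if handler is None:
        return -errno.ENOSYS
    return handler(args)


print(dispatch(LEGACY_EVTCHN_OP, (1,)), dispatch(NEW_EVTCHN_SEND, (1,)))
```

Old callers keep working because their op numbers still resolve to the same handlers, while new callers opt in by using the new range.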
- Alex M: is our problem unique to us?
‐ Andrew: we have enough corner cases that yes
‐ Bertrand: PV guests require a large number of hypercalls
‐ Jan: keep VA for PV hypercalls
- Rich on call: work together with Chris to write down something difficult in
scope
‐ any work written down, useful for folks on other side where we may
encounter failures
‐ newcomers: xen forked by HP (?)
‐ everyone tried to narrow to vertical markets, focus on specific
markets
‐ Xen: is last entity standing, still trying to pull all stakeholders
together, but not sure how long it will last
‐ if collapses: accidental or intentional interoperability, carve out
the pieces so that the ppl at table today have a chance to know what
results from it
‐ what will last longest: certified entities that have long lifecycles,
decades or more
‐ certified snapshots will become longest lived design choices
- Andrew: shared info page
‐ layout was done with unsigned longs which changed sizes
‐ layout of the shared info page changes
‐ different vcpus can be in different modes at a time
‐ we cache the mode of the cpu at the point at which it makes one of two
types of hypercalls
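The layout problem can be shown with a small Python `ctypes` sketch. The structure below is hypothetical and much smaller than the real shared info page. It models `unsigned long` as 4 bytes for a 32-bit guest and 8 bytes for a 64-bit guest, then compares field offsets.

```python
import ctypes


class Layout32(ctypes.Structure):
    """Hypothetical layout as seen by a 32-bit guest (long is 4 bytes)."""
    _fields_ = [
        ("flags", ctypes.c_uint32),    # "unsigned long" in the C source
        ("counter", ctypes.c_uint32),  # "unsigned long" in the C source
        ("version", ctypes.c_uint32),  # fixed-width field
    ]


class Layout64(ctypes.Structure):
    """The same declaration as seen by a 64-bit guest (long is 8 bytes)."""
    _fields_ = [
        ("flags", ctypes.c_uint64),
        ("counter", ctypes.c_uint64),
        ("version", ctypes.c_uint32),
    ]


def field_offsets(layout):
    """Return a mapping of field name to byte offset for a ctypes layout."""
    return {name: getattr(layout, name).offset for name, _ in layout._fields_}


offsets_32 = field_offsets(Layout32)
offsets_64 = field_offsets(Layout64)

print(offsets_32)
print(offsets_64)
```

Every field after the first `unsigned long` lands at a different offset, so whoever reads the page has to know which mode the writer was in. That is the bookkeeping the cached vcpu mode stands in for.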
- another design session tomorrow