
[Xen-devel] XenProject/XenServer QEMU working group minutes, 30th August 2016



QEMU XenServer/XenProject Working group meeting 30th August 2016
================================================================

Attendance
----------

Andrew Cooper
Ian Jackson
Paul Durrant
David Vrabel
Jennifer Herbert

Introduction
------------

Introduced Paul Durrant to the working group.

Started by recapping our purpose: to find an upstreamable way for qemu
to make hypercalls without excessive privilege.  A deprivileged device
model in dom0 must not be able to abuse the interface to compromise
the dom0 kernel.

QEMU Hypercalls – DM op
-----------------------

There has been much discussion on xen-devel.  One problem identified
is operations that carry references to other user memory objects, such
as track dirty VRAM (as used with the VGA buffer).  At the moment that
is apparently the only such operation, but others may emerge.

The most obvious solution would involve the guest kernel validating
the virtual addresses passed; however, that would rely on the guest
kernel knowing where the objects were.  This is to be avoided.

Ian recounted the various proposals on xen-devel, which essentially
involved informing the hypervisor, in some way, which virtual
addresses the hypercall was talking about.  Many of these involved
transmitting that information via a separate channel.

Ian suggested providing a way for the kernel to tell the hypervisor
which user virtual ranges are dm-op-allowed memory.  There would then
be a flag in the dm op, at a fixed location, telling the hypervisor
that the op refers only to this special pre-approved memory.
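The shape of that suggestion might be sketched as below.  All names
and layouts here are illustrative assumptions for the purpose of these
minutes, not any merged interface:

```c
#include <stdint.h>

/* Range of user virtual addresses the kernel has registered with the
 * hypervisor as dm-op-allowed memory. */
struct dmop_allowed_range {
    uint64_t start;  /* user virtual address */
    uint64_t size;   /* length of the approved range in bytes */
};

/* Flag at a fixed, stable location in every dm op: when set, all
 * pointers inside the op refer only to pre-approved memory. */
#define DMOP_FLAG_RESTRICTED (1u << 0)

struct dmop_header {
    uint32_t op;     /* dm op number */
    uint32_t flags;  /* e.g. DMOP_FLAG_RESTRICTED */
};
```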


A scheme of pre-nominating an area in QEMU, perhaps using hypercall
buffers, was briefly discussed, as well as a few other ideas, but the
conclusion was that this doesn't really address the problem of future
DM ops, of which there could easily be many.  Even if we can avoid the
problem with special cases for our current set-up, we still need a
story for how to add future interfaces with handles without needing to
change the kernel interface.  Once we come up with that story, we
wouldn't necessarily have to implement it.


The concept of using physically addressed hypercall buffers was
discussed.  Privcmd could allocate you a place and mmap it into user
space, and this would be the only memory used with hypercalls.  A
hypercall would tell you the buffer range.  Each qemu would need to be
associated with the correct set of physical buffers.

A recent AMD proposal was discussed, which would use only physical
addresses, no virtual addresses.  The upshot is that we should come up
with a solution that is not incompatible with this.

Ideas discussed further: user code could simply put data in mmapped
memory and refer only to offsets within that buffer.  The privcmd
driver would fill in the physical details.  All dm ops would have
three arguments: the dm op number, a pointer to a struct, and an
optional pointer to a restriction array, the last of which is filled
in by the privcmd driver.  It was discussed how the privcmd driver
must not look at the dm op number (in particular, to know how to
validate addresses), as it must remain independent of the API.

A scheme where qemu calls an ioctl before it drops privileges, to set
up restrictions ahead of time, was discussed.  One variant might work
by setting up a range for a given domain or VCPU.
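Such an ioctl might look roughly like the following.  The ioctl name,
number and struct layout are hypothetical, chosen only to illustrate
the idea, and do not correspond to any real privcmd interface:

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Restrict subsequent dm ops on this privcmd fd to one domain and one
 * approved user virtual range.  Called once, before dropping privilege. */
struct privcmd_dmop_restrict {
    uint16_t domid;     /* domain this fd may target */
    uint64_t va_start;  /* start of the approved user virtual range */
    uint64_t va_size;   /* length of the range in bytes */
};

#define IOCTL_PRIVCMD_DMOP_RESTRICT \
    _IOW('P', 0x42, struct privcmd_dmop_restrict)

/* Usage in qemu, while still privileged:
 *
 *     struct privcmd_dmop_restrict r = {
 *         .domid = domid, .va_start = (uintptr_t)buf, .va_size = len,
 *     };
 *     ioctl(privcmd_fd, IOCTL_PRIVCMD_DMOP_RESTRICT, &r);
 *
 * After this, privcmd would refuse dm ops targeting other domains or
 * addresses outside the registered range. */
```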

The assumption is that all device models running in the same domain
have the same virtual address layout.  There would then be a flag, in
the stable part of the API, saying whether to apply that restriction;
any dm op issued from the kernel would not have the restriction
applied.

The idea can be extended to allow more than one address range, or to
have the ranges explicitly provided in the hypercall.  The latter
suggestion is preferred; however, each platform has different valid
address ranges, and privcmd is platform independent.  It was discussed
how a function could be created to return the valid ranges for a given
platform, but this was not considered an elegant solution.  The third
parameter of the dm op could be an array of ranges, where the common
case for virtual addresses might be 0-3GB, but for physical addresses
it might be quite fragmented.


A further idea was proposed: extend the dm op to have a fixed part
plus an array of guest handles which the kernel can audit.  The
arguments would be:

Arg1: Domain ID
Arg2: Guest handle array of (address, size) tuples
Arg3: Number of guest handles

The first element of the array could be the DM op structure itself,
containing the DM op code and the other arguments to the particular
op.  The privcmd driver would only pass through what is provided by
the user.  Any extra elements would be ignored by the hypercall, and
if there were insufficient, the hypercall code would see a NULL and be
able to fail gracefully.
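In user-space terms the proposal could be sketched as follows.  The
struct and function names are illustrative assumptions, and the lookup
merely mocks the hypervisor-side behaviour described above:

```c
#include <stdint.h>
#include <stddef.h>

/* One entry of the guest handle array: an (address, size) tuple.
 * Element 0 would be the dm op structure itself. */
struct dmop_buf {
    uint64_t addr;  /* user virtual address of this buffer */
    uint64_t size;  /* its length in bytes */
};

/* Mock of the hypervisor side: a given dm op asks for buffer `idx`.
 * Extra buffers supplied by the caller are simply never requested, and
 * asking for a missing one yields NULL so the op can fail gracefully. */
static const struct dmop_buf *dmop_get_buf(const struct dmop_buf *bufs,
                                           unsigned int nr_bufs,
                                           unsigned int idx)
{
    return idx < nr_bufs ? &bufs[idx] : NULL;
}
```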

The initial block (of dm op arguments) passed in the array would be
copied into pre-zeroed memory of the maximum op size, having checked
that its size is not greater than this.  There is no need to check a
minimum: the buffer is initialised to zero, so a zero-length block
would simply result in op 0 being called.  Functions/macros could be
created to make retrieving such a block easier.
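That copy-in step, simulated in user space (the maximum size and the
function name are assumptions for illustration):

```c
#include <stdint.h>
#include <string.h>

#define DMOP_MAX_SIZE 128  /* assumed maximum size of any dm op struct */

/* Copy the initial block into pre-zeroed storage of the maximum op
 * size.  Returns 0 on success, -1 if the block is oversized.  No
 * minimum-size check is needed: the destination is zeroed first, so a
 * zero-length block simply decodes as op 0. */
static int dmop_copy_fixed(uint8_t dst[DMOP_MAX_SIZE],
                           const void *src, size_t size)
{
    if (size > DMOP_MAX_SIZE)
        return -1;              /* oversized op: fail the hypercall */
    memset(dst, 0, DMOP_MAX_SIZE);
    memcpy(dst, src, size);
    return 0;
}
```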

Any further blocks needed would be referred to implicitly, as a given
dm op knows it will have put buffer foo in array position bar.  It
would then use the provided functions/macros to retrieve it.

This last idea was compared with the proposal previously posted to
xen-devel by Ian.  This scheme is slightly messier in the dm op code,
having to refer to numbers instead of fields; however, the pros are
that:

* It is more extensible, and doesn't involve providing a new, unusual
  copy-to-user-memory macro that could be misused, with security
  implications.
* The restriction is bound to the specific call, and can vary.
* Privcmd is slightly simpler: it can just call access_ok.
* It is compatible with a physical-address scheme.

David agreed to write this idea up in a design document.  He will not
need to discuss any individual dm ops, but should describe the pros
and cons compared with the other ideas on the table.


XenStore
--------

The xs-restrict mechanism was summarised, along with its limitation:
it does not work through the kernel XenStore driver, which is needed
to talk to a XenStore domain.  A way to fix this would be to create a
wrapper.

Another approach is to try to remove XenStore from all non-privileged
parts of QEMU, as it is thought there isn't much use remaining.
Protocols such as QMP would be used instead.  PV drivers such as QDISK
could be run in a separate qemu process, for which a patch exists.
There were concerns this would take a lot of time to achieve.

Although time ran out, it was broadly concluded that multiple
approaches could be pursued in parallel: initially xs-restrict would
be used as is, and then the XenStore wrapper could be developed
alongside efforts to reduce XenStore use in QEMU.  Even with the
XenStore wrapper, QEMU may benefit from reducing the number of
communication protocols in use, i.e. removing its XenStore use.



Action items
------------

David: Write up latest DM op proposal.
Jenny: Write up and arrange next meetup.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

