
[Xen-devel] user/hypervisor address space solution

I think I've solved this problem to my satisfaction. The solution I
implemented was not registering buffers and running an allocator out of
them, since that got far too complicated for my taste.

The code I've just checked in to the PPC trees
(http://xenbits.xensource.com/ext/linux-ppc-2.6.hg and
http://xenbits.xensource.com/ext/xenppc-unstable.hg) basically creates a
scatter/gather list for communication, and requires no hcall-user
modifications (e.g. libxc). In summary, all pointers in the hcall data
structures are replaced with pointers to scatter/gather structures.
Xen's copy_to/from_user now works only with these structures.
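To make the idea concrete, here is a minimal user-space sketch of what a scatter/gather-aware copy might look like on the Xen side. The struct and function names (sg_desc, sg_copy_from) are illustrative, not the actual definitions in the tree, and the physical-to-virtual translation is stubbed out with a plain cast:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: hcall data structures carry a pointer to an
 * array of these instead of a raw virtual pointer. */
struct sg_desc {
    unsigned long paddr; /* physical address of one contiguous chunk */
    size_t        len;   /* length of that chunk in bytes */
};

/* Gather up to n bytes from the chunks described by sg into dst.
 * Real hypervisor code would map each paddr into Xen's address space
 * first; here the cast stands in for that translation. */
static size_t sg_copy_from(void *dst, const struct sg_desc *sg,
                           size_t nchunks, size_t n)
{
    size_t copied = 0;
    for (size_t i = 0; i < nchunks && copied < n; i++) {
        size_t take = sg[i].len;
        if (take > n - copied)
            take = n - copied;
        memcpy((char *)dst + copied, (const void *)sg[i].paddr, take);
        copied += take;
    }
    return copied;
}
```

The point is that Xen never dereferences a guest virtual address: it only ever walks descriptors holding physical addresses and lengths.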

Userspace allocates memory for buffers and for the dom0_op itself from
anywhere it likes. When it calls into the kernel via privcmd_ioctl(),
the kernel records the virtually contiguous buffers with scatter/gather
structures containing physical addresses, and replaces all nested
virtual pointers with physical pointers to these structures.
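The kernel-side translation can be sketched the same way. A virtually contiguous user buffer must be split at page boundaries, because virtually contiguous pages need not be physically contiguous. Again the names are hypothetical, and the virt-to-phys lookup (really a page-table walk) is stubbed as an identity mapping:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))

struct sg_desc {
    unsigned long paddr;
    size_t        len;
};

/* Stand-in for the kernel's virtual-to-physical lookup. */
static unsigned long virt_to_phys_stub(unsigned long vaddr)
{
    return vaddr; /* identity mapping, for illustration only */
}

/* Describe the virtually contiguous range [vaddr, vaddr+len) as
 * physically addressed chunks, breaking at each page boundary.
 * Returns the number of descriptors filled in. */
static size_t build_sg(struct sg_desc *sg, size_t max,
                       unsigned long vaddr, size_t len)
{
    size_t n = 0;
    while (len > 0 && n < max) {
        size_t chunk = PAGE_SIZE - (vaddr & ~PAGE_MASK);
        if (chunk > len)
            chunk = len;
        sg[n].paddr = virt_to_phys_stub(vaddr);
        sg[n].len   = chunk;
        vaddr += chunk;
        len   -= chunk;
        n++;
    }
    return n;
}
```

privcmd_ioctl() would run something like this over every buffer pointer nested in the dom0_op before handing the structure to Xen.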

Kernel code (e.g. the console driver) does basically the same thing,
except the translation is done in hcall wrapper functions rather than in
the privcmd driver.
The kernel is the best place to create these structures for two reasons:
a) they contain physical addresses, and b) we don't need to modify any
userspace tools. We all know that as time goes by, people would forget to
use a special allocator, since they don't need one on x86, and we would
need to keep going back to patch and retest those changes in common code.

Disadvantages:
- the kernel must understand all data structures coming in via privcmd
        - adds complexity to the kernel
        - may fail poorly if data structures change without new opcodes
                - could be mitigated by something like Linux's ksyms
        - using new dom0 ops requires a kernel build and reboot
                - but we already had to reboot Xen anyway
                        - unless migrating domains...
- get/put_user() can no longer be single-instruction accesses

Advantages:
- should usually fail gracefully (a new/unknown dom0 op returns ENOSYS)
- requires no userspace modifications
        - supports multipage buffers
        - supports allocation from arbitrary memory
- changes are localized to the PPC kernel
        - new interfaces can be fixed with an arch-specific patch, which
          won't risk breaking already-tested users

There's one catch: get/put_user(). I haven't implemented those properly
yet, but that interface would need to change to something like this:
--- a/xen/drivers/char/console.c        Mon Feb 13 10:35:24 2006
+++ b/xen/drivers/char/console.c        Tue Feb 14 09:24:07 2006
@@ -363,7 +363,7 @@
         while ( (serial_rx_cons != serial_rx_prod) && (rc < count) )
             if ( put_user(serial_rx_ring[SERIAL_RX_MASK(serial_rx_cons)],
-                          &buffer[rc]) )
+                          buffer, rc) )
                 rc = -EFAULT;
In other words, provide the base buffer pointer and an offset into that
buffer. On x86,
#define put_user(value, base, offset) \
        x86_put_user(value, base + offset)
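On the scatter/gather side, the base+offset form would let put_user() locate the right physical chunk itself. A minimal sketch, assuming the same hypothetical sg_desc layout as above (the real Xen code would map the physical address before storing through it):

```c
#include <assert.h>
#include <stddef.h>

struct sg_desc {
    unsigned long paddr;
    size_t        len;
};

/* Scatter/gather-aware put_user: given the descriptor list that
 * replaced the guest's buffer pointer, plus a byte offset into the
 * logical buffer, find which chunk the offset lands in and store the
 * value there. Returns 0 on success, -1 (EFAULT-style) if the offset
 * is past the end of the described buffer. */
static int sg_put_user(unsigned char val, const struct sg_desc *sg,
                       size_t nchunks, size_t offset)
{
    for (size_t i = 0; i < nchunks; i++) {
        if (offset < sg[i].len) {
            /* Real code would map sg[i].paddr into Xen first. */
            *(unsigned char *)(sg[i].paddr + offset) = val;
            return 0;
        }
        offset -= sg[i].len;
    }
    return -1;
}
```

This is why the offset must be passed separately: with only a raw pointer, put_user() cannot tell which descriptor (and therefore which physical page) the access falls in.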

There are very few uses of get/put_user() in common code right now, and
those can be trivially converted. However, there are some uses in arch
code (e.g. xen/arch/x86/domain.c) that cannot be, and so the current
put_user interface would need to be preserved as an arch-specific macro.
In that case I'd call it "x86_put_user" to emphasize that only x86 arch
code should be using it. Too many x86isms creep into common code as it is.
These changes are far less invasive and less complicated than the rework
that libxc would need, so I still believe this is the best solution.

Finally, I am happy to help adapt the code to work on other
architectures. In particular, once an hcall routine is properly
abstracted, most of linux/arch/powerpc/platforms/xen/hcall.c and
xen/arch/ppc/usercopy.c could be used by any architecture that needs it.

Hollis Blanchard
IBM Linux Technology Center
