
[Xen-devel] xencomm address space API

We were discussing buffer management for the xencomm address space that
we'd use to allocate memory to pass between userspace/kernel and
hypervisor. It's needed in the kernel for hcalls made from the kernel
that currently contain pointers (in particular the balloon driver).

x86 wants to know about buffer (not memory) allocation and destruction,
since that's when it needs to mlock/munlock. PPC wants to know about
buffer allocation so the buffer can be registered with the hypervisor.

Unfortunately, implementing a buffer registration API became rather
complicated.

Problem #1: concurrency. By decoupling buffer registration from the
actual hcall, we can introduce problems like these:

dom0 tool:              balloon driver:
register buffer
                        register buffer
                        unregister buffer
unregister buffer

This means that the hypervisor must track multiple registered buffers
per domain. (In the general case this could be an arbitrary number, but
I guess it would need to be limited to prevent a domain from exhausting
the Xen heap.)
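
To make the bookkeeping concrete, here is a minimal sketch of a per-domain
table of registered buffers with a hard cap. Everything in it is
hypothetical (the names register_buf/unregister_buf, the struct layout,
and the cap of 8); it is not existing Xen code, only an illustration of
why the interleaving above needs more than one slot per domain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap so one domain cannot exhaust the hypervisor heap. */
#define MAX_BUFS_PER_DOMAIN 8

struct reg_buf {
    int      in_use;
    uint64_t base_pfn;   /* first page frame of the buffer (assumed) */
    size_t   nr_pages;
};

struct domain_bufs {
    struct reg_buf slot[MAX_BUFS_PER_DOMAIN];
};

/* Returns a buffer ID (the slot index), or -1 if the domain hit its cap. */
int register_buf(struct domain_bufs *d, uint64_t base_pfn, size_t nr_pages)
{
    for (int i = 0; i < MAX_BUFS_PER_DOMAIN; i++) {
        if (!d->slot[i].in_use) {
            d->slot[i].in_use = 1;
            d->slot[i].base_pfn = base_pfn;
            d->slot[i].nr_pages = nr_pages;
            return i;
        }
    }
    return -1;
}

/* Returns 0 on success, -1 if bufid is out of range or not registered. */
int unregister_buf(struct domain_bufs *d, int bufid)
{
    if (bufid < 0 || bufid >= MAX_BUFS_PER_DOMAIN || !d->slot[bufid].in_use)
        return -1;
    d->slot[bufid].in_use = 0;
    return 0;
}
```

With this, the dom0 tool and the balloon driver each get their own ID, so
the balloon driver unregistering its buffer leaves the tool's buffer
intact.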

That also means that each hcall must somehow indicate which buffer
should be used with its arguments. I think that could be done by
encoding the buffer ID into the memory reference, necessitating an API
like this:
        bufid = alloc_buf(nr_pages)
                user: mmap anonymous page
                user: register page with kernel
                kernel: translate to phys and register with xen
                xen: returns bufid

        memref = alloc_mem(bufid, nr_bytes)
                user: ... some allocator code ...
                user: return bufid | memref

        struct.foo = memref

In Xen, copy_from_user(xenbuf, memref) would then decode memref to
figure out what buffer was being referred to. copy_from_user would then
need to understand the data structures used by userland to track the
memory references within the buffer.
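
A rough sketch of the encode/decode step, assuming the simplest possible
layout: the buffer ID in the top 8 bits of a 32-bit memref and a plain
byte offset in the low 24 bits. The names (memref_t, make_memref,
copy_from_memref, struct xc_buf) and the bit split are assumptions for
illustration only. A real userland allocator would likely use richer
tracking structures, which is exactly what the hypervisor side would then
have to understand.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUFID_SHIFT 24                          /* assumed split: 8 + 24 bits */
#define OFFSET_MASK ((1u << BUFID_SHIFT) - 1)

typedef uint32_t memref_t;

/* Pack a buffer ID and a byte offset into one reference (bufid | memref). */
static inline memref_t make_memref(uint32_t bufid, uint32_t offset)
{
    return (bufid << BUFID_SHIFT) | (offset & OFFSET_MASK);
}

static inline uint32_t memref_bufid(memref_t m)  { return m >> BUFID_SHIFT; }
static inline uint32_t memref_offset(memref_t m) { return m & OFFSET_MASK; }

/* Hypothetical hypervisor-side view of one registered buffer. */
struct xc_buf {
    uint8_t *base;   /* NULL if the slot is not registered */
    uint32_t len;    /* buffer length in bytes */
};

/*
 * Decode the memref, find the buffer it names, bounds-check, then copy.
 * Returns 0 on success, -1 on an unknown buffer or out-of-range access.
 */
int copy_from_memref(void *dst, const struct xc_buf *bufs, uint32_t nr_bufs,
                     memref_t m, uint32_t n)
{
    uint32_t id = memref_bufid(m);
    uint32_t off = memref_offset(m);

    if (id >= nr_bufs || bufs[id].base == NULL)
        return -1;
    if (off > bufs[id].len || n > bufs[id].len - off)
        return -1;
    memcpy(dst, bufs[id].base + off, n);
    return 0;
}
```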

Problem #2: Spanning pages is still really difficult. One possible
solution (different from above) would be to have the kernel reserve some
physically contiguous pages, and then export that area by having
userland mmap some device.

Problem #3: We need to know beforehand the maximum number of bytes
needed for the buffer.

Problem #4: The kernel must track the buffers that userland registered,
and unregister them when the process dies, since it may not have been
able to unregister them properly.
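
For illustration, the kernel-side tracking might look roughly like the
sketch below: a per-process list of registered buffer IDs, plus a release
hook that unregisters whatever is left when the process goes away. This
is a userspace simulation under stated assumptions. hv_unregister() is a
stand-in for the real unregister hcall and only counts calls, and all
names here are hypothetical.

```c
#include <stdlib.h>

struct tracked_buf {
    int bufid;
    struct tracked_buf *next;
};

/* One of these per process that has registered buffers. */
struct proc_bufs {
    struct tracked_buf *head;
};

/* Stand-in for the unregister hcall; it only counts invocations. */
static int hv_unregister_calls;
static void hv_unregister(int bufid)
{
    (void)bufid;
    hv_unregister_calls++;
}

/* Remember a buffer the process registered. Returns 0, or -1 on ENOMEM. */
int track_buf(struct proc_bufs *p, int bufid)
{
    struct tracked_buf *t = malloc(sizeof(*t));
    if (!t)
        return -1;
    t->bufid = bufid;
    t->next = p->head;
    p->head = t;
    return 0;
}

/* Normal path: the process unregisters one buffer itself. */
int untrack_buf(struct proc_bufs *p, int bufid)
{
    for (struct tracked_buf **pp = &p->head; *pp; pp = &(*pp)->next) {
        if ((*pp)->bufid == bufid) {
            struct tracked_buf *t = *pp;
            *pp = t->next;
            hv_unregister(bufid);
            free(t);
            return 0;
        }
    }
    return -1;
}

/* Process-death path: unregister everything still on the list. */
void release_all(struct proc_bufs *p)
{
    while (p->head) {
        struct tracked_buf *t = p->head;
        p->head = t->next;
        hv_unregister(t->bufid);
        free(t);
    }
}
```

In a real driver, release_all() would presumably hang off the device's
release handler, so that it runs even when the process is killed.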

This mail isn't comprehensive, but I think it gives some idea of the
complexity involved. So a solution like replacing pointers with embedded
structures is far more attractive.

Hollis Blanchard
IBM Linux Technology Center
