

[Xen-devel] xencomm address space API

To: Keir Fraser <Keir.Fraser@xxxxxxxxxxxx>
Subject: [Xen-devel] xencomm address space API
From: Hollis Blanchard <hollisb@xxxxxxxxxx>
Date: Tue, 07 Feb 2006 14:45:34 +1100
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Organization: IBM Linux Technology Center
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
We were discussing buffer management for the xencomm address space that
we'd use to allocate memory to pass between userspace/kernel and
hypervisor. It's needed in the kernel for hcalls made from the kernel
that currently contain pointers (in particular the balloon driver).

x86 wants to know about buffer (not memory) allocation and destruction,
since that's when it needs to mlock/munlock. PPC wants to know about
buffer allocation so the buffer can be registered with the hypervisor.

Unfortunately, implementing a buffer registration API became rather
complicated.

Problem #1: concurrency. By decoupling buffer registration from the
actual hcall, we can introduce problems like these:

dom0 tool:              balloon driver:
register buffer
                        register buffer
                        unregister buffer
unregister buffer

This means that the hypervisor must track multiple registered buffers
per domain. (In the general case this could be an arbitrary number, but
I guess it would need to be limited to prevent a domain from exhausting
the Xen heap.)
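
To make the bookkeeping concrete, here is a minimal sketch of a bounded
per-domain table of registered buffers. It is not Xen code: the names
(MAX_BUFS_PER_DOMAIN, struct domain_bufs, buf_register, buf_unregister)
and the cap of 8 are assumptions made up for illustration, and locking is
left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap so one domain cannot exhaust the hypervisor heap. */
#define MAX_BUFS_PER_DOMAIN 8

struct reg_buf {
    int      in_use;
    uint64_t first_pfn;   /* first physical frame of the buffer */
    uint32_t nr_pages;
};

/* One table per domain; a real implementation would also need a lock. */
struct domain_bufs {
    struct reg_buf slot[MAX_BUFS_PER_DOMAIN];
};

/* Returns a buffer ID (the slot index), or -1 if the table is full. */
static int buf_register(struct domain_bufs *d, uint64_t first_pfn,
                        uint32_t nr_pages)
{
    assert(d != NULL);
    for (int i = 0; i < MAX_BUFS_PER_DOMAIN; i++) {
        if (!d->slot[i].in_use) {
            d->slot[i] = (struct reg_buf){ 1, first_pfn, nr_pages };
            return i;
        }
    }
    return -1;
}

/* Returns 0 on success, -1 if bufid does not name a registered buffer. */
static int buf_unregister(struct domain_bufs *d, int bufid)
{
    assert(d != NULL);
    if (bufid < 0 || bufid >= MAX_BUFS_PER_DOMAIN || !d->slot[bufid].in_use)
        return -1;
    d->slot[bufid].in_use = 0;
    return 0;
}
```

With a table like this, the interleaving shown above is harmless because
the tool and the balloon driver each hold a different buffer ID.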

That also means that each hcall must somehow indicate which buffer
should be used with its arguments. I think that could be done by
encoding the buffer ID into the memory reference, necessitating an API
like this:
        bufid = alloc_buf(nr_pages)
                user: mmap anonymous page
                user: register page with kernel
                kernel: translate to phys and register with xen
                xen: returns bufid

        memref = alloc_mem(bufid, nr_bytes)
                user: ... some allocator code ...
                user: return bufid | memref

        struct.foo = memref

In Xen, copy_from_user(xenbuf, memref) would then decode memref to
figure out what buffer was being referred to. copy_from_user would then
need to understand the data structures used by userland to track the
memory references within the buffer.
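
One way the "bufid | memref" encoding could look, as a sketch only. The
32-bit layout (8 bits of buffer ID above 24 bits of byte offset) and the
names memref_t, memref_encode, memref_bufid and memref_offset are
assumptions for illustration; nothing in this thread fixes those widths.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: high 8 bits = buffer ID, low 24 bits = offset. */
#define MEMREF_OFFSET_BITS 24u
#define MEMREF_OFFSET_MASK ((1u << MEMREF_OFFSET_BITS) - 1u)

typedef uint32_t memref_t;

/* Userland side: pack the buffer ID and the offset into one reference. */
static memref_t memref_encode(uint32_t bufid, uint32_t offset)
{
    assert(bufid <= 0xffu);
    assert(offset <= MEMREF_OFFSET_MASK);
    return (bufid << MEMREF_OFFSET_BITS) | offset;
}

/* Hypervisor side: recover which buffer the reference points into... */
static uint32_t memref_bufid(memref_t m)
{
    return m >> MEMREF_OFFSET_BITS;
}

/* ...and where inside that buffer the data lives. */
static uint32_t memref_offset(memref_t m)
{
    return m & MEMREF_OFFSET_MASK;
}
```

Decoding the reference is the easy part. The copy routine would still
have to bounds-check the offset against the registered buffer and cope
with whatever allocator metadata userland keeps there.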

Problem #2: Spanning pages is still really difficult. One possible
solution (different from above) would be to have the kernel reserve some
physically contiguous pages, and then export that area by having
userland mmap some device.

Problem #3: We need to know beforehand the maximum number of bytes
needed for the buffer.

Problem #4: The kernel must track the buffers that userland registered,
and unregister them when the process dies, since it may not have been
able to unregister them properly.
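
A rough sketch of the tracking this implies, written as plain C rather
than real kernel code. The list, the names (struct tracked_buf, track_buf,
release_bufs_for) and the callback are all hypothetical; an actual kernel
would hang this off the file or process state and take a lock.

```c
#include <stdlib.h>

/* One entry per buffer that a userland process registered. */
struct tracked_buf {
    int owner_pid;
    int bufid;
    struct tracked_buf *next;
};

static struct tracked_buf *tracked_head;

/* Remember that owner_pid registered bufid. Returns 0, or -1 on OOM. */
static int track_buf(int owner_pid, int bufid)
{
    struct tracked_buf *t = malloc(sizeof(*t));
    if (t == NULL)
        return -1;
    t->owner_pid = owner_pid;
    t->bufid = bufid;
    t->next = tracked_head;
    tracked_head = t;
    return 0;
}

/*
 * Called when a process exits: unregister (via the optional callback)
 * and forget every buffer it still owns. Returns the number released.
 */
static int release_bufs_for(int owner_pid, void (*unregister)(int bufid))
{
    int released = 0;
    struct tracked_buf **pp = &tracked_head;

    while (*pp != NULL) {
        struct tracked_buf *t = *pp;
        if (t->owner_pid == owner_pid) {
            *pp = t->next;          /* unlink before freeing */
            if (unregister != NULL)
                unregister(t->bufid);
            free(t);
            released++;
        } else {
            pp = &t->next;
        }
    }
    return released;
}
```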

This mail isn't comprehensive, but I think it gives some idea of the
complexity involved. So a solution like replacing pointers with embedded
structures is far more attractive.

Hollis Blanchard
IBM Linux Technology Center
