This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Re: blktap: Sync with XCP, dropping zero-copy.

On Tue, 16 Nov 2010, Daniel Stodden wrote:
> Let's say we create an extension to tapdisk which speaks blkback's
> datapath in userland. We'd basically put one of those tapdisks on every
> storage node, independent of the image type, such as a bare LUN or a
> VHD. We add a couple additional IPC calls to make it directly
> connect/disconnect to/from (ring-ref,event-channel) pairs.
> Means it doesn't even need to talk xenstore, the control plane could all
> be left to some single daemon, which knows how to instruct the right
> tapdev (via libblktapctl) by looking at the physical-device node. I
> guess getting the control stuff out of the kernel is always a good idea.
> There are some important parts which would go missing. Such as
> ratelimiting gntdev accesses -- 200 thundering tapdisks each trying to
> gntmap 352 pages simultaneously isn't so good, so there still needs to
> be some bridge arbitrating them. I'd rather keep that in kernel space,
> okay to cram stuff like that into gntdev? It'd be much more
> straightforward than IPC.
> Also, I was absolutely certain I once saw VM_FOREIGN support in gntdev..
> Can't find it now, what happened? Without, there's presently still no
> zero-copy.
> Once the issues were solved, it'd be kinda nice. Simplifies stuff like
> memshr for blktap, which depends on getting hold of original grefs.
> We'd presumably still need the tapdev nodes, for qemu, etc. But those
> can stay non-xen aware then.

Considering that there is a blkback implementation in qemu already, why
don't we use it? I certainly don't feel the need for yet another blkback
implementation.
A lot of people are working on qemu nowadays, and this would let us
exploit some of that work and contribute to it ourselves.
We would only need to write a vhd block driver in qemu (a "vdi" driver
is already present, but I assume it is not actually compatible?) and
everything else is already there.
We could reuse their qcow and qcow2 drivers, which honestly are better
maintained than ours (we receive about one bug report per week about
qcow/qcow2 not working properly).
Finally, qemu needs to be able to do I/O anyway because of the IDE
emulation, so it has to be in the picture one way or another. One day
not far from now, when we make virtio work on Xen, even the fast PV
data path might go through qemu, so we might as well optimize it.
After talking to the xapi guys to better understand their requirements,
I am pretty sure that the new upstream qemu with QMP support would be
able to satisfy them without issues.
Of all the possible solutions, this is certainly the one that requires
the fewest lines of code and would allow us to reuse resources that
would otherwise remain untapped.

I backported the upstream xen_disk implementation to qemu-xen
and ran a test on the upstream 2.6.37-rc1 kernel as dom0: VMs boot fine
and performance seems promising.  For the moment I am thinking
about enabling the qemu blkback implementation as a fallback in case
blktap2 is not present in the system (i.e. 2.6.37 kernels).

Xen-devel mailing list
