Re: [Xen-devel] V4V

On Tue, 29 May 2012, Daniel De Graaf wrote:
> On 05/24/2012 01:23 PM, Jean Guyader wrote:
> > As I'm going through the code to clean up XenClient's inter-VM communication
> > (V4V), I thought it would be a good idea to start a thread to talk about the
> > fundamental differences between V4V and libvchan. I believe the two systems
> > are not clones of each other and they serve different purposes.
> > 
> > 
> > Disclaimer: I'm not an expert in libvchan; most of the assertions I'm making
> > about libvchan come from my reading of the code. If some of the facts
> > are wrong, it's only due to my ignorance of the subject.
> > 
> I'll try to fill in some of these points with my understanding of libvchan;
> I have correspondingly less knowledge of V4V, so I may be wrong in assumptions
> there.
> > 1. Why V4V?
> > 
> > About the time when we started XenClient (3 years ago) we were looking for a
> > lightweight inter VM communication scheme. We started working on a system
> > based on netchannel2 at the time called V2V (VM to VM). The system
> > was very similar to what libvchan is today, and we started to hit some
> > roadblocks:
> > 
> >     - The setup relied on a broker in dom0 to prepare the xenstore node
> >       permissions when a guest wanted to create a new connection. The code
> >       to do this setup was a single point of failure. If the
> >       broker was down you could not create any more connections.
> libvchan avoids this by allowing the application to determine the xenstore
> path and adjust permissions itself; the path /local/domain/N/data is
> suitable for a libvchan server in domain N to create the nodes in question.

Let's say that the frontend lives in domain A and that the backend lives
in domain N.
Usually the frontend has a node:


that points to the backend, in this case:


The backend is not allowed to write to the frontend path, so it cannot write
its own path in the backend node. Clearly the frontend doesn't know that
information, so it cannot fill it in. So the toolstack (typically in
dom0) helps with the initial setup by writing the location of the backend
under the frontend path.
How does libvchan solve this issue?
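
To make the problem concrete, here is a toy model of the permission issue
described above. It is only a sketch: ToyStore, its "owner or dom0 may write"
rule, the domain ids and the /toy/... paths are all made up for illustration
and are not the real xenstore API, the real permission model, or the real
frontend/backend paths.

```python
# Toy model of per-node write permissions. NOT the real xenstore API.
DOM0 = 0  # assumption: the privileged toolstack domain


class ToyStore:
    def __init__(self):
        self.nodes = {}  # path -> value
        self.owner = {}  # path -> id of the owning domain

    def create(self, path, owner):
        self.nodes[path] = ""
        self.owner[path] = owner

    def write(self, domid, path, value):
        # Assumed rule: only the node owner or dom0 may write.
        if domid != DOM0 and self.owner[path] != domid:
            raise PermissionError(f"domain {domid} may not write {path}")
        self.nodes[path] = value


def demo():
    A, N = 5, 7  # hypothetical ids for the frontend and backend domains
    store = ToyStore()
    fe_node = "/toy/frontend-A/backend"  # made-up path owned by A
    be_node = "/toy/backend-N/for-A"     # made-up path owned by N
    store.create(fe_node, owner=A)
    store.create(be_node, owner=N)

    # The backend cannot publish its own location under the frontend path.
    try:
        store.write(N, fe_node, be_node)
        backend_could_write = True
    except PermissionError:
        backend_could_write = False

    # The toolstack in dom0 does the initial setup instead.
    store.write(DOM0, fe_node, be_node)
    return backend_could_write, store.nodes[fe_node]
```

In this sketch the write from N fails and the pointer only appears once dom0
writes it, which is the dependency on the toolstack I am asking about.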

> >     - Symmetric communications were a nightmare. Take the case where A is a
> >       backend for B and B is a backend for A. If one of the domains crashed, the
> >       other one couldn't be destroyed because it had some pages mapped from
> >       the dead domain. This specific issue is probably fixed today.
> This is mostly taken care of by improvements in the hypervisor's handling of
> grant mappings. If one domain holds grant mappings open, the domain whose
> grants are held can't be fully destroyed, but if both domains are being
> destroyed then cycles of grant mappings won't stop them from going away.

However under normal circumstances the domain holding the mappings (which
I guess would be the domain running the backend, correct?) would
recognize that the other domain is gone and therefore unmap the grants
and close the connection, right?
I hope that if the frontend crashes and dies, it doesn't necessarily
become a zombie because the backend holds some mappings.
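
The behaviour discussed above can be sketched with a small toy model. This is
a deliberate simplification under my own assumptions, not the hypervisor's
actual logic: ToyHypervisor and its methods are hypothetical names, and the
rule "a dying domain can be reaped once no live domain maps its pages" is
just my reading of the description quoted above.

```python
# Toy model of grant mappings and domain teardown. NOT real hypervisor code.
class ToyHypervisor:
    def __init__(self):
        self.dying = set()  # domains that are being destroyed
        self.maps = set()   # (mapper, granter) pairs

    def map_grant(self, mapper, granter):
        self.maps.add((mapper, granter))

    def unmap_grants(self, mapper, granter):
        self.maps.discard((mapper, granter))

    def kill(self, dom):
        self.dying.add(dom)

    def can_reap(self, dom):
        # Assumed rule: a dying domain goes away fully only when every
        # domain still mapping its pages is itself dying.
        holders = [m for (m, g) in self.maps if g == dom]
        return dom in self.dying and all(m in self.dying for m in holders)


def frontend_crash():
    hv = ToyHypervisor()
    hv.map_grant("backend", "frontend")  # backend maps the frontend's pages
    hv.kill("frontend")
    zombie = not hv.can_reap("frontend")     # mapping still held
    hv.unmap_grants("backend", "frontend")   # backend notices and unmaps
    return zombie, hv.can_reap("frontend")


def both_destroyed():
    hv = ToyHypervisor()
    hv.map_grant("A", "B")  # symmetric case: each maps the other's pages
    hv.map_grant("B", "A")
    hv.kill("A")
    hv.kill("B")
    return hv.can_reap("A"), hv.can_reap("B")
```

In this sketch the crashed frontend lingers only until the backend unmaps,
and a cycle of mappings does not block teardown when both domains are dying.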

> >     - The PV connect/disconnect state-machine is poorly implemented.
> >       There's no trivial mechanism to synchronize disconnecting/reconnecting
> >       and dom0 must also allow the two domains to see parts of xenstore
> >       belonging to the other domain in the process.
> No interaction from dom0 is required to allow two domUs to communicate using
> xenstore (assuming the standard xenstored; more restrictive xenstored
> daemons may add such restrictions, intended to be used in conjunction with XSM
> policies preventing direct communication via event channels/grants). The
> connection state machine is weak; an external mechanism (perhaps the standard
> xenbus "state" entry) could be used to coordinate this better in the user of
> libvchan.

I am curious to know what the "connection state machine" is in libvchan.

> >     - Using the grant-ref model and having to map grant pages on each
> >       transfer causes updates to V->P memory mappings and thus leads to
> >       TLB misses and flushes (TLB flushes being expensive operations).
> This mapping only happens once at the open of the channel, so this cost becomes
> unimportant for a long-running channel. The cost is far higher for HVM domains
> that use PCI devices since the grant mapping causes an IOMMU flush.

So I take it that you are not passing grant refs through the connection,
unlike blkfront and blkback.

> [followup from Stefano's replies]
> I would not expect much difference even on a NUMA system, assuming each domU
> is fully contained within a NUMA node: one of the two copies made by libvchan
> will be confined to a single node, while the other copy will be cross-node.
> With domUs not properly confined to nodes, the hypervisor might be able to do
> better in cases where libvchan would have required two inter-node copies.

Right, I didn't realize that libvchan uses copies rather than grant refs
to transfer the actual data.
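
For the archives, a minimal sketch of what I understand by "copies rather than
grant refs": the shared buffer is set up once, and every transfer is one copy
into it by the writer and one copy out of it by the reader. ToyRing, its field
names and the producer/consumer counters are my own illustrative assumptions,
not libvchan's actual data layout or API.

```python
# Toy single-producer/single-consumer byte ring. NOT libvchan's real layout.
class ToyRing:
    def __init__(self, size):
        # The bytearray stands in for shared pages mapped once at open time.
        self.buf = bytearray(size)
        self.size = size
        self.prod = 0       # total bytes ever written
        self.cons = 0       # total bytes ever read
        self.map_count = 1  # the mapping happens once, never per transfer

    def write(self, data):
        # First copy: from the writer's buffer into the shared ring.
        free = self.size - (self.prod - self.cons)
        n = min(free, len(data))
        for i in range(n):
            self.buf[(self.prod + i) % self.size] = data[i]
        self.prod += n
        return n  # may be short if the ring is nearly full

    def read(self, maxlen):
        # Second copy: from the shared ring into the reader's buffer.
        avail = self.prod - self.cons
        n = min(avail, maxlen)
        out = bytes(self.buf[(self.cons + i) % self.size] for i in range(n))
        self.cons += n
        return out
```

With an 8-byte ring, writing b"hello", reading 3 bytes and then writing
b"world!" wraps around the end of the buffer, and no step touches the mapping
again. That matches the point above that the mapping cost is paid once per
channel while the copy cost is paid per transfer.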
