
Re: [Xen-devel] V4V

On 05/31/2012 01:20 PM, Stefano Stabellini wrote:
> On Wed, 30 May 2012, Daniel De Graaf wrote:
>> On 05/30/2012 07:41 AM, Stefano Stabellini wrote:
>>> On Tue, 29 May 2012, Daniel De Graaf wrote:
>>>> On 05/24/2012 01:23 PM, Jean Guyader wrote:
>>>>> As I'm going through the code to clean up XenClient's inter-VM
>>>>> communication system (V4V), I thought it would be a good idea to
>>>>> start a thread about the fundamental differences between V4V and
>>>>> libvchan. I believe the two systems are not clones of each other and
>>>>> that they serve different purposes.
>>>>> Disclaimer: I'm not an expert in libvchan; most of the assertions I
>>>>> make about libvchan come from my reading of the code. If some of the
>>>>> facts are wrong, that is only due to my ignorance of the subject.
>>>> I'll try to fill in some of these points with my understanding of
>>>> libvchan; I have correspondingly less knowledge of V4V, so I may be
>>>> wrong in my assumptions there.
>>>>> 1. Why V4V?
>>>>> Around the time we started XenClient (three years ago) we were
>>>>> looking for a lightweight inter-VM communication scheme. We started
>>>>> working on a system based on netchannel2, at the time called V2V
>>>>> (VM to VM). The system was very similar to what libvchan is today,
>>>>> and we started to hit some roadblocks:
>>>>>     - The setup relied on a broker in dom0 to prepare the xenstore
>>>>>       node permissions when a guest wanted to create a new
>>>>>       connection. The code to do this setup was a single point of
>>>>>       failure: if the broker was down, you couldn't create any more
>>>>>       connections.
>>>> libvchan avoids this by allowing the application to determine the
>>>> xenstore path and adjust permissions itself; the path
>>>> /local/domain/N/data is suitable for a libvchan server in domain N to
>>>> create the nodes in question.
>>> Let's say that the frontend lives in domain A and the backend lives in
>>> domain N.
>>> Usually the frontend has a node:
>>> /local/domain/A/device/<devicename>/<number>/backend
>>> that points to the backend, in this case:
>>> /local/domain/N/backend/<devicename>/A/<number>
>>> The backend is not allowed to write to the frontend path, so it cannot
>>> write its own path into the backend node. Clearly the frontend doesn't
>>> know that information either, so it cannot fill it in. So the
>>> toolstack (typically in dom0) helps with the initial setup by writing
>>> under the frontend path where the backend is.
>>> How does libvchan solve this issue?
>>> How does libvchan solve this issue?
>> Libvchan requires both endpoints to know the domain ID of the peer they
>> are communicating with - this could be communicated during domain build
>> or through a name service. The application then defines a path such as
>> "/local/domain/$server_domid/data/example-app/$client_domid" which is
>> writable by the server; the server creates nodes here that are readable
>> by the client.
> Is it completely up to the application to choose a xenstore path and
> give write permissions to the other end?
> It looks like something that could be generalized and moved to a library.
> How do you currently tell the server the domid of the client?

This depends on the client. One method would be to watch @introduceDomain in
Xenstore and set up a vchan for each new domain (this assumes that your server
wants to talk to every new domain). You could also use existing communications
channels (network or vchan from dom0) to inform a server of clients, and also to
inform the client of the server's domid.

The nodes used by libvchan could be placed under the normal frontend/backend
device paths, but the current xenstore permissions require that this be done by
dom0. In this case, the usual xenbus conventions can be used; a library to
manage this state could be useful.

Xenstore permissions are handled in libvchan; all it needs is a writable path
under which to create nodes. The original libvchan used a hard-coded path
similar to my example, but it was decided that allowing the application to
define the path would be more flexible.
>>>>>     - Symmetric communications were a nightmare. Take the case where
>>>>>       A is a backend for B and B is a backend for A. If one of the
>>>>>       domains crashed, the other one couldn't be destroyed because
>>>>>       it had some pages mapped from the dead domain. This specific
>>>>>       issue is probably fixed today.
>>>> This is mostly taken care of by improvements in the hypervisor's
>>>> handling of grant mappings. If one domain holds grant mappings open,
>>>> the domain whose grants are held can't be fully destroyed, but if
>>>> both domains are being destroyed then cycles of grant mappings won't
>>>> stop them from going away.
>>> However, under normal circumstances the domain holding the mappings
>>> (which I guess would be the domain running the backend, correct?)
>>> would recognize that the other domain is gone and therefore unmap the
>>> grants and close the connection, right?
>>> I hope that if the frontend crashes and dies, it doesn't necessarily
>>> become a zombie because the backend holds some mappings.
>> The mapping between frontend/backend and vchan client/server may be
>> backwards: the server must be initialized first and provides the pages
>> for the client to map. It looks like you are considering the frontend
>> to be the server.
>> The vchan client domain maps grants provided by the server. If the
>> server's domain crashes, it may become a zombie until the client
>> application notices the crash. This will happen if the client uses the
>> vchan and gets an error when sending an event notification (in this
>> case, a well-behaved client will close the vchan). If the client does
>> not often send data on the vchan, it can use a watch on the server's
>> xenstore node and close the vchan when the node is deleted.
>> A client that does not notice the server's destruction will leave a
>> zombie domain. A system administrator can resolve this by killing the
>> client process.
> This looks like a serious issue. Considering that libvchan already does
> copies to transfer the data, couldn't you switch to grant table copy
> operations? That would remove the zombie domain problem I think.

The grant table copy operations would work for the actual data, but would be
inefficient for updating the shared page (ring indexes and notification bits),
which needs to be checked and updated before and after each copy, requiring
three or four copy operations per library call. The layout of the shared page
would need to be rearranged to make all the fields updated by one domain
adjacent, and to replace the notification bits with a different mechanism.

The Linux gntdev driver does not currently support copy operations; this would
need to be added. You would also lose the ability for the server to detect,
via the unmap notify byte, when the client application exits (as opposed to
the entire client domain crashing) - however, this functionality may not be
very important.

An alternate solution (certainly not for 4.2) would be to fix the zombie
domain problem altogether, since it is not limited to vchan - any
frontend/backend pair that does not respond to domain destruction events can
cause zombie domains. The mapped pages could be reassigned to the domain
mapping them until all the mappings are removed and the pages are released
back to Xen's heap.

Xen-devel mailing list


