
Re: [Xen-devel] Re: Interdomain comms

On Fri, 2005-05-06 at 19:19 -0500, Eric Van Hensbergen wrote:

> This is exactly the sort of thing that the Plan 9 networking model was
> designed to do.
> The idea being that the Plan 9 model provides a nice abstract layer
> which to communicate with AND to organize (the organization is an
> important feature)

Yes, the organisation is important, I expect we can learn a lot from 9P
here.  I'm not trying to address the organisation aspect yet though (my
API proposal is lower level) so please let's postpone that aspect of the
discussion.

> Looking over your earlier proposal it seems like an awful lot of
> complexity to accomplish a relatively simple task.  Perhaps the
> refined API will demonstrate that simplicity better? I'd be really
> interested in the security details of your model as well when you
> finish the more detailed proposal.

I'd need help from security experts but, as an initial stab, if the
idc_address and remote_buffer_references are capabilities then I think
the security falls out in the wash since it's impossible to access
something unless you have been granted permission.
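To make the "capabilities" idea concrete, here is a minimal sketch of how a remote_buffer_reference could act as an unforgeable capability: access is only possible via a token handed out at grant time, so there is nothing to check beyond possession of the token. All of the names and structures below are illustrative assumptions, not part of any real Xen or IDC API.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical grant table: each entry pairs an unforgeable token
 * with the buffer it grants access to. */
#define MAX_GRANTS 16

typedef struct {
    uint64_t token;   /* random, unguessable token given to the grantee */
    void    *base;    /* buffer the grant refers to */
    size_t   len;
    int      in_use;
} grant_entry;

static grant_entry grant_table[MAX_GRANTS];

/* Grant access to a buffer; returns the capability token, or 0 if full.
 * (A real implementation would draw fresh_token from a CSPRNG.) */
uint64_t grant_buffer(void *base, size_t len, uint64_t fresh_token)
{
    for (int i = 0; i < MAX_GRANTS; i++) {
        if (!grant_table[i].in_use) {
            grant_table[i] = (grant_entry){ fresh_token, base, len, 1 };
            return fresh_token;
        }
    }
    return 0;
}

/* Resolve a capability: only a holder of the exact token gets the buffer,
 * so "no grant, no access" falls out of the lookup itself. */
void *resolve_capability(uint64_t token, size_t *len_out)
{
    for (int i = 0; i < MAX_GRANTS; i++) {
        if (grant_table[i].in_use && grant_table[i].token == token) {
            *len_out = grant_table[i].len;
            return grant_table[i].base;
        }
    }
    return NULL;
}
```

The point is that no access-control list or policy check is needed anywhere else: a domain either holds the token or it does not.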

> I'd suggest we step through a complete example with actual
> applications and actual devices that would demonstrate the problem
> that we are trying to solve.  Perhaps Ron and I can pull together an
> alternate proposal based on such a concrete example.

OK, here goes:

A significant difference between Plan 9 and Xen which is relevant to
this discussion is that Plan 9 is designed to construct a single shared
environment from multiple physical machines whereas Xen is designed to
partition a single physical machine into multiple isolated environments.
Arguably, Xen clusters might also partition multiple physical machines
into multiple isolated environments with some weird and wonderful
cross-machine sharing and replication going on.

The significance of this difference is that in the Xen environment,
there are many interesting opportunities for optimisations across the
virtual machines running on the same physical machine. These
optimisations are not relevant to a native Plan 9 system and so (AFAICT
with 20 mins experience :-) ) there is no provision for them in 9P.

Take, for example, the case where 1000 virtual machines are booting on
the same physical machine from copy-on-write file-systems which share a
common ancestry.

With my proposal, when the virtual machines read a file into memory, the
operation might proceed as follows:

The FE registers a local buffer to receive the file contents and is
given a remote buffer reference.

The FE issues a transaction to the BE to read the file, passing the
remote buffer reference.

The BE looks up the file using its metadata and happens to find the
file contents in its buffer cache.

The BE makes an idc_send_to_remote_buffer call passing a local buffer
reference for the file data in its local buffer cache and the remote
buffer reference provided by the FE.

The IDC implementation resolves the remote buffer reference to a local
buffer reference (since the FE buffer happens to reside on the same
physical machine in this example) and makes a call to
local_buffer_reference_copy to copy the file data from its buffer cache
to the FE buffer.

The local_buffer_reference_copy implementation determines the type of
copy based on the types of the source and destination local buffer
references which, for the sake of argument, both happen to be of a type
backed by reference counted pages from compatible page pools.

The implementation of local_buffer_reference_copy for that specific
combination of buffer types maps the BE pages into the FE address space,
incrementing their reference counts, and also unmaps the old FE pages,
decrementing their reference counts and returning them to the free pool
if the counts drop to zero.
The other 999 virtual machines boot and do the same thing (since the
file in question happens to be one which has not diverged) leaving all
1000 virtual machines sharing the same physical memory.
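The page-remapping "copy" above can be sketched with reference-counted pages, showing why the 1000th boot consumes no extra memory for the shared file. The page_pool and local_buffer_reference structures are invented for illustration under the assumption stated in the example (both buffers backed by compatible, reference-counted page pools).

```c
#include <stddef.h>

/* Hypothetical sketch: local_buffer_reference_copy for the case where
 * source and destination are backed by compatible reference-counted
 * page pools, so the "copy" is really a remap plus refcount updates. */
#define POOL_PAGES 8

typedef struct {
    int refcount[POOL_PAGES];   /* per-page reference counts */
    int free_pages;             /* pages with refcount == 0 */
} page_pool;

typedef struct {
    page_pool *pool;
    int        page;            /* index of the backing page */
} local_buffer_reference;

static void page_get(page_pool *p, int page)
{
    if (p->refcount[page]++ == 0)
        p->free_pages--;
}

static void page_put(page_pool *p, int page)
{
    if (--p->refcount[page] == 0)
        p->free_pages++;        /* returned to the free pool */
}

/* Share-by-mapping "copy": the destination drops its old page and maps
 * the source's page, exactly as in the FE/BE buffer-cache example. */
void local_buffer_reference_copy(local_buffer_reference *dst,
                                 const local_buffer_reference *src)
{
    page_get(src->pool, src->page);   /* FE now references the BE page */
    page_put(dst->pool, dst->page);   /* old FE page may be freed */
    dst->page = src->page;
}
```

After the call, both references point at the same physical page; repeating it from 999 more FE buffers just increments the same refcount, which is the sharing claimed above.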

Had the virtual machines been booting on different physical machines
then the path through the FE and BE client code would have been
identical (so we have met the network transparency goal) but the IDC
implementation would have taken an alternative path upon discovering
that the remote_buffer_reference was genuinely remote.
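A minimal sketch of that dispatch decision might look as follows. The shape of remote_buffer_reference and the machine-id test are assumptions for illustration; the point is only that the caller's code path is identical and the local-versus-remote branch lives entirely inside the IDC implementation.

```c
#include <string.h>

/* Hypothetical IDC dispatch sketch: the FE/BE client code is the same
 * either way; only the path behind idc_send_to_remote_buffer differs. */
typedef struct {
    int    machine_id;  /* which physical machine owns the buffer */
    void  *base;        /* valid only when the buffer is local */
    size_t len;
} remote_buffer_reference;

static int this_machine_id = 1;
static int network_sends;   /* counts trips down the genuinely-remote path */

int idc_send_to_remote_buffer(const void *src, size_t len,
                              const remote_buffer_reference *dst)
{
    if (len > dst->len)
        return -1;
    if (dst->machine_id == this_machine_id) {
        /* Same physical machine: degenerate to a local copy (or, with
         * compatible page pools, a page remap as in the example above). */
        memcpy(dst->base, src, len);
    } else {
        /* Genuinely remote: would marshal onto the wire instead. */
        network_sends++;
    }
    return 0;
}
```

Network transparency here means the FE never inspects machine_id itself; it just passes the reference through.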

Hopefully this explains better where I'm coming from.


Xen-devel mailing list


