
Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file



On Mon, Jul 31, 2017 at 02:30:47PM -0700, Stefano Stabellini wrote:
> On Mon, 31 Jul 2017, Edgar E. Iglesias wrote:
> > On Fri, Jul 28, 2017 at 09:03:15PM +0800, Zhongze Liu wrote:
> > > ====================================================
> > > 1. Motivation and Description
> > 
> > Hi,
> > 
> > I think this looks quite useful. I have a few comments inline.
> 
> Hi Edgar, thanks for giving it a look!
> 
> 
> > > ====================================================
> > > Virtual machines use grant table hypercalls to set up a shared page for
> > > inter-VM communication. These hypercalls are used by all PV
> > > protocols today. However, very simple guests, such as baremetal
> > > applications, might not have the infrastructure to handle the grant table.
> > > This project is about setting up several shared memory areas for inter-VM
> > > communication directly from the VM config file, so that the guest kernel
> > > doesn't need grant table support (in the embedded space, this is not
> > > unusual) to be able to communicate with other guests.
> > > 
> > > ====================================================
> > > 2. Implementation Plan:
> > > ====================================================
> > > 
> > > ======================================
> > > 2.1 Introduce a new VM config option in xl:
> > > ======================================
> > > 
> > > 2.1.1 Design Goals
> > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > 
> > > The shared areas should be shareable among several (>=2) VMs, so every
> > > shared physical memory area is assigned to a set of VMs. Therefore, a
> > > “token” or “identifier” should be used here to uniquely identify a backing
> > > memory area. A string no longer than 128 bytes is used here to serve the
> > > purpose.
> > > 
> > > The backing area would be taken from one domain, which we will regard
> > > as the "master domain", and this domain should be created prior to any
> > > other "slave domain"s. Again, we have to use some kind of tag to tell who
> > > is the "master domain".
> > > 
> > > And the ability to specify the permissions and cacheability (and
> > > shareability for ARM guests) of the pages to be shared should also be
> > > given to the user.
> > > 
> > > 2.1.2 Syntax and Behavior
> > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > The following example illustrates the syntax of the proposed config entry
> > > (suppose that we're on x86):
> > > 
> > > In xl config file of vm1:
> > >   static_shm = [ 'id=ID1, begin=0x100000, end=0x200000, role=master,
> > >                   cache_policy=x86_normal, prot=rw',
> > > 
> > >                  'id=ID2, begin=0x300000, end=0x400000, role=master' ]
> > > 
> > > In xl config file of vm2:
> > >   static_shm = [ 'id=ID1, offset = 0, begin=0x500000, end=0x600000,
> > >                   role=slave, prot=rw' ]
> > > 
> > > In xl config file of vm3:
> > >   static_shm = [ 'id=ID2, offset = 0x10000, begin=0x690000,
> > >                   end=0x800000, role=slave, prot=rw' ]
> > > 
> > > where:
> > >   @id                   The identifier of the backing memory area.
> > >                         Can be any string that matches the regexp
> > >                         "[^ \t\n,]+" and is no longer than 128 characters.
> > >
> > >   @offset               Can only appear when @role = slave. The sharing
> > >                         will start from the beginning of the backing memory
> > >                         area plus this offset. If not set, it defaults to
> > >                         zero. Can be a decimal or a hexadecimal of the form
> > >                         "0x20000", and should be a multiple of the
> > >                         hypervisor page granularity (currently 4K on both
> > >                         ARM and x86).
> > >
> > >   @begin/end            The boundaries of the shared memory area. The format
> > >                         requirements are the same as for @offset.
> > 
> > I'm assuming this is all specified in GFN and also not MFN contiguous?
> > Would it be possible to allow the specification of MFN mappings
> > that are contiguous?
> 
> That could be done with the iomem= parameter?

The missing part from the iomem parameter is the attributes: cacheability,
shareability, etc.
But we could perhaps add that to iomem somehow.


> 
> 
> > This would be useful to map specific kinds of memory (e.g. On-Chip RAMs).
> > 
> > Other use-cases are when there are not only guests sharing
> > the pages but also devices. In some cases these devs may be locked in
> > with low-latency access to specific memory regions.
> > 
> > Perhaps something like the following?
> > addr=gfn@<mfn>
> > size=0x1000
> > 
> > with mfn being optional?
> 
> I can see that it might be useful, but we are trying to keep the scope
> small to be able to complete the project within the limited timeframe
> allowed by GSoC. We only have one month left! We risk not getting the
> feature completed.  Once this set of features is done and committed, we
> can expand on it.
> 
> For the sake of this document, we should make clear that addresses are in
> the gfn space and memory is allocated to the master domain.

OK, I understand. :-)
Yes, documenting it would be good.


> 
> 
> > >   @role                 Can only be 'master' or 'slave'; it defaults to
> > >                         'slave'.
> > >
> > >   @prot                 When @role = master, this means the largest set of
> > >                         stage-2 permission flags that can be granted to the
> > >                         slave domains.
> > >                         When @role = slave, this means the stage-2 permission
> > >                         flags of the shared memory area.
> > >                         Currently only 'rw' is supported. If not set, it
> > >                         defaults to 'rw'.
> > >
> > >   @cache_policy         The stage-2 cacheability/shareability attributes of
> > >                         the shared memory area. Currently, only two policies
> > >                         are supported:
> > >                           * ARM_normal: Only applicable to ARM guests. This
> > >                                         would mean Inner and Outer Write-Back
> > >                                         Cacheable, and Inner Shareable.
> > 
> > 
> > Is there a reason not to set this to Outer Shareable?
> > Again, mainly useful when these pages get shared with devs as well.
> > 
> > The guest can always lower it to Inner Shareable via S1 tables if needed.
> 
> I don't think we can support memory sharing with devices in this version
> of the document (see above about GSoC timelines). Normal memory is inner
> shareable in Xen today, it makes sense to default to that.

I thought we mapped RAM as Outer shareable to guests but you seem to be right.
I think we should be mapping all RAM as Outer Shareable and then let the
guest decide what is Inner and what is Outer via its S1 tables.
Right now it would be impossible to be Coherent with a DMA device outside
of the Inner domain...

Perhaps we should fix that and then ARM_normal would by itself become Outer.
If there's agreement I can test it and send a patch.

Best regards,
Edgar



>  
>  
> > >                           * x86_normal: Only applicable to x86 HVM guests.
> > >                                         This would mean Write-Back Cacheable.
> > >                         If not set, it defaults to the *_normal policy for
> > >                         the corresponding platform.
> > > 
> > > Note:
> > >   The sizes of the areas specified by @begin and @end in the slave
> > >   domain's config file should not exceed the corresponding sizes specified
> > >   in its master domain's config file. And @offset should always be within
> > >   the backing memory region. Overlapping backing memory areas are allowed,
> > >   but a slave can't map two different backing memory regions into an
> > >   overlapping memory space.
> > >   The "master" role in vm1 for both ID1 and ID2 indicates that vm1 should be
> > >   created prior to both vm2 and vm3, for they both rely on the pages backed
> > >   by vm1. If one tries to create vm2 or vm3 prior to vm1, she will get an
> > >   error.
> > > 
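
To make the field rules above concrete, here is a minimal sketch of how one
static_shm entry could be parsed and checked. It is only an illustration in
Python, not the xl/libxl code: the function name parse_static_shm and the
error messages are hypothetical, and treating @begin and @end as mandatory
is an assumption. The regexp, the 128-character limit, the defaults and the
4K granularity are taken from the description.

```python
import re

PAGE_SIZE = 0x1000                      # 4K granularity, as stated in the proposal
ID_RE = re.compile(r"[^ \t\n,]+")       # regexp given for @id


def parse_static_shm(entry):
    """Parse one 'key=value, key=value' static_shm string into a dict."""
    spec = {}
    for item in entry.split(","):
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % item.strip())
        spec[key.strip()] = value.strip()

    ident = spec.get("id", "")
    if len(ident) > 128 or not ID_RE.fullmatch(ident):
        raise ValueError("invalid id: %r" % ident)

    role = spec.setdefault("role", "slave")          # @role defaults to slave
    if role not in ("master", "slave"):
        raise ValueError("role must be 'master' or 'slave'")
    if "offset" in spec and role != "slave":
        raise ValueError("offset is only allowed when role=slave")
    if spec.setdefault("prot", "rw") != "rw":        # only 'rw' for now
        raise ValueError("only prot=rw is supported for now")

    for key in ("begin", "end", "offset"):
        if key not in spec:
            if key == "offset":
                spec[key] = 0                        # @offset defaults to zero
                continue
            raise ValueError("missing %s" % key)     # assumption: mandatory
        addr = int(spec[key], 0)                     # accepts decimal and 0x...
        if addr % PAGE_SIZE:
            raise ValueError("%s is not page aligned" % key)
        spec[key] = addr

    if spec["begin"] >= spec["end"]:
        raise ValueError("begin must be below end")
    return spec
```

Overlap checks between several entries of the same domain are left out of
this sketch.
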
> > > In the example above, a memory area ID1 will be shared between vm1 and vm2.
> > > This area will be taken from vm1 and added to vm2's stage-2 page table.
> > > The parameter "prot=rw" means that this memory area is offered with
> > > read-write permission. vm1 can access this area using 0x100000~0x200000,
> > > and vm2 using 0x500000~0x600000. The stage-2 cache policy of this backing
> > > memory area is x86_normal.
> > > 
> > > Likewise, a memory area ID2 will be shared between vm1 and vm3 with
> > > read-write permissions. vm1 is the master and vm3 the slave. Note the
> > > @offset = 0x10000 in vm3's config; the actual sharing relationship would be:
> > >    (vm1 : 0x310000~0x400000) <=====> (vm3 : 0x690000~0x800000)
> > > The stage-2 cache policy of this backing memory area is x86_normal.
> > > 
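
The address arithmetic behind these examples can be sketched as follows. This
is an illustration only: backing_range is a hypothetical helper, and it
assumes the reading that a slave window starts at the master's @begin plus
@offset and must fit entirely inside the master's area.

```python
def backing_range(master_begin, master_end, slave_begin, slave_end, offset=0):
    """Return the (begin, end) range in the master that backs a slave window."""
    size = slave_end - slave_begin
    start = master_begin + offset
    if offset < 0 or start >= master_end:
        raise ValueError("offset is outside the backing memory area")
    if start + size > master_end:
        raise ValueError("slave window does not fit in the backing memory area")
    return start, start + size


# ID1: vm2 maps 0x500000~0x600000 onto vm1's 0x100000~0x200000.
begin, end = backing_range(0x100000, 0x200000, 0x500000, 0x600000, offset=0)
print(hex(begin), hex(end))
```

Under this reading the ID2 example as written would be rejected: the vm3
window 0x690000~0x800000 is 0x170000 bytes, which is larger than the whole
ID2 area 0x300000~0x400000 (0x100000 bytes), so the ranges in that example
may need another look.
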
> > > ======================================
> > > 2.2 Store the mem-sharing information in xenstore
> > > ======================================
> > > Since we don't have persistent storage for xl to store the information
> > > about the shared memory areas, we have to find some way to keep it between
> > > xl launches, and xenstore is a good place to do this. The information for
> > > one shared area should include the ID, master's domid, address range,
> > > memory attributes, information about the slaves, etc.
> > > The current plan is to place the information under /local/shared_mem/ID.
> > > Taking the above config files as an example:
> > >
> > > If we instantiate vm1, vm2 and vm3, one after another, “xenstore ls -f”
> > > should output something like this:
> > >
> > > After VM1 was instantiated, the output of “xenstore ls -f”
> > > will be something like this:
> > > 
> > >     /local/shared_mem/ID1/master = domid_of_vm1
> > >     /local/shared_mem/ID1/begin = 0x100000
> > >     /local/shared_mem/ID1/end = 0x200000
> > >     /local/shared_mem/ID1/prot = "rw"
> > >     /local/shared_mem/ID1/cache_policy = "x86_normal"
> > >     /local/shared_mem/ID1/slaves = ""
> > > 
> > >     /local/shared_mem/ID2/master = domid_of_vm1
> > >     /local/shared_mem/ID2/begin = 0x300000
> > >     /local/shared_mem/ID2/end = 0x400000
> > >     /local/shared_mem/ID2/permissions = "rw"
> > >     /local/shared_mem/ID2/x86_cacheattr = "x86_normal"
> > >     /local/shared_mem/ID2/slaves = ""
> > > 
> > > After VM2 was instantiated, the following new lines will appear:
> > > 
> > >     /local/shared_mem/ID1/slaves/domid_of_vm2/begin = 0x500000
> > >     /local/shared_mem/ID1/slaves/domid_of_vm2/end = 0x600000
> > >     /local/shared_mem/ID1/slaves/domid_of_vm2/permissions = "rw"
> > > 
> > > After VM3 was instantiated, the following new lines will appear:
> > > 
> > >     /local/shared_mem/ID2/slaves/domid_of_vm3/gmfn_begin = 0x690000
> > >     /local/shared_mem/ID2/slaves/domid_of_vm3/gmfn_end = 0x800000
> > >     /local/shared_mem/ID2/slaves/domid_of_vm3/permissions = "rw"
> > > 
> > > 
> > > When we encounter a static_shm entry with id = IDx during "xl create":
> > > 
> > >   + If there's NO corresponding entry in xenstore:
> > >     + If @role=master, create the corresponding entries for IDx in xenstore.
> > >     + If @role=slave, report an error.
> > >
> > >   + If the corresponding entry exists in xenstore:
> > >     + If @role=master, report an error.
> > >     + If @role=slave, map the pages to the newly created domain, and add the
> > >       necessary information under /local/shared_mem/IDx/slaves.
> > > 
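
The decision table above can be sketched like this, with a plain Python dict
standing in for xenstore. Everything here is illustrative: register_static_shm
is a hypothetical name, the key names follow the ID1 listing above (the
listings are not fully consistent on key names), and the actual page mapping
is only marked by a comment.

```python
def register_static_shm(store, domid, spec):
    """Apply the 'xl create' rules for one static_shm entry.

    store is a dict standing in for xenstore; spec is a dict with the keys
    id, role, begin, end and prot.
    """
    base = "/local/shared_mem/%s" % spec["id"]
    exists = (base + "/master") in store

    if spec["role"] == "master":
        if exists:
            raise RuntimeError("%s already has a master" % spec["id"])
        store[base + "/master"] = str(domid)
        store[base + "/begin"] = hex(spec["begin"])
        store[base + "/end"] = hex(spec["end"])
        store[base + "/prot"] = spec["prot"]
        store[base + "/slaves"] = ""
    else:
        if not exists:
            raise RuntimeError("master of %s must be created first" % spec["id"])
        # The real implementation would map the pages into domid here.
        slave = "%s/slaves/%d" % (base, domid)
        store[slave + "/begin"] = hex(spec["begin"])
        store[slave + "/end"] = hex(spec["end"])
        store[slave + "/prot"] = spec["prot"]


store = {}
register_static_shm(store, 1, {"id": "ID1", "role": "master",
                               "begin": 0x100000, "end": 0x200000, "prot": "rw"})
register_static_shm(store, 2, {"id": "ID1", "role": "slave",
                               "begin": 0x500000, "end": 0x600000, "prot": "rw"})
for path in sorted(store):
    print(path, "=", store[path])
```
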
> > > ======================================
> > > 2.3 mapping the memory areas
> > > ======================================
> > > Handle the newly added config option in tools/{xl, libxl} and utilize
> > > tools/libxc to do the actual memory mapping. Specifically, we will use
> > > xc_domain_add_to_physmap_batch with XENMAPSPACE_gmfn_foreign to
> > > do the actual mapping.
> > > 
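
As a rough illustration of what such a batch call would need, the sketch
below turns one master/slave pair into two page-frame lists. It assumes,
without having checked it against the final implementation, that for
XENMAPSPACE_gmfn_foreign the idxs array names frames in the foreign (master)
domain and the gpfns array names the frames in the domain being populated.
physmap_batch_args is a hypothetical helper and does not call libxc.

```python
PAGE_SHIFT = 12  # 4K pages, as stated in the proposal


def physmap_batch_args(master_begin, offset, slave_begin, slave_end):
    """Build the frame-number lists for one slave mapping.

    Returns (idxs, gpfns): frames in the master's GFN space and the
    matching frames in the slave's GFN space, one pair per page.
    """
    nr_pages = (slave_end - slave_begin) >> PAGE_SHIFT
    first_idx = (master_begin + offset) >> PAGE_SHIFT
    first_gpfn = slave_begin >> PAGE_SHIFT
    idxs = [first_idx + i for i in range(nr_pages)]
    gpfns = [first_gpfn + i for i in range(nr_pages)]
    return idxs, gpfns


# ID1: vm1 0x100000~0x200000 shared into vm2 at 0x500000~0x600000.
idxs, gpfns = physmap_batch_args(0x100000, 0, 0x500000, 0x600000)
print(len(idxs), hex(idxs[0]), hex(gpfns[0]))
```
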
> > > ======================================
> > > 2.4 error handling
> > > ======================================
> > > Add code to handle various errors: invalid address, invalid permissions,
> > > wrong order of VM creation, wrong length of memory area, etc.
> > > 
> > > ====================================================
> > > 3. Expected Outcomes/Goals:
> > > ====================================================
> > > A new VM config option in xl will be introduced, allowing users to set up
> > > several shared memory areas for inter-VM communication.
> > > This should work on both x86 and ARM.
> > > 
> > > ====================================================
> > > 4. Future Directions:
> > > ====================================================
> > > Implement the missing @prot flags and @cache_policy options.
> > > 
> > > Set up a notification channel between domains that are communicating through
> > > shared memory regions. This allows one VM to signal her friends when data is
> > > available in the shared memory or when the data in the shared memory is
> > > consumed. The channel could be built upon PPI or SGI.
> > > 
> > > 
> > > [See also:
> > > https://wiki.xenproject.org/wiki/Outreach_Program_Projects#Share_a_page_in_memory_from_the_VM_config_file]
> > > 
> > > Cheers,
> > > 
> > > Zhongze Liu
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxx
> > https://lists.xen.org/xen-devel
> > 

