Re: [Xen-devel] [PATCH v8] x86/mem-sharing: mem-sharing a range of memory
On 26/07/2016 16:49, George Dunlap wrote:
> On Tue, Jul 26, 2016 at 4:22 PM, Tamas K Lengyel
> <tamas.lengyel@xxxxxxxxxxxx> wrote:
>> On Tue, Jul 26, 2016 at 3:12 AM, George Dunlap <george.dunlap@xxxxxxxxxx> wrote:
>>> On Wed, Jul 20, 2016 at 7:01 PM, Tamas K Lengyel
>>> <tamas.lengyel@xxxxxxxxxxxx> wrote:
>>>> Currently mem-sharing can be performed on a page-by-page basis from the
>>>> control domain. However, this process is quite wasteful when a range of
>>>> pages has to be deduplicated.
>>>>
>>>> This patch introduces a new mem_sharing memop for range sharing where
>>>> the user doesn't have to separately nominate each page in both the source
>>>> and destination domain, and the looping over all pages happens in the
>>>> hypervisor. This significantly reduces the overhead of sharing a range
>>>> of memory.
>>>>
>>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@xxxxxxxxxxxx>
>>>> Acked-by: Wei Liu <wei.liu2@xxxxxxxxxx>
>>>> Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>>> ---
>>>> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
>>>> Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
>>>> Cc: Jan Beulich <jbeulich@xxxxxxxx>
>>>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>>>
>>>> v8: style fixes and minor adjustments
>>>> ---
>>>>  tools/libxc/include/xenctrl.h        |  15 ++++
>>>>  tools/libxc/xc_memshr.c              |  19 +++++
>>>>  tools/tests/mem-sharing/memshrtool.c |  22 ++++++
>>>>  xen/arch/x86/mm/mem_sharing.c        | 140 +++++++++++++++++++++++++++++++++++
>>>>  xen/include/public/memory.h          |  10 ++-
>>>>  5 files changed, 205 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
>>>> index e904bd5..3782eff 100644
>>>> --- a/tools/libxc/include/xenctrl.h
>>>> +++ b/tools/libxc/include/xenctrl.h
>>>> @@ -2334,6 +2334,21 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>>>>                               domid_t client_domain,
>>>>                               unsigned long client_gfn);
>>>>
>>>> +/* Allows deduplicating a range of memory of a client domain. Using
>>>> + * this function is equivalent to calling xc_memshr_nominate_gfn for each
>>>> + * gfn in the two domains followed by xc_memshr_share_gfns.
>>>> + *
>>>> + * May fail with -EINVAL if the source and client domain have different
>>>> + * memory size or if memory sharing is not enabled on either of the domains.
>>>> + * May also fail with -ENOMEM if there isn't enough memory available to store
>>>> + * the sharing metadata before deduplication can happen.
>>>> + */
>>>> +int xc_memshr_range_share(xc_interface *xch,
>>>> +                          domid_t source_domain,
>>>> +                          domid_t client_domain,
>>>> +                          uint64_t start,
>>>> +                          uint64_t end);
>>>> +
>>>>  /* Debug calls: return the number of pages referencing the shared frame
>>>>   * backing the input argument. Should be one or greater.
>>>>   *
>>>> diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c
>>>> index deb0aa4..2b871c7 100644
>>>> --- a/tools/libxc/xc_memshr.c
>>>> +++ b/tools/libxc/xc_memshr.c
>>>> @@ -181,6 +181,25 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>>>>      return xc_memshr_memop(xch, source_domain, &mso);
>>>>  }
>>>>
>>>> +int xc_memshr_range_share(xc_interface *xch,
>>>> +                          domid_t source_domain,
>>>> +                          domid_t client_domain,
>>>> +                          uint64_t start,
>>>> +                          uint64_t end)
>>>> +{
>>>> +    xen_mem_sharing_op_t mso;
>>>> +
>>>> +    memset(&mso, 0, sizeof(mso));
>>>> +
>>>> +    mso.op = XENMEM_sharing_op_range_share;
>>>> +
>>>> +    mso.u.range.client_domain = client_domain;
>>>> +    mso.u.range.start = start;
>>>> +    mso.u.range.end = end;
>>>> +
>>>> +    return xc_memshr_memop(xch, source_domain, &mso);
>>>> +}
>>>> +
>>>>  int xc_memshr_domain_resume(xc_interface *xch,
>>>>                              domid_t domid)
>>>>  {
>>>> diff --git a/tools/tests/mem-sharing/memshrtool.c b/tools/tests/mem-sharing/memshrtool.c
>>>> index 437c7c9..2af6a9e 100644
>>>> --- a/tools/tests/mem-sharing/memshrtool.c
>>>> +++ b/tools/tests/mem-sharing/memshrtool.c
>>>> @@ -24,6 +24,8 @@ static int usage(const char* prog)
>>>>      printf("  nominate <domid> <gfn>  - Nominate a page for sharing.\n");
>>>>      printf("  share <domid> <gfn> <handle> <source> <source-gfn> <source-handle>\n");
>>>>      printf("                          - Share two pages.\n");
>>>> +    printf("  range <source-domid> <destination-domid> <start-gfn> <end-gfn>\n");
>>>> +    printf("                          - Share pages between domains in range.\n");
>>>>      printf("  unshare <domid> <gfn>   - Unshare a page by grabbing a writable map.\n");
>>>>      printf("  add-to-physmap <domid> <gfn> <source> <source-gfn> <source-handle>\n");
>>>>      printf("                          - Populate a page in a domain with a shared page.\n");
>>>> @@ -180,6 +182,26 @@ int main(int argc, const char** argv)
>>>>          }
>>>>          printf("Audit returned %d errors.\n", rc);
>>>>      }
>>>> +    else if( !strcasecmp(cmd, "range") )
>>>> +    {
>>>> +        domid_t sdomid, cdomid;
>>>> +        int rc;
>>>> +        uint64_t start, end;
>>>> +
>>>> +        if ( argc != 6 )
>>>> +            return usage(argv[0]);
>>>> +
>>>> +        sdomid = strtol(argv[2], NULL, 0);
>>>> +        cdomid = strtol(argv[3], NULL, 0);
>>>> +        start = strtoul(argv[4], NULL, 0);
>>>> +        end = strtoul(argv[5], NULL, 0);
>>>> +
>>>> +        rc = xc_memshr_range_share(xch, sdomid, cdomid, start, end);
>>>> +        if ( rc < 0 )
>>>> +        {
>>>> +            printf("error executing xc_memshr_range_share: %s\n", strerror(errno));
>>>> +            return rc;
>>>> +        }
>>>> +    }
>>>>      return 0;
>>>>  }
>>>> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
>>>> index a522423..329fbd9 100644
>>>> --- a/xen/arch/x86/mm/mem_sharing.c
>>>> +++ b/xen/arch/x86/mm/mem_sharing.c
>>>> @@ -1294,6 +1294,58 @@ int relinquish_shared_pages(struct domain *d)
>>>>      return rc;
>>>>  }
>>>>
>>>> +static int range_share(struct domain *d, struct domain *cd,
>>>> +                       struct mem_sharing_op_range *range)
>>>> +{
>>>> +    int rc = 0;
>>>> +    shr_handle_t sh, ch;
>>>> +    unsigned long start = range->_scratchspace ?: range->start;
>>>> +
>>>> +    while( range->end >= start )
>>>> +    {
>>>> +        /*
>>>> +         * We only break out if we run out of memory as individual pages may
>>>> +         * legitimately be unsharable and we just want to skip over those.
>>>> +         */
>>>> +        rc = mem_sharing_nominate_page(d, start, 0, &sh);
>>>> +        if ( rc == -ENOMEM )
>>>> +            break;
>>>> +
>>>> +        if ( !rc )
>>>> +        {
>>>> +            rc = mem_sharing_nominate_page(cd, start, 0, &ch);
>>>> +            if ( rc == -ENOMEM )
>>>> +                break;
>>>> +
>>>> +            if ( !rc )
>>>> +            {
>>>> +                /* If we get here this should be guaranteed to succeed. */
>>>> +                rc = mem_sharing_share_pages(d, start, sh,
>>>> +                                             cd, start, ch);
>>>> +                ASSERT(!rc);
>>>> +            }
>>>> +        }
>>>> +
>>>> +        /* Check for continuation if it's not the last iteration. */
>>>> +        if ( range->end >= ++start && hypercall_preempt_check() )
>>>> +        {
>>>> +            rc = 1;
>>>> +            break;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    range->_scratchspace = start;
>>>> +
>>>> +    /*
>>>> +     * The last page may fail with -EINVAL, and for range sharing we don't
>>>> +     * care about that.
>>>> +     */
>>>> +    if ( range->end < start && rc == -EINVAL )
>>>> +        rc = 0;
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>>  int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>>>  {
>>>>      int rc;
>>>> @@ -1468,6 +1520,94 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>>>          }
>>>>          break;
>>>>
>>>> +    case XENMEM_sharing_op_range_share:
>>>> +    {
>>>> +        unsigned long max_sgfn, max_cgfn;
>>>> +        struct domain *cd;
>>>> +
>>>> +        rc = -EINVAL;
>>>> +        if ( mso.u.range._pad[0] || mso.u.range._pad[1] ||
>>>> +             mso.u.range._pad[2] )
>>>> +            goto out;
>>>> +
>>>> +        /*
>>>> +         * We use _scratchspace for the hypercall continuation value.
>>>> +         * Ideally the user sets this to 0 in the beginning but
>>>> +         * there is no good way of enforcing that here, so we just check
>>>> +         * that it's at least in range.
>>>> + */ >>>> + if ( mso.u.range._scratchspace && >>>> + (mso.u.range._scratchspace < mso.u.range.start || >>>> + mso.u.range._scratchspace > mso.u.range.end) ) >>>> + goto out; >>>> + >>>> + if ( !mem_sharing_enabled(d) ) >>>> + goto out; >>>> + >>>> + rc = >>>> rcu_lock_live_remote_domain_by_id(mso.u.range.client_domain, >>>> + &cd); >>>> + if ( rc ) >>>> + goto out; >>>> + >>>> + /* >>>> + * We reuse XENMEM_sharing_op_share XSM check here as this is >>>> + * essentially the same concept repeated over multiple pages. >>>> + */ >>>> + rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, >>>> + XENMEM_sharing_op_share); >>>> + if ( rc ) >>>> + { >>>> + rcu_unlock_domain(cd); >>>> + goto out; >>>> + } >>>> + >>>> + if ( !mem_sharing_enabled(cd) ) >>>> + { >>>> + rcu_unlock_domain(cd); >>>> + rc = -EINVAL; >>>> + goto out; >>>> + } >>>> + >>>> + /* >>>> + * Sanity check only, the client should keep the domains >>>> paused for >>>> + * the duration of this op. >>>> + */ >>>> + if ( !atomic_read(&d->pause_count) || >>>> + !atomic_read(&cd->pause_count) ) >>>> + { >>>> + rcu_unlock_domain(cd); >>>> + rc = -EINVAL; >>>> + goto out; >>>> + } >>>> + >>>> + max_sgfn = domain_get_maximum_gpfn(d); >>>> + max_cgfn = domain_get_maximum_gpfn(cd); >>>> + >>>> + if ( max_sgfn < mso.u.range.start || max_sgfn < >>>> mso.u.range.end || >>>> + max_cgfn < mso.u.range.start || max_cgfn < >>>> mso.u.range.end ) >>>> + { >>>> + rcu_unlock_domain(cd); >>>> + rc = -EINVAL; >>>> + goto out; >>>> + } >>>> + >>>> + rc = range_share(d, cd, &mso.u.range); >>>> + rcu_unlock_domain(cd); >>>> + >>>> + if ( rc > 0 ) >>>> + { >>>> + if ( __copy_to_guest(arg, &mso, 1) ) >>>> + rc = -EFAULT; >>>> + else >>>> + rc = >>>> hypercall_create_continuation(__HYPERVISOR_memory_op, >>>> + "lh", >>>> XENMEM_sharing_op, >>>> + arg); >>>> + } >>>> + else >>>> + mso.u.range._scratchspace = 0; >>>> + } >>>> + break; >>>> + >>>> case XENMEM_sharing_op_debug_gfn: >>>> { >>>> unsigned long gfn = mso.u.debug.u.gfn; >>>> diff 
--git a/xen/include/public/memory.h b/xen/include/public/memory.h >>>> index 29ec571..e0bc018 100644 >>>> --- a/xen/include/public/memory.h >>>> +++ b/xen/include/public/memory.h >>>> @@ -465,6 +465,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t); >>>> #define XENMEM_sharing_op_debug_gref 5 >>>> #define XENMEM_sharing_op_add_physmap 6 >>>> #define XENMEM_sharing_op_audit 7 >>>> +#define XENMEM_sharing_op_range_share 8 >>>> >>>> #define XENMEM_SHARING_OP_S_HANDLE_INVALID (-10) >>>> #define XENMEM_SHARING_OP_C_HANDLE_INVALID (-9) >>>> @@ -500,7 +501,14 @@ struct xen_mem_sharing_op { >>>> uint64_aligned_t client_gfn; /* IN: the client gfn */ >>>> uint64_aligned_t client_handle; /* IN: handle to the client >>>> page */ >>>> domid_t client_domain; /* IN: the client domain id */ >>>> - } share; >>>> + } share; >>>> + struct mem_sharing_op_range { /* OP_RANGE_SHARE */ >>>> + uint64_aligned_t start; /* IN: start gfn. */ >>>> + uint64_aligned_t end; /* IN: end gfn (inclusive) */ >>>> + uint64_aligned_t _scratchspace; /* Must be set to 0 */ >>> Tamas, >>> >>> Why include this "scratchspace" that's not used in the interface, >>> rather than just doing what the memory operations in >>> xen/common/memory.c do, and store it in EAX by shifting it over by >>> MEMOP_EXTENT_SHIFT? I looked through the history and I see that v1 >>> did something very much like that, but it was changed to using the >>> scratch space without any explanation. >>> >>> Having this in the public interface is ugly; and what's worse, it >>> exposes some of the internal mechanism to the guest. >>> >>> -George >> Well, that's exactly what I did in an earlier version of the patch but >> it was requested that I change it to something like this by Andrew >> (see https://lists.xen.org/archives/html/xen-devel/2015-10/msg00434.html). >> Then over the various iterations it ended up like looking like this. > Oh right -- sorry, I did look but somehow I missed that Andrew had requested > it. 
>
> I would have read his comment to mean to put the _scratchspace
> variable in the larger structure. But it has his R-b, so I'll
> consider myself answered.

I am open to other suggestions wrt naming, but wasn't looking to
bikeshed the issue. _opaque is a common name used.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
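[Editor's note: the continuation scheme the thread debates — a cursor stored in the hypercall argument (`_scratchspace`) so a preempted range operation can resume where it left off — can be modelled outside Xen. The sketch below is a standalone C model under stated assumptions: `nominate_page()` is a stub, and `hypercall_preempt_check()` is simulated by a fixed per-invocation page budget; none of this is the real hypervisor code, only the resume logic from `range_share()`.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the patch's struct mem_sharing_op_range (uint64_aligned_t
 * approximated by uint64_t). */
struct mem_sharing_op_range {
    uint64_t start;          /* IN: first gfn of the range */
    uint64_t end;            /* IN: last gfn (inclusive) */
    uint64_t _scratchspace;  /* continuation cursor; caller sets to 0 */
};

/* Stub standing in for mem_sharing_nominate_page(): every page "shares". */
static int nominate_page(uint64_t gfn) { (void)gfn; return 0; }

/* Pages processed per entry before we pretend hypercall_preempt_check()
 * fired. */
#define BUDGET 4

/* Returns 1 when a continuation is needed, 0 when the range is done. */
static int range_share_model(struct mem_sharing_op_range *range,
                             unsigned *pages_shared)
{
    /* Resume from the cursor if set, else from the start of the range. */
    uint64_t start = range->_scratchspace ? range->_scratchspace
                                          : range->start;
    unsigned done = 0;

    while ( range->end >= start )
    {
        if ( nominate_page(start) == 0 )
            ++*pages_shared;

        /* Check for "preemption" unless this was the last iteration. */
        if ( range->end >= ++start && ++done == BUDGET )
        {
            range->_scratchspace = start;  /* record resume point */
            return 1;
        }
    }

    range->_scratchspace = start;
    return 0;
}

/* Drive the model the way the toolstack drives the real hypercall:
 * re-issue the op until no continuation is requested, with the cursor
 * preserved in the argument between invocations. */
static unsigned share_range(uint64_t start, uint64_t end)
{
    struct mem_sharing_op_range r;
    unsigned shared = 0;

    memset(&r, 0, sizeof(r));
    r.start = start;
    r.end = end;

    while ( range_share_model(&r, &shared) )
        ; /* continuation: re-enter with _scratchspace intact */

    return shared;
}
```

With a 10-page range and a budget of 4, the model takes three entries (4 + 4 + 2 pages) yet every page in [start, end] is visited exactly once, which is the property the in-argument cursor buys at the cost George objects to: the mechanism is visible in the public interface.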