Re: [Xen-devel] [PATCH v8] x86/mem-sharing: mem-sharing a range of memory
On Tue, Jul 26, 2016 at 4:22 PM, Tamas K Lengyel
<tamas.lengyel@xxxxxxxxxxxx> wrote:
> On Tue, Jul 26, 2016 at 3:12 AM, George Dunlap <george.dunlap@xxxxxxxxxx>
> wrote:
>> On Wed, Jul 20, 2016 at 7:01 PM, Tamas K Lengyel
>> <tamas.lengyel@xxxxxxxxxxxx> wrote:
>>> Currently mem-sharing can be performed on a page-by-page basis from the
>>> control domain. However, this process is quite wasteful when a range of
>>> pages has to be deduplicated.
>>>
>>> This patch introduces a new mem_sharing memop for range sharing where
>>> the user doesn't have to separately nominate each page in both the source
>>> and destination domains, and the looping over all pages happens in the
>>> hypervisor. This significantly reduces the overhead of sharing a range of
>>> memory.
>>>
>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@xxxxxxxxxxxx>
>>> Acked-by: Wei Liu <wei.liu2@xxxxxxxxxx>
>>> Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>> ---
>>> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
>>> Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
>>> Cc: Jan Beulich <jbeulich@xxxxxxxx>
>>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>>
>>> v8: style fixes and minor adjustments
>>> ---
>>> tools/libxc/include/xenctrl.h | 15 ++++
>>> tools/libxc/xc_memshr.c | 19 +++++
>>> tools/tests/mem-sharing/memshrtool.c | 22 ++++++
>>> xen/arch/x86/mm/mem_sharing.c | 140 +++++++++++++++++++++++++++++++++++
>>> xen/include/public/memory.h | 10 ++-
>>> 5 files changed, 205 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
>>> index e904bd5..3782eff 100644
>>> --- a/tools/libxc/include/xenctrl.h
>>> +++ b/tools/libxc/include/xenctrl.h
>>> @@ -2334,6 +2334,21 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>>> domid_t client_domain,
>>> unsigned long client_gfn);
>>>
>>> +/* Allows to deduplicate a range of memory of a client domain. Using
>>> + * this function is equivalent of calling xc_memshr_nominate_gfn for each gfn
>>> + * in the two domains followed by xc_memshr_share_gfns.
>>> + *
>>> + * May fail with -EINVAL if the source and client domain have different
>>> + * memory size or if memory sharing is not enabled on either of the domains.
>>> + * May also fail with -ENOMEM if there isn't enough memory available to store
>>> + * the sharing metadata before deduplication can happen.
>>> + */
>>> +int xc_memshr_range_share(xc_interface *xch,
>>> + domid_t source_domain,
>>> + domid_t client_domain,
>>> + uint64_t start,
>>> + uint64_t end);
>>> +
>>> /* Debug calls: return the number of pages referencing the shared frame backing
>>> * the input argument. Should be one or greater.
>>> *
>>> diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c
>>> index deb0aa4..2b871c7 100644
>>> --- a/tools/libxc/xc_memshr.c
>>> +++ b/tools/libxc/xc_memshr.c
>>> @@ -181,6 +181,25 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>>> return xc_memshr_memop(xch, source_domain, &mso);
>>> }
>>>
>>> +int xc_memshr_range_share(xc_interface *xch,
>>> + domid_t source_domain,
>>> + domid_t client_domain,
>>> + uint64_t start,
>>> + uint64_t end)
>>> +{
>>> + xen_mem_sharing_op_t mso;
>>> +
>>> + memset(&mso, 0, sizeof(mso));
>>> +
>>> + mso.op = XENMEM_sharing_op_range_share;
>>> +
>>> + mso.u.range.client_domain = client_domain;
>>> + mso.u.range.start = start;
>>> + mso.u.range.end = end;
>>> +
>>> + return xc_memshr_memop(xch, source_domain, &mso);
>>> +}
>>> +
>>> int xc_memshr_domain_resume(xc_interface *xch,
>>> domid_t domid)
>>> {
>>> diff --git a/tools/tests/mem-sharing/memshrtool.c b/tools/tests/mem-sharing/memshrtool.c
>>> index 437c7c9..2af6a9e 100644
>>> --- a/tools/tests/mem-sharing/memshrtool.c
>>> +++ b/tools/tests/mem-sharing/memshrtool.c
>>> @@ -24,6 +24,8 @@ static int usage(const char* prog)
>>> printf(" nominate <domid> <gfn> - Nominate a page for sharing.\n");
>>> printf(" share <domid> <gfn> <handle> <source> <source-gfn> <source-handle>\n");
>>> printf(" - Share two pages.\n");
>>> + printf(" range <source-domid> <destination-domid> <start-gfn> <end-gfn>\n");
>>> + printf(" - Share pages between domains in range.\n");
>>> printf(" unshare <domid> <gfn> - Unshare a page by grabbing a writable map.\n");
>>> printf(" add-to-physmap <domid> <gfn> <source> <source-gfn> <source-handle>\n");
>>> printf(" - Populate a page in a domain with a shared page.\n");
>>> @@ -180,6 +182,26 @@ int main(int argc, const char** argv)
>>> }
>>> printf("Audit returned %d errors.\n", rc);
>>> }
>>> + else if( !strcasecmp(cmd, "range") )
>>> + {
>>> + domid_t sdomid, cdomid;
>>> + int rc;
>>> + uint64_t start, end;
>>> +
>>> + if ( argc != 6 )
>>> + return usage(argv[0]);
>>> +
>>> + sdomid = strtol(argv[2], NULL, 0);
>>> + cdomid = strtol(argv[3], NULL, 0);
>>> + start = strtoul(argv[4], NULL, 0);
>>> + end = strtoul(argv[5], NULL, 0);
>>> +
>>> + rc = xc_memshr_range_share(xch, sdomid, cdomid, start, end);
>>> + if ( rc < 0 )
>>> + {
>>> + printf("error executing xc_memshr_range_share: %s\n", strerror(errno));
>>> + return rc;
>>> + }
>>> + }
>>> return 0;
>>> }
>>> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
>>> index a522423..329fbd9 100644
>>> --- a/xen/arch/x86/mm/mem_sharing.c
>>> +++ b/xen/arch/x86/mm/mem_sharing.c
>>> @@ -1294,6 +1294,58 @@ int relinquish_shared_pages(struct domain *d)
>>> return rc;
>>> }
>>>
>>> +static int range_share(struct domain *d, struct domain *cd,
>>> + struct mem_sharing_op_range *range)
>>> +{
>>> + int rc = 0;
>>> + shr_handle_t sh, ch;
>>> + unsigned long start = range->_scratchspace ?: range->start;
>>> +
>>> + while( range->end >= start )
>>> + {
>>> + /*
>>> + * We only break out if we run out of memory as individual pages may
>>> + * legitimately be unsharable and we just want to skip over those.
>>> + */
>>> + rc = mem_sharing_nominate_page(d, start, 0, &sh);
>>> + if ( rc == -ENOMEM )
>>> + break;
>>> +
>>> + if ( !rc )
>>> + {
>>> + rc = mem_sharing_nominate_page(cd, start, 0, &ch);
>>> + if ( rc == -ENOMEM )
>>> + break;
>>> +
>>> + if ( !rc )
>>> + {
>>> + /* If we get here this should be guaranteed to succeed. */
>>> + rc = mem_sharing_share_pages(d, start, sh,
>>> + cd, start, ch);
>>> + ASSERT(!rc);
>>> + }
>>> + }
>>> +
>>> + /* Check for continuation if it's not the last iteration. */
>>> + if ( range->end >= ++start && hypercall_preempt_check() )
>>> + {
>>> + rc = 1;
>>> + break;
>>> + }
>>> + }
>>> +
>>> + range->_scratchspace = start;
>>> +
>>> + /*
>>> + * The last page may fail with -EINVAL, and for range sharing we don't
>>> + * care about that.
>>> + */
>>> + if ( range->end < start && rc == -EINVAL )
>>> + rc = 0;
>>> +
>>> + return rc;
>>> +}
>>> +
>>> int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>> {
>>> int rc;
>>> @@ -1468,6 +1520,94 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>> }
>>> break;
>>>
>>> + case XENMEM_sharing_op_range_share:
>>> + {
>>> + unsigned long max_sgfn, max_cgfn;
>>> + struct domain *cd;
>>> +
>>> + rc = -EINVAL;
>>> + if ( mso.u.range._pad[0] || mso.u.range._pad[1] ||
>>> + mso.u.range._pad[2] )
>>> + goto out;
>>> +
>>> + /*
>>> + * We use _scratchspace for the hypercall continuation value.
>>> + * Ideally the user sets this to 0 in the beginning but
>>> + * there is no good way of enforcing that here, so we just check
>>> + * that it's at least in range.
>>> + */
>>> + if ( mso.u.range._scratchspace &&
>>> + (mso.u.range._scratchspace < mso.u.range.start ||
>>> + mso.u.range._scratchspace > mso.u.range.end) )
>>> + goto out;
>>> +
>>> + if ( !mem_sharing_enabled(d) )
>>> + goto out;
>>> +
>>> + rc = rcu_lock_live_remote_domain_by_id(mso.u.range.client_domain,
>>> + &cd);
>>> + if ( rc )
>>> + goto out;
>>> +
>>> + /*
>>> + * We reuse XENMEM_sharing_op_share XSM check here as this is
>>> + * essentially the same concept repeated over multiple pages.
>>> + */
>>> + rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd,
>>> + XENMEM_sharing_op_share);
>>> + if ( rc )
>>> + {
>>> + rcu_unlock_domain(cd);
>>> + goto out;
>>> + }
>>> +
>>> + if ( !mem_sharing_enabled(cd) )
>>> + {
>>> + rcu_unlock_domain(cd);
>>> + rc = -EINVAL;
>>> + goto out;
>>> + }
>>> +
>>> + /*
>>> + * Sanity check only, the client should keep the domains paused for
>>> + * the duration of this op.
>>> + */
>>> + if ( !atomic_read(&d->pause_count) ||
>>> + !atomic_read(&cd->pause_count) )
>>> + {
>>> + rcu_unlock_domain(cd);
>>> + rc = -EINVAL;
>>> + goto out;
>>> + }
>>> +
>>> + max_sgfn = domain_get_maximum_gpfn(d);
>>> + max_cgfn = domain_get_maximum_gpfn(cd);
>>> +
>>> + if ( max_sgfn < mso.u.range.start || max_sgfn < mso.u.range.end ||
>>> + max_cgfn < mso.u.range.start || max_cgfn < mso.u.range.end )
>>> + {
>>> + rcu_unlock_domain(cd);
>>> + rc = -EINVAL;
>>> + goto out;
>>> + }
>>> +
>>> + rc = range_share(d, cd, &mso.u.range);
>>> + rcu_unlock_domain(cd);
>>> +
>>> + if ( rc > 0 )
>>> + {
>>> + if ( __copy_to_guest(arg, &mso, 1) )
>>> + rc = -EFAULT;
>>> + else
>>> + rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
>>> + "lh", XENMEM_sharing_op, arg);
>>> + }
>>> + else
>>> + mso.u.range._scratchspace = 0;
>>> + }
>>> + break;
>>> +
>>> case XENMEM_sharing_op_debug_gfn:
>>> {
>>> unsigned long gfn = mso.u.debug.u.gfn;
>>> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
>>> index 29ec571..e0bc018 100644
>>> --- a/xen/include/public/memory.h
>>> +++ b/xen/include/public/memory.h
>>> @@ -465,6 +465,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
>>> #define XENMEM_sharing_op_debug_gref 5
>>> #define XENMEM_sharing_op_add_physmap 6
>>> #define XENMEM_sharing_op_audit 7
>>> +#define XENMEM_sharing_op_range_share 8
>>>
>>> #define XENMEM_SHARING_OP_S_HANDLE_INVALID (-10)
>>> #define XENMEM_SHARING_OP_C_HANDLE_INVALID (-9)
>>> @@ -500,7 +501,14 @@ struct xen_mem_sharing_op {
>>> uint64_aligned_t client_gfn; /* IN: the client gfn */
>>> uint64_aligned_t client_handle; /* IN: handle to the client page */
>>> domid_t client_domain; /* IN: the client domain id */
>>> - } share;
>>> + } share;
>>> + struct mem_sharing_op_range { /* OP_RANGE_SHARE */
>>> + uint64_aligned_t start; /* IN: start gfn. */
>>> + uint64_aligned_t end; /* IN: end gfn (inclusive) */
>>> + uint64_aligned_t _scratchspace; /* Must be set to 0 */
>>
>> Tamas,
>>
>> Why include this "scratchspace" that's not used in the interface,
>> rather than just doing what the memory operations in
>> xen/common/memory.c do, and store it in EAX by shifting it over by
>> MEMOP_EXTENT_SHIFT? I looked through the history and I see that v1
>> did something very much like that, but it was changed to using the
>> scratch space without any explanation.
>>
>> Having this in the public interface is ugly; and what's worse, it
>> exposes some of the internal mechanism to the guest.
>>
>> -George
>
> Well, that's exactly what I did in an earlier version of the patch but
> it was requested that I change it to something like this by Andrew
> (see https://lists.xen.org/archives/html/xen-devel/2015-10/msg00434.html).
> Then over the various iterations it ended up like looking like this.
Oh right -- sorry, I did look but somehow I missed that Andrew had requested it.
I would have read his comment to mean to put the _scratchspace
variable in the larger structure. But it has his R-b, so I'll
consider myself answered.
Thanks,
-George
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel