
Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.



On Wed, Jun 22, 2016 at 03:52:43PM +0100, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 02:52:47PM +0100, David Vrabel wrote:
> > On 22/06/16 14:29, Wei Liu wrote:
> > > On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
> > >> On 22/06/16 12:21, Wei Liu wrote:
> > >>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
> > >>>> On 22/06/16 09:38, Paulina Szubarczyk wrote:
> > >>>>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> > >>>>> system call is invoked. In mini-os the operation is yet not
> > >>>>> implemented. For other OSs there is a dummy implementation.
> > >>>> [...]
> > >>>>> --- a/tools/libs/gnttab/linux.c
> > >>>>> +++ b/tools/libs/gnttab/linux.c
> > >>>>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> > >>>>>      return 0;
> > >>>>>  }
> > >>>>>  
> > >>>>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > >>>>> +                            uint32_t count,
> > >>>>> +                            xengnttab_grant_copy_segment_t *segs)
> > >>>>> +{
> > >>>>> +    int i, rc;
> > >>>>> +    int fd = xgt->fd;
> > >>>>> +    struct ioctl_gntdev_grant_copy copy;
> > >>>>> +
> > >>>>> +    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
> > >>>>> +    copy.count = count;
> > >>>>> +    for (i = 0; i < count; i++)
> > >>>>> +    {
> > >>>>> +        copy.segments[i].flags = segs[i].flags;
> > >>>>> +        copy.segments[i].len = segs[i].len;
> > >>>>> +        if (segs[i].flags == GNTCOPY_dest_gref)
> > >>>>> +        {
> > >>>>> +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> > >>>>> +            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
> > >>>>> +            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
> > >>>>> +            copy.segments[i].source.virt = segs[i].source.virt;
> > >>>>> +        }
> > >>>>> +        else
> > >>>>> +        {
> > >>>>> +            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
> > >>>>> +            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
> > >>>>> +            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
> > >>>>> +            copy.segments[i].dest.virt = segs[i].dest.virt;
> > >>>>> +        }
> > >>>>> +    }
> > >>>>> +
> > >>>>> +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> > >>>>> +    if (rc) 
> > >>>>> +    {
> > >>>>> +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> > >>>>> +    }
> > >>>>> +    else 
> > >>>>> +    {
> > >>>>> +        for (i = 0; i < count; i++)
> > >>>>> +            segs[i].status = copy.segments[i].status;
> > >>>>> +    }
> > >>>>> +
> > >>>>> +    free(copy.segments);
> > >>>>> +    return rc;
> > >>>>> +}
> > >>>>
> > >>>> I know Wei asked for this but you've replaced what should be a single
> > >>>> pointer assignment with a memory allocation and two loops over all the
> > >>>> segments.
> > >>>>
> > >>>> This is a hot path and the two structures (the libxengnttab one and the
> > >>>> Linux kernel one) are both part of their respective ABIs and won't
> > >>>> change so Wei's concern that they might change in the future is 
> > >>>> unfounded.
> > >>>>
> > >>>
> > >>> The fundamental question is: will the ABI between the library and
> > >>> the kernel ever become mismatched?
> > >>>
> > >>> My answer is "maybe". My rationale is that anything that crosses a
> > >>> component boundary needs to be treated with caution, and I tend to
> > >>> assume the worst will happen.
> > >>>
> > >>> The only way to guarantee that they never become mismatched is to
> > >>> have
> > >>>
> > >>>    typedef struct ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
> > >>>
> > >>> But that's not how the code is written.
> > >>>
> > >>> I would like to hear a third opinion. Is my concern unfounded? Am I too
> > >>> cautious? Is there any compelling argument that I missed?
> > >>>
> > >>> Somewhat related: can we have some numbers, please? It could well
> > >>> be that the cost of the two loops is negligible compared with
> > >>> whatever is going on inside the kernel / hypervisor, in which case
> > >>> the numbers would render this issue moot.
> > >>
> > >> I did some (very) ad-hoc measurements and, with the worst case of a
> > >> single short segment per ioctl, the optimized version of
> > >> osdep_gnttab_grant_copy() looks to be ~5% faster.
> > >>
> > >> This is enough of a difference that we should use the optimized version.
> > >>
> > >> The unoptimized version also adds an additional failure path (the
> > >> calloc) which would be best avoided.
> > >>
> > > 
> > > Your test case includes a lot of noise from the libc allocator, so...
> > > 
> > > Can you give the following patch a try (applied on top of Paulina's
> > > patch)? The basic idea is to provide scratch space for the structures.
> > > Note that the patch is compile-tested only.
> > [...]
> > > +#define COPY_SEGMENT_CACHE_SIZE 1024
> > 
> > Arbitrary limit on number of segments.
> > 
> > > +    copy.segments = xgt->osdep_data;
> > 
> > Not thread safe.
> > 
> 
> Both issues are real, but this is just a gross hack to try to get some
> numbers.
> 
> > I tried using alloca(), which has a <1% performance penalty, but the
> > failure mode for alloca() is really bad, so I would not recommend it.
> > 
> 
> Agreed.
> 
> But if you want to use the stack, maybe a C99 variable-length array
> would do?
> 
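To illustrate the VLA variant (untested; the structure and field names
are the ones used in Paulina's patch):

    /* C99 VLA: the segments live on the stack for the duration of the
     * call, so there is no malloc / free, but a very large count could
     * blow the stack (essentially the same failure mode as alloca()). */
    struct ioctl_gntdev_grant_copy_segment ioctl_segs[count];
    struct ioctl_gntdev_grant_copy copy = {
        .count = count,
        .segments = ioctl_segs,
    };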

The numbers (stack-based: <1% overhead, heap-based: ~5% overhead) suggest
that the assignments themselves are fast; it is the malloc / free pair
that is slow.

And actually we can just combine a statically allocated, stack-based
array with a heap-based fallback. Say we have an X-element array on the
stack (picking the number used in the hypervisor's preemption check); if
count > X, we fall back to a heap-based array, with the hope that the
libc allocation / free overhead is then masked by the copying overhead
in the hypervisor.
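
Something along these lines, perhaps (untested sketch: COPY_SEGMENT_STACK_SIZE
is just a placeholder for whatever limit we end up picking, and the error
handling on allocation failure is only illustrative):

    #define COPY_SEGMENT_STACK_SIZE 16

    int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
                                uint32_t count,
                                xengnttab_grant_copy_segment_t *segs)
    {
        int i, rc;
        int fd = xgt->fd;
        struct ioctl_gntdev_grant_copy copy;
        struct ioctl_gntdev_grant_copy_segment stack_segs[COPY_SEGMENT_STACK_SIZE];
        struct ioctl_gntdev_grant_copy_segment *ioctl_segs = stack_segs;

        /* Only pay for malloc / free when the batch is large enough that
         * the copying overhead in the hypervisor should dominate the
         * allocator overhead. */
        if (count > COPY_SEGMENT_STACK_SIZE)
        {
            ioctl_segs = calloc(count, sizeof(*ioctl_segs));
            if (!ioctl_segs)
                return -1;  /* errno set by calloc() */
        }

        copy.segments = ioctl_segs;
        copy.count = count;

        /* Same per-segment conversion as in Paulina's patch. */
        for (i = 0; i < count; i++)
        {
            ioctl_segs[i].flags = segs[i].flags;
            ioctl_segs[i].len = segs[i].len;
            if (segs[i].flags == GNTCOPY_dest_gref)
            {
                ioctl_segs[i].dest.foreign.ref = segs[i].dest.foreign.ref;
                ioctl_segs[i].dest.foreign.domid = segs[i].dest.foreign.domid;
                ioctl_segs[i].dest.foreign.offset = segs[i].dest.foreign.offset;
                ioctl_segs[i].source.virt = segs[i].source.virt;
            }
            else
            {
                ioctl_segs[i].source.foreign.ref = segs[i].source.foreign.ref;
                ioctl_segs[i].source.foreign.domid = segs[i].source.foreign.domid;
                ioctl_segs[i].source.foreign.offset = segs[i].source.foreign.offset;
                ioctl_segs[i].dest.virt = segs[i].dest.virt;
            }
        }

        rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
        if (rc)
        {
            GTERROR(xgt->logger, "ioctl GRANT COPY failed %d", errno);
        }
        else
        {
            for (i = 0; i < count; i++)
                segs[i].status = ioctl_segs[i].status;
        }

        /* Free only if we actually fell back to the heap. */
        if (ioctl_segs != stack_segs)
            free(ioctl_segs);

        return rc;
    }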

That would achieve both safety and performance, and would render a lot
of the other discussion (the expectations of applications, the
interfaces on other platforms, etc.) moot. That looks like a good
solution to me.

David, what do you think?

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

