
Re: [Xen-devel] [PATCH 5/6] xen-gntalloc: Userspace grant allocation driver



On 12/14/2010 02:06 PM, Daniel De Graaf wrote:
>>> +static int gntalloc_mmap(struct file *filp, struct vm_area_struct *vma)
>>> +{
>>> +   struct gntalloc_file_private_data *priv = filp->private_data;
>>> +   struct gntalloc_gref *gref;
>>> +
>>> +   if (debug)
>>> +           printk("%s: priv %p, page %lu\n", __func__,
>>> +                  priv, vma->vm_pgoff);
>>> +
>>> +   /*
>>> +    * There is a 1-to-1 correspondence of grant references to shared
>>> +    * pages, so it only makes sense to map exactly one page per
>>> +    * call to mmap().
>>> +    */
>> Single-page mmap makes sense if the only possible use-cases are for
>> single-page mappings, but if you're talking about framebuffers and the
>> like it seems like a very awkward way to use mmap.  It would be cleaner
>> from an API perspective to have a user-mode defined flat address space
>> indexed by pgoff which maps to an array of grefs, so you can sensibly do
>> a multi-page mapping.
>>
>> It would also allow you to hide the grefs from usermode entirely.  Then
>> its just up to usermode to choose suitable file offsets for itself.
> I considered this, but wanted to keep userspace compatibility with the
> previously created interface.

Is that private to you, or something in broader use?

>  If there's no reason to avoid doing so, I'll
> change the ioctl interface to allocate an array of grants and calculate the
> offset similarly to how gntdev currently does (it picks a suitable open slot).

I guess there are three options: you could get the kernel to allocate
extents, make usermode do it, or have one fd per extent and always start
from offset 0.  I guess the last could get very messy if you want to
have lots of mappings...  Making usermode define the offsets seems
simplest and most flexible, because then they can stitch together the
file-offset space in any way that's convenient to them (you just need to
deal with overlaps in that space).
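
For what it's worth, the usermode side of that could look something like
the sketch below.  The struct layout, ioctl name and number are invented
here purely for illustration (this isn't the interface in your patch); the
point is just that usermode picks its own offset and then covers the whole
extent with a single mmap():

/* Sketch only -- names, struct layout and ioctl number are made up. */
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

struct alloc_gref_op {
        uint16_t domid;      /* domain that will map the pages */
        uint16_t count;      /* number of pages/grants to allocate */
        uint64_t index;      /* usermode-chosen offset (page-aligned, bytes) */
        uint32_t gref_ids[]; /* out: grant refs to hand to the peer domain */
};

#define ALLOC_GREF _IO('G', 1)  /* made-up ioctl number */

static void *map_grants(int fd, uint16_t domid, uint16_t count, uint64_t index)
{
        long pgsz = sysconf(_SC_PAGESIZE);
        struct alloc_gref_op *op;
        void *addr;

        op = calloc(1, sizeof(*op) + count * sizeof(uint32_t));
        if (!op)
                return NULL;
        op->domid = domid;
        op->count = count;
        op->index = index;      /* usermode stitches its own offset space */

        if (ioctl(fd, ALLOC_GREF, op) < 0) {
                free(op);
                return NULL;
        }

        /* op->gref_ids[] would now be handed to the mapping domain. */

        /* One mmap() covers the whole extent at the chosen offset. */
        addr = mmap(NULL, (size_t)count * pgsz, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, (off_t)index);
        free(op);
        return addr == MAP_FAILED ? NULL : addr;
}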

> Userspace does still have to know about grefs, of course, but only to pass
> to the domain doing the mapping, not for its own mmap().

Ah, yes, of course.

>>> +   if (((vma->vm_end - vma->vm_start) >> PAGE_SHIFT) != 1) {
>>> +           printk(KERN_ERR "%s: Only one page can be memory-mapped "
>>> +                   "per grant reference.\n", __func__);
>>> +           return -EINVAL;
>>> +   }
>>> +
>>> +   if (!(vma->vm_flags & VM_SHARED)) {
>>> +           printk(KERN_ERR "%s: Mapping must be shared.\n",
>>> +                   __func__);
>>> +           return -EINVAL;
>>> +   }
>>> +
>>> +   spin_lock(&gref_lock);
>>> +   gref = find_gref(priv, vma->vm_pgoff);
>>> +   if (gref == NULL) {
>>> +           spin_unlock(&gref_lock);
>>> +           printk(KERN_ERR "%s: Could not find a grant reference with "
>>> +                   "page index %lu.\n", __func__, vma->vm_pgoff);
>>> +           return -ENOENT;
>>> +   }
>>> +   gref->users++;
>>> +   spin_unlock(&gref_lock);
>>> +
>>> +   vma->vm_private_data = gref;
>>> +
>>> +   /* This flag prevents Bad PTE errors when the memory is unmapped. */
>>> +   vma->vm_flags |= VM_RESERVED;
>>> +   vma->vm_flags |= VM_DONTCOPY;
>>> +   vma->vm_flags |= VM_IO;
>> If you set VM_PFNMAP then you don't need to deal with faults.
> Useful to know. Is that more efficient than, or preferred over, defining a
> .fault handler? I used this method because that's what is used in
> kernel/relay.c.

Well, as you currently have it, your mmap() function doesn't map
anything, so you're relying on demand faulting to populate the ptes. 
Since you've already allocated the pages that's just a soft fault, but
it means you end up with a lot of per-page entries into the hypervisor.

If you make mmap pre-populate all the ptes (a nice big fat batch
operation), then you should never get faults on the vma.  You can set
PFNMAP to make sure of that (since you're already setting all the
"woowoo vma" flags, that makes sense).

Its actual meaning is "this vma contains pages which are not really
kernel memory, so paging them doesn't make sense" - ie, device or
foreign memory (we use it in gntdev).

In this case, the pages are normal kernel pages but they're being given
over to a "device" and so are no longer subject to normal kernel
lifetime rules.  So I think PFNMAP makes sense in this case too.
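
Roughly, and only as a sketch on top of your code (gref->page and
gntalloc_vmops are assumed from the rest of the patch, grefs are assumed to
be indexed by page in the fd's offset space, and the error unwinding is
omitted), pre-populating in mmap could look like:

/* Sketch: populate every pte at mmap() time, so no .fault handler runs.
 * remap_pfn_range() marks the vma VM_IO | VM_RESERVED | VM_PFNMAP itself. */
static int gntalloc_mmap(struct file *filp, struct vm_area_struct *vma)
{
        struct gntalloc_file_private_data *priv = filp->private_data;
        struct gntalloc_gref *gref;
        unsigned long count = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
        unsigned long i, addr;
        int rv;

        if (!(vma->vm_flags & VM_SHARED))
                return -EINVAL;

        for (i = 0, addr = vma->vm_start; i < count; i++, addr += PAGE_SIZE) {
                spin_lock(&gref_lock);
                gref = find_gref(priv, vma->vm_pgoff + i);
                if (!gref) {
                        spin_unlock(&gref_lock);
                        return -ENOENT;   /* should undo earlier users++ too */
                }
                gref->users++;
                spin_unlock(&gref_lock);

                /* Batch the pte setup here instead of taking soft faults. */
                rv = remap_pfn_range(vma, addr, page_to_pfn(gref->page),
                                     PAGE_SIZE, vma->vm_page_prot);
                if (rv)
                        return rv;        /* ditto: unwinding omitted */
        }

        vma->vm_flags |= VM_DONTCOPY;
        vma->vm_private_data = priv;
        vma->vm_ops = &gntalloc_vmops;    /* open/close for refcounting only */

        return 0;
}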


    J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

