WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] Re: [PATCH 1/3] libxc: add xc_gnttab_map_grant_ref_notify

On Wed, 2011-09-21 at 16:02 +0100, Daniel De Graaf wrote:
> On 09/21/2011 06:03 AM, Ian Campbell wrote:
> > On Mon, 2011-09-19 at 23:43 +0100, Daniel De Graaf wrote:
> >> @@ -116,4 +116,35 @@ struct ioctl_gntdev_set_max_grants {
> >>    uint32_t count;
> >>  };
> >>  
> >> +/*
> >> + * Sets up an unmap notification within the page, so that the other side 
> >> can do
> >> + * cleanup if this side crashes. Required to implement cross-domain robust
> >> + * mutexes or close notification on communication channels.
> >> + *
> >> + * Each mapped page only supports one notification; multiple calls 
> >> referring to
> >> + * the same page overwrite the previous notification. You must clear the
> >> + * notification prior to the IOCTL_GNTALLOC_DEALLOC_GREF if you do not 
> >> want it
> >> + * to occur.
> >> + */
> >> +#define IOCTL_GNTDEV_SET_UNMAP_NOTIFY \
> >> +_IOC(_IOC_NONE, 'G', 7, sizeof(struct ioctl_gntdev_unmap_notify))
> >> +struct ioctl_gntdev_unmap_notify {
> >> +  /* IN parameters */
> >> +  /* Offset in the file descriptor for a byte within the page (same as
> >> +   * used in mmap).
> > 
> > I'm probably being thick but I don't understand what this means, i.e.
> > what this thing is relative to.
> 
> This is an offset that acts like a byte offset in the /dev/xen/gntdev device.
> Conceptually, if /dev/xen/evtchn implemented pwrite() then this offset would
> be the offset you would pass to modify your target byte.
> 
> For example, if you use gntdev to map two pages, the first will be at offset
> 0 and the second at 4096; you would pass 4098 here to indicate the third byte
> of the second page. The offsets (0, 4096) are returned by the map-grant 
> ioctls.

Hmm. I think I was confused because it was an offset into the file
rather than, say, a virtual address.

When I map a page how do I know what the offset of it is wrt the file
descriptor? DO I just have to remember how many pages I mapped an *=
4096?

> 
> >>  If using UNMAP_NOTIFY_CLEAR_BYTE, this is the byte to
> >> +   * be cleared. Otherwise, it can be any byte in the page whose
> >> +   * notification we are adjusting.
> >> +   */
> >> +  uint64_t index;
> >> +  /* Action(s) to take on unmap */
> >> +  uint32_t action;
> >> +  /* Event channel to notify */
> >> +  uint32_t event_channel_port;
> > 
> > evtchn_port_t ?
> 
> Using that would require an include dependency on event_channel.h which is
> not exposed as a userspace-visible header. Linux's evtchn.h also does not
> use evtchn_port_t (it uses unsigned int).
> 
> Since this is just a direct copy of the header from the linux source tree,
> any changes really need to happen there first.

OK, that's fine as it is then.

> 
> >> +};
> >> +
> >> +/* Clear (set to zero) the byte specified by index */
> >> +#define UNMAP_NOTIFY_CLEAR_BYTE 0x1
> >> +/* Send an interrupt on the indicated event channel */
> >> +#define UNMAP_NOTIFY_SEND_EVENT 0x2
> >> +
> >>  #endif /* __LINUX_PUBLIC_GNTDEV_H__ */
> >> diff --git a/tools/libxc/xc_gnttab.c b/tools/libxc/xc_gnttab.c
> >> index 4f55fce..3d3c63b 100644
> >> --- a/tools/libxc/xc_gnttab.c
> >> +++ b/tools/libxc/xc_gnttab.c
> >> @@ -18,6 +18,7 @@
> >>   */
> >>  
> >>  #include "xc_private.h"
> >> +#include <errno.h>
> >>  
> >>  int xc_gnttab_op(xc_interface *xch, int cmd, void * op, int op_size, int 
> >> count)
> >>  {
> >> @@ -174,6 +175,28 @@ void *xc_gnttab_map_domain_grant_refs(xc_gnttab *xcg,
> >>                                                    count, domid, refs, 
> >> prot);
> >>  }
> >>  
> >> +void *xc_gnttab_map_grant_ref_notify(xc_gnttab *xcg,
> >> +                                     uint32_t domid,
> >> +                                     uint32_t ref,
> >> +                                     uint32_t notify_offset,
> >> +                                     evtchn_port_t notify_port,
> >> +                                     int *notify_result)
> >> +{
> >> +  if (xcg->ops->u.gnttab.map_grant_ref_notify)
> >> +          return xcg->ops->u.gnttab.map_grant_ref_notify(xcg, 
> >> xcg->ops_handle,
> >> +                  domid, ref, notify_offset, notify_port, notify_result);
> >> +  else {
> >> +          void* area = xcg->ops->u.gnttab.map_grant_ref(xcg, 
> >> xcg->ops_handle,
> >> +                  domid, ref, PROT_READ|PROT_WRITE);
> >> +          if (area && notify_result) {
> >> +                  *notify_result = -1;
> >> +                  errno = ENOSYS;
> >> +          }
> >> +          return area;
> >> +  }
> > 
> > I think the new public interface is fine but do we really need a new
> > internal interface here?
> > 
> > I think you can just add the notify_* arguments to the existing OSDEP
> > function and have those OS backends which don't implement that feature
> > return ENOSYS if notify_offset != 0 (or ~0 or whatever invalid value
> > works).
> > 
> > Why doesn't the *_notify variant take a prot argument?
> 
> At least for the byte-clear portion of the notify, the page must be writable
> or the notify will not work. I suppose an event channel alone could be used
> for a read-only mapping, but normally the unmapping of a read-only mapping is
> not an important event - although I guess you could contrive a use for this
> type of notificaiton, so there's no reason not to allow it.

If you combine this the two functions then returning EINVAL for attempts
to map without PROT_WRITE (or whatever perm is necessary) would be
reasonable IMHO.

> > I'd be tempted to do away with notify_result too -- if the caller asked
> > for notification and we fail to give that then we can cleanup and return
> > an error. If they want to try again without the notification then that's
> > up to them.
> 
> The source of the error might be unclear, but this would make the interface
> cleaner.
> 
> > 
> >> +}
> >> +
> >> +
> >>  int xc_gnttab_munmap(xc_gnttab *xcg,
> >>                       void *start_address,
> >>                       uint32_t count)
> >> diff --git a/tools/libxc/xc_linux_osdep.c b/tools/libxc/xc_linux_osdep.c
> >> index dca6718..3040cb6 100644
> >> --- a/tools/libxc/xc_linux_osdep.c
> >> +++ b/tools/libxc/xc_linux_osdep.c
> >> @@ -613,6 +613,62 @@ static void 
> >> *linux_gnttab_map_domain_grant_refs(xc_gnttab *xcg, xc_osdep_handle
> >>      return do_gnttab_map_grant_refs(xcg, h, count, &domid, 0, refs, prot);
> >>  }
> >>  
> >> +static void *linux_gnttab_map_grant_ref_notify(xc_gnttab *xch, 
> >> xc_osdep_handle h,
> >> +                                               uint32_t domid, uint32_t 
> >> ref,
> >> +                                               uint32_t notify_offset,
> >> +                                               evtchn_port_t notify_port,
> >> +                                               int *notify_result)
> >> +{
> >> +    int fd = (int)h;
> >> +    int rv = 0;
> >> +    struct ioctl_gntdev_map_grant_ref map;
> >> +    struct ioctl_gntdev_unmap_notify notify;
> >> +    void *addr;
> >> +
> >> +    map.count = 1;
> >> +    map.refs[0].domid = domid;
> >> +    map.refs[0].ref = ref;
> >> +
> >> +    if ( ioctl(fd, IOCTL_GNTDEV_MAP_GRANT_REF, &map) ) {
> >> +        PERROR("xc_gnttab_map_grant_ref: ioctl MAP_GRANT_REF failed");
> >> +        return NULL;
> >> +    }
> >> +
> >> +    addr = mmap(NULL, XC_PAGE_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 
> >> map.index);
> >> +    if ( addr == MAP_FAILED )
> >> +    {
> >> +        int saved_errno = errno;
> >> +        struct ioctl_gntdev_unmap_grant_ref unmap_grant;
> >> +
> >> +        PERROR("xc_gnttab_map_grant_ref: mmap failed");
> >> +        unmap_grant.index = map.index;
> >> +        unmap_grant.count = 1;
> >> +        ioctl(fd, IOCTL_GNTDEV_UNMAP_GRANT_REF, &unmap_grant);
> >> +        errno = saved_errno;
> >> +        return NULL;
> >> +    }
> > 
> > The non-notify variant handles EAGAIN, why doesn't this one need to do
> > so?
> 
> The non-notify variant doesn't need to handle EAGAIN anymore (with modern
> kernels, at least... perhaps it should remain for older kernels). Also,
> do_gnttab_map_grant_refs does not handle EAGAIN.

OK I guess that is fine (although if you combine them as I suggest it
comes back?)

I hadn't noticed that we have both map_gratn_ref and map_grant_refs,
that seems like an area which could be cleaned up. (not saying you
should, just noticing it)

> 
> >> +
> >> +    notify.index = map.index;
> >> +    notify.action = 0;
> >> +    if (notify_offset >= 0) {
> >> +        notify.index += notify_offset;
> >> +        notify.action |= UNMAP_NOTIFY_CLEAR_BYTE;
> >> +    }
> >> +    if (notify_port >= 0) {
> >> +        notify.event_channel_port = notify_port;
> >> +        notify.action |= UNMAP_NOTIFY_SEND_EVENT;
> >> +    }
> >> +    if (notify.action)
> >> +        rv = ioctl(fd, IOCTL_GNTDEV_SET_UNMAP_NOTIFY, &notify);
> > 
> > Is there a race if the other end (or this process) dies between the MAP
> > ioctl and here?
> > 
> > Ian.
> > 
> 
> Technically it's a race, but it doesn't impact any reasonable use of the
> notification. The local process can't actually be using the shared page
> at this point, and the other side will not be certain that the map has
> actually succeeded until after the function returns (and it is notified
> in some way - libvchan changes the notify byte from 2->1 at this point).
> 
> If the domain whose memory we are mapping crashes, this ioctl will succeed
> unless the event channel it refers to has already been invalidated - but
> either way, the notifications are now irrelevant as there is nobody to
> listen to them.

OK.

Thanks,
Ian.

> 




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel