[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 10/15] argo: implement the notify op



 On Mon, Jan 7, 2019 at 8:44 AM Christopher Clark
<christopher.w.clark@xxxxxxxxx> wrote:
>
> Queries for data about space availability in registered rings and
> causes notification to be sent when space has become available.
>
> The hypercall op populates a supplied data structure with information about
> ring state, and if insufficent space is currently available in a given ring,

insufficient

> the hypervisor will record the domain's expressed interest and notify it
> when it observes that space has become available.
>
> Checks for free space occur when this notify op is invoked, so it may be
> intentionally invoked with no data structure to populate
> (ie. a NULL argument) to trigger such a check and consequent notifications.
>
> Limit the maximum number of notify requests in a single operation to a
> simple fixed limit of 256.
>
> Signed-off-by: Christopher Clark <christopher.clark6@xxxxxxxxxxxxxx>
> ---
> v2 feedback Jan: drop cookie, implement teardown
> v2 notify: add flag to indicate ring is shared
> v2 argument name for fill_ring_data arg is now currd
> v2 self: check ring size vs request and flag error rather than queue signal
> v2 feedback Jan: drop 'message' from 'argo_message_op'
> v2 self: simplify signal_domid, drop unnecessary label + goto
> v2 self: skip the cookie check in pending_cancel
> v2 self: implement npending limit on number of pending entries
> v1 feedback #16 Jan: sanitize_ring in ringbuf_payload_space
> v2 self: inline fill_ring_data_array
> v2 self: avoid retesting dst_d for put_domain
> v2 self/Jan: remove use of magic verification field and tidy up
> v1 feedback #16 Jan: remove testing of magic in guest-supplied structure
> v2 self: s/argo_pending_ent/pending_ent/g
> v2 feedback v1#13 Roger: use OS-supplied roundup; drop from public header
> v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
> v1 feedback Roger, Jan: drop argo prefix on static functions
> v2 self: reduce indentation via goto out if arg NULL
> v1 feedback #13 Jan: resolve checking of array handle and use of __copy
>
> v1 #5 (#16) feedback Paul: notify op: use currd in do_argo_message_op
> v1 #5 (#16) feedback Paul: notify op: use currd in argo_notify
> v1 #5 (#16) feedback Paul: notify op: use currd in argo_notify_check_pending
> v1 #5 (#16) feedback Paul: notify op: use currd in argo_fill_ring_data_array
> v1 #13 (#16) feedback Paul: notify op: do/while: reindent only
> v1 #13 (#16) feedback Paul: notify op: do/while: goto
> v1 : add compat xlat.lst entries
> v1: add definition for copy_field_from_guest_errno
> v1 #13 feedback Jan: make 'ring data' comment comply with single-line style
> v1 feedback #13 Jan: use __copy; so define and use __copy_field_to_guest_errno
> v1: #13 feedback Jan: public namespace: prefix with xen
> v1: #13 feedback Jan: add blank line after case in do_argo_message_op
> v1: self: rename ent id to domain_id
> v1: self: ent id-> domain_id
> v1: self: drop signal if domain_cookie mismatches
> v1. feedback #15 Jan: make loop i unsigned
> v1. self: drop unnecessary mb() in argo_notify_check_pending
> v1. self: add blank line
> v1 #16 feedback Jan: const domain arg to +argo_fill_ring_data
> v1. feedback #15 Jan: check unusued hypercall args are zero
> v1 feedback #16 Jan: add comment on space available signal policy
> v1. feedback #16 Jan: move declr, drop braces, lower indent
> v1. feedback #18 Jan: meld the resource limits into the main commit
> v1. feedback #16 Jan: clarify use of magic field
> v1. self: use single copy to read notify ring data struct
> v1: argo_fill_ring_data: fix dprintk types for port field
> v1: self: use %x for printing port as per other print sites
> v1. feedback Jan: add comments explaining ring full vs empty
> v1. following Jan: fix argo_ringbuf_payload_space calculation for empty ring
>
>  xen/common/argo.c         | 359 
> ++++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/public/argo.h |  67 +++++++++
>  xen/include/xlat.lst      |   2 +
>  3 files changed, 428 insertions(+)
>
> diff --git a/xen/common/argo.c b/xen/common/argo.c
> index 4548435..37eb291 100644
> --- a/xen/common/argo.c
> +++ b/xen/common/argo.c
> @@ -29,6 +29,7 @@
>  #include <public/argo.h>
>
>  #define MAX_RINGS_PER_DOMAIN            128U
> +#define MAX_NOTIFY_COUNT                256U
>  #define MAX_PENDING_PER_RING             32U
>
>  /* All messages on the ring are padded to a multiple of the slot size. */
> @@ -43,6 +44,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_argo_iov_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_page_descr_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
> +DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_data_t);
> +DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_send_addr_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_unregister_ring_t);
>
> @@ -231,6 +234,13 @@ static DEFINE_RWLOCK(argo_lock); /* L1 */
>  #define argo_dprintk(format, ... ) ((void)0)
>  #endif
>
> +static struct argo_ring_info *
> +ring_find_info(const struct domain *d, const struct argo_ring_id *id);
> +
> +static struct argo_ring_info *
> +ring_find_info_by_match(const struct domain *d, uint32_t port,
> +                        domid_t partner_id);

Can you place the static functions such that you don't need prototypes for them?

> +
>  /*
>   * This hash function is used to distribute rings within the per-domain
>   * hash tables (d->argo->ring_hash and d->argo_send_hash). The hash table
> @@ -265,6 +275,17 @@ signal_domain(struct domain *d)
>  }
>
>  static void
> +signal_domid(domid_t domain_id)
> +{
> +    struct domain *d = get_domain_by_id(domain_id);

Newline.

> +    if ( !d )
> +        return;
> +
> +    signal_domain(d);
> +    put_domain(d);
> +}
> +
> +static void
>  ring_unmap(struct argo_ring_info *ring_info)
>  {
>      unsigned int i;
> @@ -473,6 +494,62 @@ get_sanitized_ring(xen_argo_ring_t *ring, struct 
> argo_ring_info *ring_info)
>      return 0;
>  }
>
> +static uint32_t
> +ringbuf_payload_space(struct domain *d, struct argo_ring_info *ring_info)
> +{
> +    xen_argo_ring_t ring;
> +    uint32_t len;
> +    int32_t ret;

You use a signed type to internally store the return value, but the
return type from the function itself is unsigned. Is it guaranteed
that ret < INT32_MAX always?

> +
> +    ASSERT(spin_is_locked(&ring_info->lock));
> +
> +    len = ring_info->len;
> +    if ( !len )
> +        return 0;
> +
> +    ret = get_sanitized_ring(&ring, ring_info);
> +    if ( ret )
> +        return 0;
> +
> +    argo_dprintk("sanitized ringbuf_payload_space: tx_ptr=%d rx_ptr=%d\n",
> +                 ring.tx_ptr, ring.rx_ptr);
> +
> +    /*
> +     * rx_ptr == tx_ptr means that the ring has been emptied, so return
> +     * the maximum payload size that can be accepted -- see message size
> +     * checking logic in the entry to ringbuf_insert which ensures that
> +     * there is always one message slot (of size ROUNDUP_MESSAGE(1)) left
> +     * available, preventing a ring from being entirely filled. This ensures
> +     * that matching ring indexes always indicate an empty ring and not a
> +     * full one.
> +     * The subtraction here will not underflow due to minimum size 
> constraints
> +     * enforced on ring size elsewhere.
> +     */
> +    if ( ring.rx_ptr == ring.tx_ptr )
> +        return len - sizeof(struct xen_argo_ring_message_header)
> +                   - ROUNDUP_MESSAGE(1);

Why not do something like:

ret = ring.rx_ptr - ring.tx_ptr;
if ( ret <= 0)
    ret += len;

Instead of this early return?

The only difference when the ring is full is that len should be used
instead of the ptr difference.

> +
> +    ret = ring.rx_ptr - ring.tx_ptr;
> +    if ( ret < 0 )
> +        ret += len;
> +
> +    /*
> +     * The maximum size payload for a message that will be accepted is:
> +     * (the available space between the ring indexes)
> +     *    minus (space for a message header)
> +     *    minus (space for one message slot)
> +     * since ringbuf_insert requires that one message slot be left
> +     * unfilled, to avoid filling the ring to capacity and confusing a full
> +     * ring with an empty one.
> +     * Since the ring indexes are sanitized, the value in ret is aligned, so
> +     * the simple subtraction here works to return the aligned value needed:
> +     */
> +    ret -= sizeof(struct xen_argo_ring_message_header);
> +    ret -= ROUNDUP_MESSAGE(1);
> +
> +    return (ret < 0) ? 0 : ret;
> +}
> +
>  /*
>   * iov_count returns its count on success via an out variable to avoid
>   * potential for a negative return value to be used incorrectly
> @@ -812,6 +889,61 @@ pending_remove_all(struct argo_ring_info *ring_info)
>      ring_info->npending = 0;
>  }
>
> +static void
> +pending_notify(struct hlist_head *to_notify)
> +{
> +    struct hlist_node *node, *next;
> +    struct pending_ent *ent;
> +
> +    ASSERT(rw_is_locked(&argo_lock));
> +
> +    hlist_for_each_entry_safe(ent, node, next, to_notify, node)
> +    {
> +        hlist_del(&ent->node);
> +        signal_domid(ent->domain_id);
> +        xfree(ent);
> +    }
> +}
> +
> +static void
> +pending_find(const struct domain *d, struct argo_ring_info *ring_info,
> +             uint32_t payload_space, struct hlist_head *to_notify)
> +{
> +    struct hlist_node *node, *next;
> +    struct pending_ent *ent;
> +
> +    ASSERT(rw_is_locked(&d->argo->lock));
> +
> +    /*
> +     * TODO: Current policy here is to signal _all_ of the waiting domains
> +     *       interested in sending a message of size less than payload_space.
> +     *
> +     * This is likely to be suboptimal, since once one of them has added
> +     * their message to the ring, there may well be insufficient room
> +     * available for any of the others to transmit, meaning that they were
> +     * woken in vain, which created extra work just to requeue their wait.
> +     *
> +     * Retain this simple policy for now since it at least avoids starving a
> +     * domain of available space notifications because of a policy that only
> +     * notified other domains instead. Improvement may be possible;
> +     * investigation required.
> +     */
> +
> +    spin_lock(&ring_info->lock);
> +    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
> +    {
> +        if ( payload_space >= ent->len )
> +        {
> +            if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> +                wildcard_pending_list_remove(ent->domain_id, ent);
> +            hlist_del(&ent->node);
> +            ring_info->npending--;
> +            hlist_add_head(&ent->node, to_notify);
> +        }
> +    }
> +    spin_unlock(&ring_info->lock);
> +}
> +
>  static int
>  pending_queue(struct argo_ring_info *ring_info, domid_t src_id,
>                unsigned int len)
> @@ -874,6 +1006,27 @@ pending_requeue(struct argo_ring_info *ring_info, 
> domid_t src_id,
>  }
>
>  static void
> +pending_cancel(struct argo_ring_info *ring_info, domid_t src_id)
> +{
> +    struct hlist_node *node, *next;
> +    struct pending_ent *ent;
> +
> +    ASSERT(spin_is_locked(&ring_info->lock));
> +
> +    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
> +    {
> +        if ( ent->domain_id == src_id )
> +        {
> +            if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> +                wildcard_pending_list_remove(ent->domain_id, ent);
> +            hlist_del(&ent->node);
> +            xfree(ent);
> +            ring_info->npending--;
> +        }
> +    }
> +}
> +
> +static void
>  wildcard_rings_pending_remove(struct domain *d)
>  {
>      struct hlist_node *node, *next;
> @@ -994,6 +1147,92 @@ partner_rings_remove(struct domain *src_d)
>  }
>
>  static int
> +fill_ring_data(const struct domain *currd,
> +               XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_ent_hnd)
> +{
> +    xen_argo_ring_data_ent_t ent;
> +    struct domain *dst_d;
> +    struct argo_ring_info *ring_info;
> +    int ret;
> +
> +    ASSERT(rw_is_locked(&argo_lock));
> +
> +    ret = __copy_from_guest(&ent, data_ent_hnd, 1) ? -EFAULT : 0;
> +    if ( ret )
> +        goto out;

if ( __copy_from_guest(&ent, data_ent_hnd, 1) )
    return -EFAULT;

And you can get rid of the out label.

> +
> +    argo_dprintk("fill_ring_data: ent.ring.domain=%u,ent.ring.port=%x\n",
> +                 ent.ring.domain_id, ent.ring.port);
> +
> +    ent.flags = 0;

Please memset ent to 0 or initialize it to { }, or else you are
leaking hypervisor stack data to the guest in the padding field.

> +
> +    dst_d = get_domain_by_id(ent.ring.domain_id);
> +    if ( dst_d )
> +    {
> +        if ( dst_d->argo )
> +        {
> +            read_lock(&dst_d->argo->lock);
> +
> +            ring_info = ring_find_info_by_match(dst_d, ent.ring.port,
> +                                                currd->domain_id);
> +            if ( ring_info )
> +            {
> +                uint32_t space_avail;
> +
> +                ent.flags |= XEN_ARGO_RING_DATA_F_EXISTS;
> +                ent.max_message_size = ring_info->len -
> +                                   sizeof(struct 
> xen_argo_ring_message_header) -
> +                                   ROUNDUP_MESSAGE(1);
> +
> +                if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> +                    ent.flags |= XEN_ARGO_RING_DATA_F_SHARED;
> +
> +                spin_lock(&ring_info->lock);
> +
> +                space_avail = ringbuf_payload_space(dst_d, ring_info);
> +
> +                argo_dprintk("fill_ring_data: port=%x space_avail=%u"
> +                             " space_wanted=%u\n",
> +                             ring_info->id.port, space_avail,
> +                             ent.space_required);
> +
> +                /* Do not queue a notification for an unachievable size */
> +                if ( ent.space_required > ent.max_message_size )
> +                    ent.flags |= XEN_ARGO_RING_DATA_F_EMSGSIZE;
> +                else if ( space_avail >= ent.space_required )
> +                {
> +                    pending_cancel(ring_info, currd->domain_id);
> +                    ent.flags |= XEN_ARGO_RING_DATA_F_SUFFICIENT;
> +                }
> +                else
> +                {
> +                    pending_requeue(ring_info, currd->domain_id,
> +                                    ent.space_required);
> +                    ent.flags |= XEN_ARGO_RING_DATA_F_PENDING;
> +                }
> +
> +                spin_unlock(&ring_info->lock);
> +
> +                if ( space_avail == ent.max_message_size )
> +                    ent.flags |= XEN_ARGO_RING_DATA_F_EMPTY;
> +
> +            }
> +            read_unlock(&dst_d->argo->lock);
> +        }
> +        put_domain(dst_d);
> +    }
> +
> +    ret = __copy_field_to_guest(data_ent_hnd, &ent, flags) ? -EFAULT : 0;
> +    if ( ret )
> +        goto out;
> +
> +    ret = __copy_field_to_guest(data_ent_hnd, &ent, max_message_size) ?
> +                -EFAULT : 0;
> + out:
> +    return ret;
> +}
> +
> +static int
>  find_ring_mfn(struct domain *d, gfn_t gfn, mfn_t *mfn)
>  {
>      p2m_type_t p2mt;
> @@ -1526,6 +1765,111 @@ register_ring(struct domain *currd,
>      return ret;
>  }
>
> +static void
> +notify_ring(struct domain *d, struct argo_ring_info *ring_info,
> +            struct hlist_head *to_notify)
> +{
> +    uint32_t space;
> +
> +    ASSERT(rw_is_locked(&argo_lock));
> +    ASSERT(rw_is_locked(&d->argo->lock));
> +
> +    spin_lock(&ring_info->lock);
> +
> +    if ( ring_info->len )
> +        space = ringbuf_payload_space(d, ring_info);
> +    else
> +        space = 0;
> +
> +    spin_unlock(&ring_info->lock);
> +
> +    if ( space )
> +        pending_find(d, ring_info, space, to_notify);
> +}
> +
> +static void
> +notify_check_pending(struct domain *currd)
> +{
> +    unsigned int i;
> +    HLIST_HEAD(to_notify);
> +
> +    ASSERT(rw_is_locked(&argo_lock));
> +
> +    read_lock(&currd->argo->lock);
> +
> +    for ( i = 0; i < ARGO_HTABLE_SIZE; i++ )
> +    {
> +        struct hlist_node *node, *next;
> +        struct argo_ring_info *ring_info;
> +
> +        hlist_for_each_entry_safe(ring_info, node, next,
> +                                  &currd->argo->ring_hash[i], node)
> +        {
> +            notify_ring(currd, ring_info, &to_notify);
> +        }

No need for the braces.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.