
[Xen-devel] [PATCH v2 11/18] argo: implement the register op



Used by a domain to register a region of memory for receiving messages from
either a specified partner domain or, if a wildcard is specified, from any
domain.

This operation creates a mapping within Xen's private address space that
will remain resident for the lifetime of the ring. In subsequent commits,
the hypervisor will use this mapping to copy data from a sending domain into
this registered ring, making it accessible to the domain that registered the
ring to receive data.
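
As a sketch of how the data path is expected to use this mapping in later
commits (illustrative only: argo_ring_map_page is introduced below, but the
copy into the ring is not part of this patch):

    /* Obtain the resident mapping for page i of a registered ring. */
    void *va;
    int ret = argo_ring_map_page(ring_info, i, &va);

    if ( !ret )
        memcpy(va, src, len);  /* hypervisor copies into receiver's ring */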

In this code, the p2m type of the memory supplied by the guest for the ring
must be p2m_ram_rw: a conservative choice made to defer reasoning about the
other p2m types to a later change.
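
The corresponding check is performed in argo_find_ring_mfn below; in essence:

    if ( p2mt != p2m_ram_rw )
        ret = -EINVAL;  /* conservatively reject all other p2m types */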

The xen_argo_page_descr_t type is introduced as a page descriptor, to convey
both the physical address of the start of the page and its granularity. The
smallest granularity page is assumed to be 4096 bytes, and the lower twelve
bits of the type are used to indicate an enumerated page size.
The argo_pfn_t type is introduced here to create a pfn type that is 64-bit
on all architectures, to help avoid the need to add a compat ABI.
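
For illustration, encoding a 4K page and the matching decode performed by
Xen look like this (a sketch using the constants added in this patch;
'gpaddr' is a hypothetical guest-physical address variable):

    xen_argo_page_descr_t pg_descr =
        (gpaddr & PAGE_MASK) | XEN_ARGO_PAGE_DESCR_SIZE_4K;

    /* Xen recovers the frame number by shifting out the size bits: */
    argo_pfn_t pfn = pg_descr >> PAGE_SHIFT;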

copy_field_to_guest_errno is added for guest access, performing the same
operation as copy_field_to_guest but returning -EFAULT if the copy is
incomplete. It is added to the arch guest_access headers to simplify code at
call sites.
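
A minimal before/after sketch of a call site (handle and field names as
used later in this patch):

    /* Without the helper: translate the nonzero copy result by hand. */
    if ( copy_field_to_guest(ring_hnd, &ring, id) )
        return -EFAULT;

    /* With the helper: the errno result can be propagated directly. */
    ret = __copy_field_to_guest_errno(ring_hnd, &ring, id);
    if ( ret )
        goto out_unlock;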

array_index_nospec is used to guard the result of the hash function. This is
out of an abundance of caution: the hash is a very basic function, chosen for
bucket distribution properties that cluster related rings rather than for
cryptographic strength or uniformity of output, and it operates on
guest-supplied values immediately before they are used as an array index.
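
Concretely, the guarded lookup pattern is (a sketch of the logic in
argo_ring_find_info below; 'bucket' is an illustrative local):

    unsigned int hash = array_index_nospec(argo_hash(id), ARGO_HTABLE_SIZE);

    /*
     * hash is now bounded even under speculation, although argo_hash()
     * already masks its return value with (ARGO_HTABLE_SIZE - 1).
     */
    struct hlist_head *bucket = &d->argo->ring_hash[hash];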

Signed-off-by: Christopher Clark <christopher.clark6@xxxxxxxxxxxxxx>
---
Changes since v1:

v1 #13 feedback, Jan: register op : s/ECONNREFUSED/ESRCH/
v1 #5 (#13) feedback Paul: register op: use currd in do_message_op
v1 #13 feedback, Paul: register op: use mfn_eq comparator
v1 #5 (#13) feedback Paul: register op: use currd in argo_register_ring
v1 #13 feedback Paul: register op: whitespace, unsigned, bounds check
v1 #13 feedback Paul: use of hex in limit constant definition
v1 #13 feedback Paul, register op: set nmfns on loop termination
v1 #13 feedback Paul: register op: do/while -> gotos, reindent
v1 argo_ring_map_page: drop uint32_t for unsigned int
v1. #13 feedback Julien: use page descriptors instead of gpfns.
   - adds ABI support for pages with different granularity.
v1 feedback #13, Paul: adjust log level of message
v1 feedback #13, Paul: use gprintk for guest-triggered warning
v1 feedback #13, Paul: gprintk and XENLOG_DEBUG for ring registration
v1 feedback #13, Paul: use gprintk for errs in argo_ring_map_page
v1 feedback #13, Paul: use ENOMEM if global mapping fails
v1 feedback Paul: overflow check before shift
v1: add define for copy_field_to_guest_errno
v1: fix gprintk use for ARM as its defn dislikes split format strings
v1: use copy_field_to_guest_errno
v1 feedback #13, Jan: argo_hash_fn: no inline, rename, change type
v1 feedback #13, Paul, Jan: EFAULT -> ENOMEM in argo_ring_map_page
v1 feedback #13, Jan: rename page var in argo_ring_map_page
v1 feedback #13, Jan: switch uint8_t* to void* and drop cast
v1 feedback #13, Jan: switch memory barrier to smp_wmb
v1 feedback #13, Jan: make 'ring' comment comply with single-line style
v1 feedback #13, Jan: use xzalloc_array, drop loop NULL init
v1 feedback #13, Jan: init bool with false rather than 0
v1 feedback #13 Jan: use __copy; define and use __copy_field_to_guest_errno
v1 feedback #13, Jan: use xzalloc, drop individual init zeroes
v1 feedback #13, Jan: prefix public namespace with xen
v1 feedback #13, Jan: blank line after op case in do_argo_message_op
v1 self: reflow comment in argo_ring_map_page to within 80 char len
v1 feedback #13, Roger: use true not 1 in assign to update_tx_ptr bool
v1 feedback #21, Jan: fold in the array_index_nospec hash function guards
v1 feedback #18, Jan: fold the max ring count limit into the series
v1 self: use unsigned long type for XEN_ARGO_REGISTER_FLAG_MASK
v1: feedback #15 Jan: handle upper-halves of hypercall args
v1. feedback #13 Jan: add comment re: page alignment
v1. self: confirm ring magic presence in supplied page array
v1. feedback #13 Jan: add comment re: minimum ring size
v1. feedback #13 Roger: use ASSERT_UNREACHABLE
v1. feedback Roger: add comment to hash function

 xen/common/argo.c                  | 621 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/guest_access.h |  12 +
 xen/include/asm-x86/guest_access.h |  12 +
 xen/include/public/argo.h          |  71 +++++
 4 files changed, 716 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index abfc1f0..81f8341 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -23,14 +23,19 @@
 #include <xen/event.h>
 #include <xen/domain_page.h>
 #include <xen/guest_access.h>
+#include <xen/nospec.h>
 #include <xen/time.h>
 #include <public/argo.h>
 
 #define ARGO_MAX_RINGS_PER_DOMAIN       128U
 
+DEFINE_XEN_GUEST_HANDLE(xen_argo_page_descr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
 
+/* pfn type: 64-bit on all architectures */
+typedef uint64_t argo_pfn_t;
+
 /* Xen command line option to enable argo */
 static bool __read_mostly opt_argo_enabled;
 boolean_param("argo", opt_argo_enabled);
@@ -104,6 +109,31 @@ struct argo_domain
 };
 
 /*
+ * This hash function is used to distribute rings within the per-domain
+ * hash table (d->argo->ring_hash). The hash table will provide a
+ * 'ring_info' struct if a match is found with a 'xen_argo_ring_id' key:
+ * ie. the key is a (domain id, port, partner domain id) tuple.
+ * There aren't many hash table buckets, and this doesn't need to be
+ * cryptographically robust. Since port number varies the most in
+ * expected use, and the Linux driver allocates at both the high and
+ * low ends, incorporate high and low bits to help with distribution.
+ */
+static unsigned int
+argo_hash(const struct xen_argo_ring_id *id)
+{
+    unsigned int ret;
+
+    ret = (uint16_t)(id->addr.port >> 16);
+    ret ^= (uint16_t)id->addr.port;
+    ret ^= id->addr.domain_id;
+    ret ^= id->partner;
+
+    ret &= (ARGO_HTABLE_SIZE - 1);
+
+    return ret;
+}
+
+/*
  * locks
  */
 
@@ -177,6 +207,84 @@ argo_ring_unmap(struct argo_ring_info *ring_info)
     }
 }
 
+/* caller must have L3 or W(L2) */
+static int
+argo_ring_map_page(struct argo_ring_info *ring_info, unsigned int i,
+                   void **out_ptr)
+{
+    if ( i >= ring_info->nmfns )
+    {
+        gprintk(XENLOG_ERR,
+            "argo: ring (vm%u:%x vm%d) %p attempted to map page  %u of %u\n",
+            ring_info->id.addr.domain_id, ring_info->id.addr.port,
+            ring_info->id.partner, ring_info, i, ring_info->nmfns);
+        return -ENOMEM;
+    }
+
+    if ( !ring_info->mfns || !ring_info->mfn_mapping )
+    {
+        ASSERT_UNREACHABLE();
+        ring_info->len = 0;
+        return -ENOMEM;
+    }
+
+    if ( !ring_info->mfn_mapping[i] )
+    {
+        /*
+         * TODO:
+         * The first page of the ring contains the ring indices, so both read
+         * and write access to the page is required by the hypervisor, but
+         * read-access is not needed for this mapping for the remainder of the
+         * ring.
+         * Since this mapping will remain resident in Xen's address space for
+         * the lifetime of the ring, and following the principle of least
+         * privilege, it could be preferable to:
+         *  # add a XSM check to determine what policy is wanted here
+         *  # depending on the XSM query, optionally create this mapping as
+         *    _write-only_ on platforms that can support it.
+         *    (eg. Intel EPT/AMD NPT).
+         */
+        ring_info->mfn_mapping[i] = map_domain_page_global(ring_info->mfns[i]);
+
+        if ( !ring_info->mfn_mapping[i] )
+        {
+            gprintk(XENLOG_ERR,
+                "argo: ring (vm%u:%x vm%d) %p attempted to map page %u of 
%u\n",
+                ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                ring_info->id.partner, ring_info, i, ring_info->nmfns);
+            return -ENOMEM;
+        }
+        argo_dprintk("mapping page %"PRI_mfn" to %p\n",
+               mfn_x(ring_info->mfns[i]), ring_info->mfn_mapping[i]);
+    }
+
+    if ( out_ptr )
+        *out_ptr = ring_info->mfn_mapping[i];
+
+    return 0;
+}
+
+/* caller must have L3 or W(L2) */
+static int
+argo_update_tx_ptr(struct argo_ring_info *ring_info, uint32_t tx_ptr)
+{
+    void *dst;
+    uint32_t *p;
+    int ret;
+
+    ret = argo_ring_map_page(ring_info, 0, &dst);
+    if ( ret )
+        return ret;
+
+    ring_info->tx_ptr = tx_ptr;
+
+    p = dst + offsetof(xen_argo_ring_t, tx_ptr);
+    write_atomic(p, tx_ptr);
+    smp_wmb();
+
+    return 0;
+}
+
 /*
  * pending
  */
@@ -240,6 +348,488 @@ argo_ring_remove_info(struct domain *d, struct argo_ring_info *ring_info)
     xfree(ring_info);
 }
 
+/* ring */
+
+static int
+argo_find_ring_mfn(struct domain *d, argo_pfn_t pfn, mfn_t *mfn)
+{
+    p2m_type_t p2mt;
+    int ret = 0;
+
+#ifdef CONFIG_X86
+    *mfn = get_gfn_unshare(d, pfn, &p2mt);
+#else
+    *mfn = p2m_lookup(d, _gfn(pfn), &p2mt);
+#endif
+
+    if ( !mfn_valid(*mfn) )
+        ret = -EINVAL;
+#ifdef CONFIG_X86
+    else if ( p2m_is_paging(p2mt) || (p2mt == p2m_ram_logdirty) )
+        ret = -EAGAIN;
+#endif
+    else if ( (p2mt != p2m_ram_rw) ||
+              !get_page_and_type(mfn_to_page(*mfn), d, PGT_writable_page) )
+        ret = -EINVAL;
+
+#ifdef CONFIG_X86
+    put_gfn(d, pfn);
+#endif
+
+    return ret;
+}
+
+static int
+argo_find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
+                    uint32_t npage,
+                    XEN_GUEST_HANDLE_PARAM(xen_argo_page_descr_t) pg_descr_hnd,
+                    uint32_t len)
+{
+    unsigned int i;
+    int ret = 0;
+
+    /*
+     * first bounds check on npage here also serves as an overflow check
+     * before left shifting it
+     */
+    if ( (unlikely(npage > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT))) ||
+         ((npage << PAGE_SHIFT) < len) )
+        return -EINVAL;
+
+    if ( ring_info->mfns )
+    {
+        /*
+         * Ring already existed. Check if it's the same ring,
+         * i.e. same number of pages and all translated gpfns still
+         * translating to the same mfns
+         */
+        if ( ring_info->npage != npage )
+            i = ring_info->nmfns + 1; /* forces re-register below */
+        else
+        {
+            for ( i = 0; i < ring_info->nmfns; i++ )
+            {
+                xen_argo_page_descr_t pg_descr;
+                argo_pfn_t pfn;
+                mfn_t mfn;
+
+                ret = copy_from_guest_offset_errno(&pg_descr, pg_descr_hnd,
+                                                   i, 1);
+                if ( ret )
+                    break;
+
+                /* Implementation currently only supports handling 4K pages */
+                if ( (pg_descr & XEN_ARGO_PAGE_DESCR_SIZE_MASK) !=
+                        XEN_ARGO_PAGE_DESCR_SIZE_4K )
+                {
+                    ret = -EINVAL;
+                    break;
+                }
+                pfn = pg_descr >> PAGE_SHIFT;
+
+                ret = argo_find_ring_mfn(d, pfn, &mfn);
+                if ( ret )
+                    break;
+
+                if ( !mfn_eq(mfn, ring_info->mfns[i]) )
+                    break;
+            }
+        }
+        if ( i != ring_info->nmfns )
+        {
+            /* Re-register is standard procedure after resume */
+            gprintk(XENLOG_INFO,
+        "argo: vm%u re-register existing ring (vm%u:%x vm%d) clears MFN 
list\n",
+                    current->domain->domain_id, ring_info->id.addr.domain_id,
+                    ring_info->id.addr.port, ring_info->id.partner);
+
+            argo_ring_remove_mfns(d, ring_info);
+            ASSERT(!ring_info->mfns);
+        }
+    }
+
+    if ( !ring_info->mfns )
+    {
+        mfn_t *mfns;
+        uint8_t **mfn_mapping;
+
+        mfns = xmalloc_array(mfn_t, npage);
+        if ( !mfns )
+            return -ENOMEM;
+
+        for ( i = 0; i < npage; i++ )
+            mfns[i] = INVALID_MFN;
+
+        mfn_mapping = xzalloc_array(uint8_t *, npage);
+        if ( !mfn_mapping )
+        {
+            xfree(mfns);
+            return -ENOMEM;
+        }
+
+        ring_info->npage = npage;
+        ring_info->mfns = mfns;
+        ring_info->mfn_mapping = mfn_mapping;
+    }
+    ASSERT(ring_info->npage == npage);
+
+    if ( ring_info->nmfns == ring_info->npage )
+        return 0;
+
+    for ( i = ring_info->nmfns; i < ring_info->npage; i++ )
+    {
+        xen_argo_page_descr_t pg_descr;
+        argo_pfn_t pfn;
+        mfn_t mfn;
+
+        ret = copy_from_guest_offset_errno(&pg_descr, pg_descr_hnd, i, 1);
+        if ( ret )
+            break;
+
+        /* Implementation currently only supports handling 4K pages */
+        if ( (pg_descr & XEN_ARGO_PAGE_DESCR_SIZE_MASK) !=
+                XEN_ARGO_PAGE_DESCR_SIZE_4K )
+        {
+            ret = -EINVAL;
+            break;
+        }
+        pfn = pg_descr >> PAGE_SHIFT;
+
+        ret = argo_find_ring_mfn(d, pfn, &mfn);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR,
+        "argo: vm%u: invalid gpfn %"PRI_xen_pfn" r:(vm%u:%x vm%d) %p %d/%d\n",
+                    d->domain_id, pfn, ring_info->id.addr.domain_id,
+                    ring_info->id.addr.port, ring_info->id.partner,
+                    ring_info, i, ring_info->npage);
+            break;
+        }
+
+        ring_info->mfns[i] = mfn;
+
+        argo_dprintk("%d: %"PRI_xen_pfn" -> %"PRI_mfn"\n",
+               i, pfn, mfn_x(ring_info->mfns[i]));
+    }
+
+    ring_info->nmfns = i;
+
+    if ( ret )
+        argo_ring_remove_mfns(d, ring_info);
+    else
+    {
+        ASSERT(ring_info->nmfns == ring_info->npage);
+
+        gprintk(XENLOG_DEBUG,
+        "argo: vm%u ring (vm%u:%x vm%d) %p mfn_mapping %p npage %d nmfns %d\n",
+                current->domain->domain_id,
+                ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                ring_info->id.partner, ring_info, ring_info->mfn_mapping,
+                ring_info->npage, ring_info->nmfns);
+    }
+
+    return ret;
+}
+
+static struct argo_ring_info *
+argo_ring_find_info(const struct domain *d, const struct xen_argo_ring_id *id)
+{
+    unsigned int hash;
+    struct hlist_node *node;
+    struct argo_ring_info *ring_info;
+
+    ASSERT(rw_is_locked(&d->argo->lock));
+
+    hash = array_index_nospec(argo_hash(id), ARGO_HTABLE_SIZE);
+
+    argo_dprintk("d->argo=%p, d->argo->ring_hash[%u]=%p id=%p\n",
+                 d->argo, hash, d->argo->ring_hash[hash].first, id);
+    argo_dprintk("id.addr.port=%x id.addr.domain=vm%u"
+                 " id.addr.partner=vm%d\n",
+                 id->addr.port, id->addr.domain_id, id->partner);
+
+    hlist_for_each_entry(ring_info, node, &d->argo->ring_hash[hash], node)
+    {
+        xen_argo_ring_id_t *cmpid = &ring_info->id;
+
+        if ( cmpid->addr.port == id->addr.port &&
+             cmpid->addr.domain_id == id->addr.domain_id &&
+             cmpid->partner == id->partner )
+        {
+            argo_dprintk("ring_info=%p\n", ring_info);
+            return ring_info;
+        }
+    }
+    argo_dprintk("no ring_info found\n");
+
+    return NULL;
+}
+
+static int
+argo_verify_ring_magic(struct argo_ring_info *ring_info)
+{
+    void *dst;
+    xen_argo_ring_t *ring;
+    int ret;
+
+    ret = argo_ring_map_page(ring_info, 0, &dst);
+    if ( ret )
+        return ret;
+
+    ring = dst;
+    mb();
+
+    if ( ring->magic != XEN_ARGO_RING_MAGIC )
+        return -EINVAL;
+
+    return 0;
+}
+
+static long
+argo_register_ring(struct domain *currd,
+                   XEN_GUEST_HANDLE_PARAM(xen_argo_ring_t) ring_hnd,
+                   XEN_GUEST_HANDLE_PARAM(xen_argo_page_descr_t) pg_descr_hnd,
+                   uint32_t npage, bool fail_exist)
+{
+    struct xen_argo_ring ring;
+    struct argo_ring_info *ring_info;
+    int ret = 0;
+    bool update_tx_ptr = false;
+    uint64_t dst_domain_cookie = 0;
+
+    /*
+     * Verify the alignment of the ring data structure supplied with the
+     * understanding that the ring handle supplied points to the same memory as
+     * the first entry in the array of pages provided via pg_descr_hnd, where
+     * the head of the ring will reside.
+     * See argo_update_tx_ptr where the location of the tx_ptr is accessed at a
+     * fixed offset from head of the first page in the mfn array.
+     */
+    if ( !(guest_handle_is_aligned(ring_hnd, ~PAGE_MASK)) )
+        return -EINVAL;
+
+    read_lock(&argo_lock);
+
+    if ( !currd->argo )
+    {
+        ret = -ENODEV;
+        goto out_unlock;
+    }
+
+    if ( copy_from_guest(&ring, ring_hnd, 1) )
+    {
+        ret = -EFAULT;
+        goto out_unlock;
+    }
+
+    if ( ring.magic != XEN_ARGO_RING_MAGIC )
+    {
+        ret = -EINVAL;
+        goto out_unlock;
+    }
+
+    /*
+     * A ring must be large enough to transmit messages, which requires room for
+     * at least:
+     *  * one message header, and
+     *  * one payload slot (payload is always rounded to a multiple of 16 bytes)
+     * and the ring does not allow filling to capacity with a single message --
+     * see logic in argo_ringbuf_insert -- so there must be space remaining when
+     * a single message is present. This determines minimum ring size.
+     * In addition, the ring size must be aligned with the payload rounding.
+     */
+    if ( (ring.len < (sizeof(struct xen_argo_ring_message_header)
+                      + XEN_ARGO_ROUNDUP(1) + XEN_ARGO_ROUNDUP(1))) ||
+         (XEN_ARGO_ROUNDUP(ring.len) != ring.len) )
+    {
+        ret = -EINVAL;
+        goto out_unlock;
+    }
+
+    if ( ring.len > XEN_ARGO_MAX_RING_SIZE )
+    {
+        ret = -EINVAL;
+        goto out_unlock;
+    }
+
+    if ( ring.id.partner == XEN_ARGO_DOMID_ANY )
+    {
+        ret = xsm_argo_register_any_source(currd,
+                                           argo_mac_bootparam_enforcing);
+        if ( ret )
+            goto out_unlock;
+    }
+    else
+    {
+        struct domain *dst_d = get_domain_by_id(ring.id.partner);
+
+        if ( !dst_d )
+        {
+            argo_dprintk("!dst_d, ESRCH\n");
+            ret = -ESRCH;
+            goto out_unlock;
+        }
+
+        ret = xsm_argo_register_single_source(currd, dst_d);
+        if ( ret )
+        {
+            put_domain(dst_d);
+            goto out_unlock;
+        }
+
+        if ( !dst_d->argo )
+        {
+            argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
+            ret = -ECONNREFUSED;
+            put_domain(dst_d);
+            goto out_unlock;
+        }
+
+        dst_domain_cookie = dst_d->argo->domain_cookie;
+
+        put_domain(dst_d);
+    }
+
+    ring.id.addr.domain_id = currd->domain_id;
+    ret = __copy_field_to_guest_errno(ring_hnd, &ring, id);
+    if ( ret )
+        goto out_unlock;
+
+    /*
+     * No need for a lock yet, because only we know about this ring.
+     * Set the tx pointer if it looks bogus (we don't reset it
+     * because this might be a re-register after S4).
+     */
+
+    if ( ring.tx_ptr >= ring.len ||
+         XEN_ARGO_ROUNDUP(ring.tx_ptr) != ring.tx_ptr )
+    {
+        /*
+         * Since the ring is a mess, attempt to flush the contents of it
+         * here by setting the tx_ptr to the next aligned message slot past
+         * the latest rx_ptr we have observed. Handle ring wrap correctly.
+         */
+        ring.tx_ptr = XEN_ARGO_ROUNDUP(ring.rx_ptr);
+
+        if ( ring.tx_ptr >= ring.len )
+            ring.tx_ptr = 0;
+
+        /* ring.tx_ptr will be written back to the guest ring below. */
+        update_tx_ptr = true;
+    }
+
+    /* W(L2) protects all the elements of the domain's ring_info */
+    write_lock(&currd->argo->lock);
+
+    if ( currd->argo->ring_count >= ARGO_MAX_RINGS_PER_DOMAIN )
+    {
+        ret = -ENOSPC;
+        goto out_unlock2;
+    }
+
+    ring_info = argo_ring_find_info(currd, &ring.id);
+    if ( !ring_info )
+    {
+        unsigned int hash;
+
+        ring_info = xzalloc(struct argo_ring_info);
+        if ( !ring_info )
+        {
+            ret = -ENOMEM;
+            goto out_unlock2;
+        }
+
+        spin_lock_init(&ring_info->lock);
+
+        ring_info->partner_cookie = dst_domain_cookie;
+        ring_info->id = ring.id;
+
+        INIT_HLIST_HEAD(&ring_info->pending);
+
+        hash = array_index_nospec(argo_hash(&ring_info->id),
+                                  ARGO_HTABLE_SIZE);
+        hlist_add_head(&ring_info->node, &currd->argo->ring_hash[hash]);
+
+        gprintk(XENLOG_DEBUG, "argo: vm%u registering ring (vm%u:%x vm%d)\n",
+                currd->domain_id, ring.id.addr.domain_id,
+                ring.id.addr.port, ring.id.partner);
+    }
+    else
+    {
+        /*
+         * If the caller specified that the ring must not already exist,
+         * fail the attempt to add a completed ring which already exists.
+         */
+        if ( fail_exist && ring_info->len )
+        {
+            ret = -EEXIST;
+            goto out_unlock2;
+        }
+
+        gprintk(XENLOG_DEBUG,
+            "argo: vm%u re-registering existing ring (vm%u:%x vm%d)\n",
+             currd->domain_id, ring.id.addr.domain_id,
+             ring.id.addr.port, ring.id.partner);
+    }
+
+    /* Since we hold W(L2), there is no need to take L3 here */
+    ring_info->tx_ptr = ring.tx_ptr;
+
+    ret = argo_find_ring_mfns(currd, ring_info, npage, pg_descr_hnd,
+                              ring.len);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+            "argo: vm%u failed to find ring mfns (vm%u:%x vm%d)\n",
+             currd->domain_id, ring.id.addr.domain_id,
+             ring.id.addr.port, ring.id.partner);
+
+        goto out_unlock2;
+    }
+
+    /*
+     * Safety check to confirm that the memory supplied is intended for
+     * use as a ring. This will map the first page of the ring.
+     */
+    ret = argo_verify_ring_magic(ring_info);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+            "argo: vm%u register memory mismatch (vm%u:%x vm%d)\n",
+             currd->domain_id, ring.id.addr.domain_id,
+             ring.id.addr.port, ring.id.partner);
+
+        argo_ring_remove_info(currd, ring_info);
+        goto out_unlock2;
+    }
+
+    if ( update_tx_ptr )
+    {
+        ret = argo_update_tx_ptr(ring_info, ring.tx_ptr);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR,
+                "argo: vm%u failed to write tx_ptr (vm%u:%x vm%d)\n",
+                 currd->domain_id, ring.id.addr.domain_id,
+                 ring.id.addr.port, ring.id.partner);
+
+            argo_ring_remove_info(currd, ring_info);
+            goto out_unlock2;
+        }
+    }
+
+    ring_info->len = ring.len;
+    currd->argo->ring_count++;
+
+ out_unlock2:
+    write_unlock(&currd->argo->lock);
+
+ out_unlock:
+    read_unlock(&argo_lock);
+
+    return ret;
+}
+
 long
 do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
                    XEN_GUEST_HANDLE_PARAM(void) arg2,
@@ -261,6 +851,37 @@ do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
 
     switch (cmd)
     {
+    case XEN_ARGO_MESSAGE_OP_register_ring:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_ring_t) ring_hnd =
+            guest_handle_cast(arg1, xen_argo_ring_t);
+        XEN_GUEST_HANDLE_PARAM(xen_argo_page_descr_t) pg_descr_hnd =
+            guest_handle_cast(arg2, xen_argo_page_descr_t);
+        /* arg3 is npage */
+        /* arg4 is flags */
+        bool fail_exist = arg4 & XEN_ARGO_REGISTER_FLAG_FAIL_EXIST;
+
+        if ( unlikely(!guest_handle_okay(ring_hnd, 1)) )
+            break;
+        if ( unlikely(arg3 > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+        if ( unlikely(!guest_handle_okay(pg_descr_hnd, arg3)) )
+            break;
+        /* arg4: reserve currently-undefined bits, require zero.  */
+        if ( unlikely(arg4 & ~XEN_ARGO_REGISTER_FLAG_MASK) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = argo_register_ring(currd, ring_hnd, pg_descr_hnd, arg3,
+                                fail_exist);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 729f71e..5456d81 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -29,6 +29,8 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
 /* Is the guest handle a NULL reference? */
 #define guest_handle_is_null(hnd)        ((hnd).p == NULL)
 
+#define guest_handle_is_aligned(hnd, mask) (!((uintptr_t)(hnd).p & (mask)))
+
 /* Offset the given guest handle into the array it refers to. */
 #define guest_handle_add_offset(hnd, nr) ((hnd).p += (nr))
 #define guest_handle_subtract_offset(hnd, nr) ((hnd).p -= (nr))
@@ -112,6 +114,11 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
     raw_copy_to_guest(_d, _s, sizeof(*_s));             \
 })
 
+/* Errno-returning variant of copy_field_to_guest */
+#define copy_field_to_guest_errno(hnd, ptr, field)      \
+    (copy_field_to_guest((hnd), (ptr), field) ?         \
+        -EFAULT : 0)
+
 /* Copy sub-field of a structure from guest context via a guest handle. */
 #define copy_field_from_guest(ptr, hnd, field) ({       \
     const typeof(&(ptr)->field) _s = &(hnd).p->field;   \
@@ -151,6 +158,11 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
     __raw_copy_to_guest(_d, _s, sizeof(*_s));           \
 })
 
+/* Errno-returning variant of __copy_field_to_guest */
+#define __copy_field_to_guest_errno(hnd, ptr, field)    \
+    (__copy_field_to_guest((hnd), (ptr), field) ?       \
+        -EFAULT : 0)
+
 #define __copy_field_from_guest(ptr, hnd, field) ({     \
     const typeof(&(ptr)->field) _s = &(hnd).p->field;   \
     typeof(&(ptr)->field) _d = &(ptr)->field;           \
diff --git a/xen/include/asm-x86/guest_access.h b/xen/include/asm-x86/guest_access.h
index 9399480..9176150 100644
--- a/xen/include/asm-x86/guest_access.h
+++ b/xen/include/asm-x86/guest_access.h
@@ -41,6 +41,8 @@
 /* Is the guest handle a NULL reference? */
 #define guest_handle_is_null(hnd)        ((hnd).p == NULL)
 
+#define guest_handle_is_aligned(hnd, mask) (!((uintptr_t)(hnd).p & (mask)))
+
 /* Offset the given guest handle into the array it refers to. */
 #define guest_handle_add_offset(hnd, nr) ((hnd).p += (nr))
 #define guest_handle_subtract_offset(hnd, nr) ((hnd).p -= (nr))
@@ -117,6 +119,11 @@
     raw_copy_to_guest(_d, _s, sizeof(*_s));             \
 })
 
+/* Errno-returning variant of copy_field_to_guest */
+#define copy_field_to_guest_errno(hnd, ptr, field)      \
+    (copy_field_to_guest((hnd), (ptr), field) ?         \
+        -EFAULT : 0)
+
 /* Copy sub-field of a structure from guest context via a guest handle. */
 #define copy_field_from_guest(ptr, hnd, field) ({       \
     const typeof(&(ptr)->field) _s = &(hnd).p->field;   \
@@ -162,6 +169,11 @@
     __raw_copy_to_guest(_d, _s, sizeof(*_s));           \
 })
 
+/* Errno-returning variant of __copy_field_to_guest */
+#define __copy_field_to_guest_errno(hnd, ptr, field)    \
+    (__copy_field_to_guest((hnd), (ptr), field) ?       \
+        -EFAULT : 0)
+
 #define __copy_field_from_guest(ptr, hnd, field) ({     \
     const typeof(&(ptr)->field) _s = &(hnd).p->field;   \
     typeof(&(ptr)->field) _d = &(ptr)->field;           \
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index a32fb2d..e73faea 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -31,6 +31,27 @@
 
 #include "xen.h"
 
+#define XEN_ARGO_RING_MAGIC      0xbd67e163e7777f2fULL
+#define XEN_ARGO_DOMID_ANY       DOMID_INVALID
+
+/*
+ * The maximum size of an Argo ring is defined to be: 16MB
+ *  -- which is 0x1000000 bytes.
+ * A byte index into the ring is at most 24 bits.
+ */
+#define XEN_ARGO_MAX_RING_SIZE  (0x1000000ULL)
+
+/*
+ * Page descriptor: encoding both page address and size in a 64-bit value.
+ * Intended to allow ABI to support use of different granularity pages.
+ * example of how to populate:
+ * xen_argo_page_descr_t pg_desc =
+ *      (physaddr & PAGE_MASK) | XEN_ARGO_PAGE_DESCR_SIZE_4K;
+ */
+typedef uint64_t xen_argo_page_descr_t;
+#define XEN_ARGO_PAGE_DESCR_SIZE_MASK   0x0000000000000fffULL
+#define XEN_ARGO_PAGE_DESCR_SIZE_4K     0
+
 typedef struct xen_argo_addr
 {
     uint32_t port;
@@ -67,4 +88,54 @@ typedef struct xen_argo_ring
 #endif
 } xen_argo_ring_t;
 
+/*
+ * Messages on the ring are padded to a multiple of 128 bits.
+ * Len here refers to the exact length of the data, not including the
+ * 128-bit header. The message uses
+ * ((len + 0xf) & ~0xf) + sizeof(struct xen_argo_ring_message_header) bytes.
+ * Using typeof(a) makes clear that this does not truncate any high-order bits.
+ */
+#define XEN_ARGO_ROUNDUP(a) (((a) + 0xf) & ~(typeof(a))0xf)
+
+struct xen_argo_ring_message_header
+{
+    uint32_t len;
+    xen_argo_addr_t source;
+    uint32_t message_type;
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    uint8_t data[];
+#elif defined(__GNUC__)
+    uint8_t data[0];
+#endif
+};
+
+/*
+ * Hypercall operations
+ */
+
+/*
+ * XEN_ARGO_MESSAGE_OP_register_ring
+ *
+ * Register a ring using the indicated memory.
+ * Also used to reregister an existing ring (eg. after resume from sleep).
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_ring_t)
+ * arg2: XEN_GUEST_HANDLE(xen_argo_page_descr_t)
+ * arg3: unsigned long npages
+ * arg4: unsigned long flags
+ */
+#define XEN_ARGO_MESSAGE_OP_register_ring     1
+
+/* Register op flags */
+/*
+ * Fail exist:
+ * If set, reject attempts to (re)register an existing established ring.
+ * If clear, reregistration occurs if the ring exists, with the new ring
+ * taking the place of the old, preserving tx_ptr if it remains valid.
+ */
+#define XEN_ARGO_REGISTER_FLAG_FAIL_EXIST  0x1
+
+/* Mask for all defined flags. unsigned long type so ok for both 32/64-bit */
+#define XEN_ARGO_REGISTER_FLAG_MASK 0x1UL
+
 #endif
-- 
2.7.4

