
[Xen-devel] [PATCH v7 07/15] argo: implement the register op



The register op is used by a domain to register a region of memory for
receiving messages from either a specified other domain, or, if specifying a
wildcard, any domain.

This operation creates a mapping within Xen's private address space that
will remain resident for the lifetime of the ring. In subsequent commits,
the hypervisor will use this mapping to copy data from a sending domain into
this registered ring, making it accessible to the domain that registered the
ring to receive data.

Wildcard any-sender rings are disabled by default, and registration will be
refused with EPERM unless they have been specifically enabled with the
new mac-permissive flag that is added to the argo boot option here. The
default for wildcard rings is 'deny' because there is currently no means
to protect the ring from DoS by a noisy domain spamming the ring, which
affects other domains' ability to send to it. This will be addressed with
XSM policy controls in subsequent work.

Since denying access to any-sender rings is a significant functional
constraint, the new option "mac-permissive" for the argo bootparam
enables overriding this, e.g. "argo=1,mac-permissive=1".
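
The wildcard policy described above can be sketched in isolation as follows.
This is a minimal standalone illustration, not the code in the patch: the
function name check_partner_allowed is hypothetical, and DOMID_ANY_ASSUMED
stands in for XEN_ARGO_DOMID_ANY (which this patch defines as DOMID_INVALID;
the numeric value used here is an assumption taken from Xen's public headers).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: stand-in for XEN_ARGO_DOMID_ANY (== DOMID_INVALID). */
#define DOMID_ANY_ASSUMED 0x7FF4u

/*
 * Hypothetical helper: decide whether a ring naming this partner may be
 * registered. Only the wildcard case depends on the mac-permissive setting;
 * rings naming a specific partner are not affected by it.
 */
static int check_partner_allowed(uint16_t partner_id, bool mac_permissive)
{
    if ( partner_id == DOMID_ANY_ASSUMED && !mac_permissive )
        return -EPERM;

    return 0;
}
```

With the default setting (mac_permissive false), only the wildcard request is
refused; a request that names a specific partner domain passes this check.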

The p2m type of the memory supplied by the guest for the ring must be
p2m_ram_rw and the memory will be pinned as PGT_writable_page while the ring
is registered.

This hypercall op and its interface currently only support 4K-sized pages.
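
As a rough illustration of how the 4K page constraint interacts with the ring
length, the standalone sketch below mirrors the size checks that register_ring
applies in this patch. It is not the hypervisor code: the helper names are
hypothetical, and the two structure sizes are assumptions, since
xen_argo_ring_t and xen_argo_addr are defined elsewhere in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_4K   4096u
#define MSG_SLOT_SIZE  0x10u       /* XEN_ARGO_MSG_SLOT_SIZE in this patch */
#define MAX_RING_SIZE  0x1000000u  /* XEN_ARGO_MAX_RING_SIZE in this patch */

/* ASSUMPTIONS: sizes of structures whose full layout is not in this patch. */
#define ASSUMED_RING_HDR_SIZE 64u  /* sizeof(xen_argo_ring_t) */
#define ASSUMED_MSG_HDR_SIZE  16u  /* sizeof(struct xen_argo_ring_message_header) */

/* Round v up to the next multiple of m. */
static uint32_t roundup_to(uint32_t v, uint32_t m)
{
    assert(m != 0);
    return (v + m - 1) / m * m;
}

/* Pages needed for the ring header plus a ring of ring_len bytes. */
static uint32_t npages_ring(uint32_t ring_len)
{
    return roundup_to(roundup_to(ring_len, MSG_SLOT_SIZE) +
                      ASSUMED_RING_HDR_SIZE, PAGE_SIZE_4K) / PAGE_SIZE_4K;
}

/* True if len and npage would pass the size checks sketched here. */
static bool ring_len_acceptable(uint32_t len, uint32_t npage)
{
    /* One message header, one payload slot, and one spare slot. */
    uint32_t min_len = ASSUMED_MSG_HDR_SIZE + 2 * MSG_SLOT_SIZE;

    if ( len < min_len || len > MAX_RING_SIZE )
        return false;
    if ( len != roundup_to(len, MSG_SLOT_SIZE) )
        return false;

    return npages_ring(len) == npage;
}
```

Under these assumed sizes, a ring of exactly 4096 bytes needs two pages, because
the ring header occupies the start of the first page.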

Signed-off-by: Christopher Clark <christopher.clark6@xxxxxxxxxxxxxx>
Tested-by: Chris Patterson <pattersonc@xxxxxxxxxxxx>
---
v6 #09 Jan: add compat ABI
v6 #07 Jan: add argo_message_header to xlat.lst and invoke the CHECK
v6 #07 Jan: xlat.lst: move argo struct entries to alphabetical position
v5 #07 Roger: add BUILD_BUG_ON for MAX_RING_SIZE, PAGE_SIZE
v5 #07 Roger: gprintk(XENLOG_ERR,.. for denied existing ring
v5: add compat validation macros to primary source file: common/argo.c
v5 : convert hypercall arg structs to struct form for compat checking
v5: dropped external file for compat macros: common/compat/argo.c
v4 v3#07 Jan: shrink critical sections in register_ring
v4 v3#07 Jan: revise register flag MASK in header, note 32-bitness of args
v4 feedback: use standard data structures per common code, not loop macros
v4 Andrew: use the single argo command line option list
v4 #07 Jan: rewrite find_ring_mfn to use check_get_page_from_gfn
v4 #07 Roger: add FIXME to ring_map_page for vmap contiguous ring mapping
v3 #07 Jan: comment: minimum ring size is based on minimum-sized message
v3 #04 Andrew: reference CONFIG_ARGO in the command line documentation
v3 #07 Jan: register_ring: fold else, if into else-if to drop indent
v3 #07 Jan: remove no longer used guest_handle_is_aligned macros
v3 #07 Jan: remove dead code from find_ring_mfns
v3 #07 Jan: fix format string indention in printks
v3 #07 Jan: remove redundant bounds check on npage in find_ring_mfns
v3 #08 self/Roger: improve dprintk output in find_ring_info like find_send_info
v3 #07 Jan: rename ring_find_info to find_ring_info
v3 #07 Jan: use array_index_nospec in ring_map_page
v3 #07 Jan: fix numeric entries in printk format strings
v3 #7 Jan: drop unneeded parentheses from ROUNDUP_MESSAGE defn
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #03 meld compat check for hypercall arg register struct
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 feedback #07 Eric: fix header max ring size comment units
v3 feedback #04 Roger: mfn_mapping: void* instead of uint8_t*
v3 use %u for printing unsigned ints in find_ring_mfns
v3 feedback #04 Jan: uint32_t -> unsigned int for npage in register_ring
v3 feedback #04 Roger: drop npages struct member, calculate from len
v3 : register_ring: uint32_t -> unsigned int for private_tx_ptr
v3 feedback Roger/Jan: ASSERT currd is current->domain or use 'd' variable name
v3 feedback #07 Roger: use opt_argo_mac_permissive : a boolean opt
v3 feedback #04 Roger: reorder #includes to alphabetical order
v3 feedback #07 Roger: drop comment re: Intel EPT/AMD NPT for write-only mapping
v3 feedback #07 Roger: drop ptr arithmetic in update_tx_ptr, use ring struct cast
v3 feedback #07 Roger: drop newline in ring_map_page
v3 feedback #07 Roger: drop unneeded null check before xfree
v3 feedback #07 Roger: use return and drop out label in register_ring
v3 Stefano: add 4K page constraint to header file comment & commit msg
v3 Julien/Stefano: 4K granularity ok: use 64-bit gfns in register interface
v2 self: disallow ring resize via reregister
v2 feedback Jan: drop cookie, implement teardown
v2 feedback Jan: drop message from argo_message_op
v2 self: move hash_index function below locking comment
v2 self: OVERHAUL
v2 self/Jan: remove use of magic verification field and tidy up
v2 self: merge max and min ring size check clauses
v2 feedback v1#13 Roger: use OS-supplied roundup; drop from public header
v2 feedback #9, Jan: use the argo-mac bootparam at point of introduction
v2 feedback #9, Jan: rename boot opt variable to comply with convention
v2 feedback #9, Jan: rename the argo_mac bootparam to argo-mac
v2 feedback #9 Jan: document argo boot opt in xen-command-line.markdown
v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
v1 feedback Roger, Jan: drop argo prefix on static functions
v1 feedback Roger: s/pfn/gfn/ and retire always-64-bit type
v2. feedback Jan: document the argo-mac boot opt
v2. feedback Jan: simplify re-register, drop mappings
v1 #13 feedback Jan: revise use of guest_handle_okay vs __copy ops
v1 #13 feedback, Jan: register op : s/ECONNREFUSED/ESRCH/
v1 #5 (#13) feedback Paul: register op: use currd in do_message_op
v1 #13 feedback, Paul: register op: use mfn_eq comparator
v1 #5 (#13) feedback Paul: register op: use currd in argo_register_ring
v1 #13 feedback Paul: register op: whitespace, unsigned, bounds check
v1 #13 feedback Paul: use of hex in limit constant definition
v1 #13 feedback Paul, register op: set nmfns on loop termination
v1 #13 feedback Paul: register op: do/while -> gotos, reindent
v1 argo_ring_map_page: drop uint32_t for unsigned int
v1. #13 feedback Julien: use page descriptors instead of gpfns.
   - adds ABI support for pages with different granularity.
v1 feedback #13, Paul: adjust log level of message
v1 feedback #13, Paul: use gprintk for guest-triggered warning
v1 feedback #13, Paul: gprintk and XENLOG_DEBUG for ring registration
v1 feedback #13, Paul: use gprintk for errs in argo_ring_map_page
v1 feedback #13, Paul: use ENOMEM if global mapping fails
v1 feedback Paul: overflow check before shift
v1: add define for copy_field_to_guest_errno
v1: fix gprintk use for ARM as its defn dislikes split format strings
v1: use copy_field_to_guest_errno
v1 feedback #13, Jan: argo_hash_fn: no inline, rename, change type
v1 feedback #13, Paul, Jan: EFAULT -> ENOMEM in argo_ring_map_page
v1 feedback #13, Jan: rename page var in argo_ring_map_page
v1 feedback #13, Jan: switch uint8_t* to void* and drop cast
v1 feedback #13, Jan: switch memory barrier to smp_wmb
v1 feedback #13, Jan: make 'ring' comment comply with single-line style
v1 feedback #13, Jan: use xzalloc_array, drop loop NULL init
v1 feedback #13, Jan: init bool with false rather than 0
v1 feedback #13 Jan: use __copy; define and use __copy_field_to_guest_errno
v1 feedback #13, Jan: use xzalloc, drop individual init zeroes
v1 feedback #13, Jan: prefix public namespace with xen
v1 feedback #13, Jan: blank line after op case in do_argo_message_op
v1 self: reflow comment in argo_ring_map_page to within 80 char len
v1 feedback #13, Roger: use true not 1 in assign to update_tx_ptr bool
v1 feedback #21, Jan: fold in the array_index_nospec hash function guards
v1 feedback #18, Jan: fold the max ring count limit into the series
v1 self: use unsigned long type for XEN_ARGO_REGISTER_FLAG_MASK
v1: feedback #15 Jan: handle upper-halves of hypercall args
v1. feedback #13 Jan: add comment re: page alignment
v1. self: confirm ring magic presence in supplied page array
v1. feedback #13 Jan: add comment re: minimum ring size
v1. feedback #13 Roger: use ASSERT_UNREACHABLE
v1. feedback Roger: add comment to hash function
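
For readers following the re-registration path of register_ring in the diff
below, the tx_ptr recovery step can be sketched in isolation. This is a
simplified standalone illustration under stated assumptions: sanitize_tx_ptr
and roundup_slot are hypothetical names, and the atomic accesses and locking
used by the hypervisor are omitted.

```c
#include <stdint.h>

#define MSG_SLOT_SIZE 0x10u  /* XEN_ARGO_MSG_SLOT_SIZE in this patch */

/* Round v up to the next multiple of the message slot size. */
static uint32_t roundup_slot(uint32_t v)
{
    return (v + MSG_SLOT_SIZE - 1) & ~(MSG_SLOT_SIZE - 1);
}

/*
 * Hypothetical helper: keep the guest-supplied tx_ptr if it is inside the
 * ring and slot-aligned; otherwise restart at the next aligned slot at or
 * after rx_ptr, wrapping to 0 if that would fall outside the ring.
 */
static uint32_t sanitize_tx_ptr(uint32_t tx_ptr, uint32_t rx_ptr, uint32_t len)
{
    if ( tx_ptr < len && roundup_slot(tx_ptr) == tx_ptr )
        return tx_ptr;

    tx_ptr = roundup_slot(rx_ptr);
    if ( tx_ptr >= len )
        tx_ptr = 0;

    return tx_ptr;
}
```

A valid tx_ptr is returned unchanged, which is what allows a ring to be
re-registered (for example after resume from hibernate) without discarding
its position.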

 docs/misc/xen-command-line.pandoc |   9 +-
 xen/common/argo.c                 | 527 +++++++++++++++++++++++++++++++++++++-
 xen/include/public/argo.h         |  70 +++++
 xen/include/xlat.lst              |   2 +
 4 files changed, 606 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 605c544..c8d1ced 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -183,7 +183,7 @@ in combination with cpuidle.  This option is only expected to be useful for
 developers wishing Xen to fall back to older timing methods on newer hardware.
 
 ### argo
-    = List of [ <bool> ]
+    = List of [ <bool>, mac-permissive=<bool> ]
 
 Controls for the Argo hypervisor-mediated interdomain communication service.
 
@@ -195,6 +195,13 @@ point of authority.  Guests may register memory rings to recieve messages,
 query the status of other domains, and send messages by hypercall, all subject
 to appropriate auditing by Xen.  Argo is disabled by default.
 
+*   The `mac-permissive` boolean controls whether wildcard receive rings may be
+    registered (`mac-permissive=1`) or may not be registered
+    (`mac-permissive=0`).
+
+    This option is disabled by default, to protect domains from a DoS by a
+    buggy or malicious other domain spamming the ring.
+
 ### asid (x86)
 > `= <boolean>`
 
diff --git a/xen/common/argo.c b/xen/common/argo.c
index 9f2d2e5..54256ae 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -22,6 +22,7 @@
 #include <xen/errno.h>
 #include <xen/event.h>
 #include <xen/guest_access.h>
+#include <xen/lib.h>
 #include <xen/nospec.h>
 #include <xen/sched.h>
 #include <xen/time.h>
@@ -31,13 +32,32 @@
 #ifdef CONFIG_COMPAT
 #include <compat/argo.h>
 CHECK_argo_addr;
+#undef CHECK_argo_addr
+#define CHECK_argo_addr struct xen_argo_addr
+CHECK_argo_register_ring;
 CHECK_argo_ring;
+CHECK_argo_ring_message_header;
 #endif
 
+#define MAX_RINGS_PER_DOMAIN            128U
+
+/* All messages on the ring are padded to a multiple of the slot size. */
+#define ROUNDUP_MESSAGE(a) ROUNDUP((a), XEN_ARGO_MSG_SLOT_SIZE)
+
+/* Number of PAGEs needed to hold a ring of a given size in bytes */
+#define NPAGES_RING(ring_len) \
+    (ROUNDUP((ROUNDUP_MESSAGE(ring_len) + sizeof(xen_argo_ring_t)), PAGE_SIZE) \
+     >> PAGE_SHIFT)
+
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+#ifdef CONFIG_COMPAT
+DEFINE_XEN_GUEST_HANDLE(compat_pfn_t);
+#endif
 
 static bool __read_mostly opt_argo;
+static bool __read_mostly opt_argo_mac_permissive;
 
 static int __init parse_argo(const char *s)
 {
@@ -51,6 +71,8 @@ static int __init parse_argo(const char *s)
 
         if ( (val = parse_bool(s, ss)) >= 0 )
             opt_argo = val;
+        else if ( (val = parse_boolean("mac-permissive", s, ss)) >= 0 )
+            opt_argo_mac_permissive = val;
         else
             rc = -EINVAL;
 
@@ -366,6 +388,74 @@ ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
     }
 }
 
+static int
+ring_map_page(const struct domain *d, struct argo_ring_info *ring_info,
+              unsigned int i, void **out_ptr)
+{
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    /*
+     * FIXME: Investigate using vmap to create a single contiguous virtual
+     * address space mapping of the ring instead of using the array of single
+     * page mappings.
+     * Affects logic in memcpy_to_guest_ring, the mfn_mapping array data
+     * structure, and places where ring mappings are added or removed.
+     */
+
+    if ( i >= ring_info->nmfns )
+    {
+        gprintk(XENLOG_ERR,
+               "argo: ring (vm%u:%x vm%u) %p attempted to map page %u of %u\n",
+                ring_info->id.domain_id, ring_info->id.aport,
+                ring_info->id.partner_id, ring_info, i, ring_info->nmfns);
+        return -ENOMEM;
+    }
+    i = array_index_nospec(i, ring_info->nmfns);
+
+    if ( !ring_info->mfns || !ring_info->mfn_mapping)
+    {
+        ASSERT_UNREACHABLE();
+        ring_info->len = 0;
+        return -ENOMEM;
+    }
+
+    if ( !ring_info->mfn_mapping[i] )
+    {
+        ring_info->mfn_mapping[i] = map_domain_page_global(ring_info->mfns[i]);
+        if ( !ring_info->mfn_mapping[i] )
+        {
+            gprintk(XENLOG_ERR, "argo: ring (vm%u:%x vm%u) %p attempted to map "
+                    "page %u of %u\n",
+                    ring_info->id.domain_id, ring_info->id.aport,
+                    ring_info->id.partner_id, ring_info, i, ring_info->nmfns);
+            return -ENOMEM;
+        }
+        argo_dprintk("mapping page %"PRI_mfn" to %p\n",
+                     mfn_x(ring_info->mfns[i]), ring_info->mfn_mapping[i]);
+    }
+
+    if ( out_ptr )
+        *out_ptr = ring_info->mfn_mapping[i];
+
+    return 0;
+}
+
+static void
+update_tx_ptr(const struct domain *d, struct argo_ring_info *ring_info,
+              uint32_t tx_ptr)
+{
+    xen_argo_ring_t *ringp;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+    ASSERT(ring_info->mfn_mapping[0]);
+
+    ring_info->tx_ptr = tx_ptr;
+    ringp = ring_info->mfn_mapping[0];
+
+    write_atomic(&ringp->tx_ptr, tx_ptr);
+    smp_wmb();
+}
+
 static void
 wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
 {
@@ -530,11 +620,400 @@ partner_rings_remove(struct domain *src_d)
     }
 }
 
+static int
+find_ring_mfn(struct domain *d, gfn_t gfn, mfn_t *mfn)
+{
+    struct page_info *page;
+    p2m_type_t p2mt;
+    int ret;
+
+    ret = check_get_page_from_gfn(d, gfn, false, &p2mt, &page);
+    if ( unlikely(ret) )
+        return ret;
+
+    *mfn = page_to_mfn(page);
+    if ( !mfn_valid(*mfn) )
+        ret = -EINVAL;
+#ifdef CONFIG_X86
+    else if ( p2mt == p2m_ram_logdirty )
+        ret = -EAGAIN;
+#endif
+    else if ( (p2mt != p2m_ram_rw) ||
+              !get_page_and_type(page, d, PGT_writable_page) )
+        ret = -EINVAL;
+
+    put_page(page);
+
+    return ret;
+}
+
+static int
+copy_gfn_from_handle(XEN_GUEST_HANDLE_PARAM(void) gfn_hnd, bool compat,
+                     unsigned int i, gfn_t *out_gfn)
+{
+    int ret;
+
+#ifdef CONFIG_COMPAT
+    if ( compat )
+    {
+        XEN_GUEST_HANDLE_PARAM(compat_pfn_t) c_gfn_hnd =
+            guest_handle_cast(gfn_hnd, compat_pfn_t);
+        compat_pfn_t c_gfn;
+
+        ret = __copy_from_guest_offset(&c_gfn, c_gfn_hnd, i, 1) ? -EFAULT : 0;
+        *out_gfn = _gfn(c_gfn);
+    }
+    else
+    {
+#endif
+        XEN_GUEST_HANDLE_PARAM(xen_pfn_t) x_gfn_hnd =
+            guest_handle_cast(gfn_hnd, xen_pfn_t);
+        xen_pfn_t x_gfn;
+
+        ret = __copy_from_guest_offset(&x_gfn, x_gfn_hnd, i, 1) ? -EFAULT : 0;
+        *out_gfn = _gfn(x_gfn);
+#ifdef CONFIG_COMPAT
+    }
+#endif
+    return ret;
+}
+
+static int
+find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
+               const unsigned int npage,
+               XEN_GUEST_HANDLE_PARAM(void) gfn_hnd,
+               const unsigned int len, bool compat)
+{
+    unsigned int i;
+    int ret = 0;
+    mfn_t *mfns;
+    void **mfn_mapping;
+
+    ASSERT(LOCKING_Write_rings_L2(d));
+
+    if ( ring_info->mfns )
+    {
+        /* Ring already existed: drop the previous mapping. */
+        gprintk(XENLOG_INFO, "argo: vm%u re-register existing ring "
+                "(vm%u:%x vm%u) clears mapping\n",
+                d->domain_id, ring_info->id.domain_id,
+                ring_info->id.aport, ring_info->id.partner_id);
+
+        ring_remove_mfns(d, ring_info);
+        ASSERT(!ring_info->mfns);
+    }
+
+    mfns = xmalloc_array(mfn_t, npage);
+    if ( !mfns )
+        return -ENOMEM;
+
+    for ( i = 0; i < npage; i++ )
+        mfns[i] = INVALID_MFN;
+
+    mfn_mapping = xzalloc_array(void *, npage);
+    if ( !mfn_mapping )
+    {
+        xfree(mfns);
+        return -ENOMEM;
+    }
+
+    ring_info->mfns = mfns;
+    ring_info->mfn_mapping = mfn_mapping;
+
+    for ( i = 0; i < npage; i++ )
+    {
+        mfn_t mfn;
+        gfn_t gfn;
+
+        ret = copy_gfn_from_handle(gfn_hnd, compat, i, &gfn);
+        if ( ret )
+            break;
+
+        ret = find_ring_mfn(d, gfn, &mfn);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR, "argo: vm%u: invalid gfn %"PRI_gfn" "
+                    "r:(vm%u:%x vm%u) %p %u/%u\n",
+                    d->domain_id, gfn_x(gfn),
+                    ring_info->id.domain_id, ring_info->id.aport,
+                    ring_info->id.partner_id, ring_info, i, npage);
+            break;
+        }
+
+        ring_info->mfns[i] = mfn;
+
+        argo_dprintk("%u: %"PRI_gfn" -> %"PRI_mfn"\n",
+                     i, gfn_x(gfn), mfn_x(ring_info->mfns[i]));
+    }
+
+    ring_info->nmfns = i;
+
+    if ( ret )
+        ring_remove_mfns(d, ring_info);
+    else
+    {
+        ASSERT(ring_info->nmfns == NPAGES_RING(len));
+
+        gprintk(XENLOG_DEBUG, "argo: vm%u ring (vm%u:%x vm%u) %p "
+                "mfn_mapping %p len %u nmfns %u\n",
+                d->domain_id, ring_info->id.domain_id,
+                ring_info->id.aport, ring_info->id.partner_id, ring_info,
+                ring_info->mfn_mapping, ring_info->len, ring_info->nmfns);
+    }
+
+    return ret;
+}
+
+static long
+register_ring(struct domain *currd,
+              XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd,
+              XEN_GUEST_HANDLE_PARAM(void) gfn_hnd,
+              unsigned int npage, unsigned int flags, bool compat)
+{
+    xen_argo_register_ring_t reg;
+    struct argo_ring_id ring_id;
+    void *map_ringp;
+    xen_argo_ring_t *ringp;
+    struct argo_ring_info *ring_info, *new_ring_info = NULL;
+    struct argo_send_info *send_info = NULL;
+    struct domain *dst_d = NULL;
+    int ret = 0;
+    unsigned int private_tx_ptr;
+
+    ASSERT(currd == current->domain);
+
+    /* flags: reserve currently-undefined bits, require zero.  */
+    if ( unlikely(flags & ~XEN_ARGO_REGISTER_FLAG_MASK) )
+        return -EINVAL;
+
+    if ( copy_from_guest(&reg, reg_hnd, 1) )
+        return -EFAULT;
+
+    /*
+     * A ring must be large enough to transmit messages, so requires space for:
+     * * 1 message header, plus
+     * * 1 payload slot (payload is always rounded to a multiple of 16 bytes)
+     *   for the message payload to be written into, plus
+     * * 1 more slot, so that the ring cannot be filled to capacity with a
+     *   single minimum-size message -- see the logic in ringbuf_insert --
+     *   allowing for this ensures that there can be space remaining when a
+     *   message is present.
+     * The above determines the minimum acceptable ring size.
+     */
+    if ( (reg.len < (sizeof(struct xen_argo_ring_message_header)
+                      + ROUNDUP_MESSAGE(1) + ROUNDUP_MESSAGE(1))) ||
+         (reg.len > XEN_ARGO_MAX_RING_SIZE) ||
+         (reg.len != ROUNDUP_MESSAGE(reg.len)) ||
+         (NPAGES_RING(reg.len) != npage) ||
+         (reg.pad != 0) )
+        return -EINVAL;
+
+    ring_id.partner_id = reg.partner_id;
+    ring_id.aport = reg.aport;
+    ring_id.domain_id = currd->domain_id;
+
+    if ( reg.partner_id == XEN_ARGO_DOMID_ANY )
+    {
+        if ( !opt_argo_mac_permissive )
+            return -EPERM;
+    }
+    else
+    {
+        dst_d = get_domain_by_id(reg.partner_id);
+        if ( !dst_d )
+        {
+            argo_dprintk("!dst_d, ESRCH\n");
+            return -ESRCH;
+        }
+
+        send_info = xzalloc(struct argo_send_info);
+        if ( !send_info )
+        {
+            ret = -ENOMEM;
+            goto out;
+        }
+        send_info->id = ring_id;
+    }
+
+    /*
+     * Common case is that the ring doesn't already exist, so do the alloc here
+     * before picking up any locks.
+     */
+    new_ring_info = xzalloc(struct argo_ring_info);
+    if ( !new_ring_info )
+    {
+        ret = -ENOMEM;
+        goto out;
+    }
+
+    read_lock(&L1_global_argo_rwlock);
+
+    if ( !currd->argo )
+    {
+        ret = -ENODEV;
+        goto out_unlock;
+    }
+
+    if ( dst_d && !dst_d->argo )
+    {
+        argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
+        ret = -ECONNREFUSED;
+        goto out_unlock;
+    }
+
+    write_lock(&currd->argo->rings_L2_rwlock);
+
+    if ( currd->argo->ring_count >= MAX_RINGS_PER_DOMAIN )
+    {
+        ret = -ENOSPC;
+        goto out_unlock2;
+    }
+
+    ring_info = find_ring_info(currd, &ring_id);
+    if ( !ring_info )
+    {
+        ring_info = new_ring_info;
+        new_ring_info = NULL;
+
+        spin_lock_init(&ring_info->L3_lock);
+
+        ring_info->id = ring_id;
+        INIT_LIST_HEAD(&ring_info->pending);
+
+        list_add(&ring_info->node,
+                 &currd->argo->ring_hash[hash_index(&ring_info->id)]);
+
+        gprintk(XENLOG_DEBUG, "argo: vm%u registering ring (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+    }
+    else if ( ring_info->len )
+    {
+        /*
+         * If the caller specified that the ring must not already exist,
+         * fail at attempt to add a completed ring which already exists.
+         */
+        if ( flags & XEN_ARGO_REGISTER_FLAG_FAIL_EXIST )
+        {
+            gprintk(XENLOG_ERR, "argo: vm%u disallowed reregistration of "
+                    "existing ring (vm%u:%x vm%u)\n",
+                    currd->domain_id, ring_id.domain_id, ring_id.aport,
+                    ring_id.partner_id);
+            ret = -EEXIST;
+            goto out_unlock2;
+        }
+
+        if ( ring_info->len != reg.len )
+        {
+            /*
+             * Change of ring size could result in entries on the pending
+             * notifications list that will never trigger.
+             * Simple blunt solution: disallow ring resize for now.
+             * TODO: investigate enabling ring resize.
+             */
+            gprintk(XENLOG_ERR, "argo: vm%u attempted to change ring size "
+                    "(vm%u:%x vm%u)\n",
+                    currd->domain_id, ring_id.domain_id, ring_id.aport,
+                    ring_id.partner_id);
+            /*
+             * Could return EINVAL here, but if the ring didn't already
+             * exist then the arguments would have been valid, so: EEXIST.
+             */
+            ret = -EEXIST;
+            goto out_unlock2;
+        }
+
+        gprintk(XENLOG_DEBUG,
+                "argo: vm%u re-registering existing ring (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+    }
+
+    ret = find_ring_mfns(currd, ring_info, npage, gfn_hnd, reg.len, compat);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u failed to find ring mfns (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+
+        ring_remove_info(currd, ring_info);
+        goto out_unlock2;
+    }
+
+    /*
+     * The first page of the memory supplied for the ring has the xen_argo_ring
+     * structure at its head, which is where the ring indexes reside.
+     */
+    ret = ring_map_page(currd, ring_info, 0, &map_ringp);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u failed to map ring mfn 0 (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+
+        ring_remove_info(currd, ring_info);
+        goto out_unlock2;
+    }
+    ringp = map_ringp;
+
+    private_tx_ptr = read_atomic(&ringp->tx_ptr);
+
+    if ( (private_tx_ptr >= reg.len) ||
+         (ROUNDUP_MESSAGE(private_tx_ptr) != private_tx_ptr) )
+    {
+        /*
+         * Since the ring is a mess, attempt to flush the contents of it
+         * here by setting the tx_ptr to the next aligned message slot past
+         * the latest rx_ptr we have observed. Handle ring wrap correctly.
+         */
+        private_tx_ptr = ROUNDUP_MESSAGE(read_atomic(&ringp->rx_ptr));
+
+        if ( private_tx_ptr >= reg.len )
+            private_tx_ptr = 0;
+
+        update_tx_ptr(currd, ring_info, private_tx_ptr);
+    }
+
+    ring_info->tx_ptr = private_tx_ptr;
+    ring_info->len = reg.len;
+    currd->argo->ring_count++;
+
+    if ( send_info )
+    {
+        spin_lock(&dst_d->argo->send_L2_lock);
+
+        list_add(&send_info->node,
+                 &dst_d->argo->send_hash[hash_index(&send_info->id)]);
+
+        spin_unlock(&dst_d->argo->send_L2_lock);
+    }
+
+ out_unlock2:
+    write_unlock(&currd->argo->rings_L2_rwlock);
+
+ out_unlock:
+    read_unlock(&L1_global_argo_rwlock);
+
+ out:
+    if ( dst_d )
+        put_domain(dst_d);
+
+    if ( ret )
+        xfree(send_info);
+
+    xfree(new_ring_info);
+
+    return ret;
+}
+
 long
 do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
            XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
            unsigned long arg4)
 {
+    struct domain *currd = current->domain;
     long rc = -EFAULT;
 
     argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
@@ -545,6 +1024,31 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
 
     switch (cmd)
     {
+    case XEN_ARGO_OP_register_ring:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd =
+            guest_handle_cast(arg1, xen_argo_register_ring_t);
+        /* arg2: gfn_hnd, arg3: npage, arg4: flags */
+
+        BUILD_BUG_ON(!IS_ALIGNED(XEN_ARGO_MAX_RING_SIZE, PAGE_SIZE));
+
+        if ( unlikely(arg3 > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+        /*
+         * Check access to the whole array here so we can use the faster __copy
+         * operations to read each element later.
+         */
+        if ( unlikely(!guest_handle_okay(guest_handle_cast(arg2, xen_pfn_t),
+                                         arg3)) )
+            break;
+
+        rc = register_ring(currd, reg_hnd, arg2, arg3, arg4, false);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
@@ -561,6 +1065,7 @@ compat_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
                XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
                unsigned long arg4)
 {
+    struct domain *currd = current->domain;
     long rc = -EFAULT;
 
     argo_dprintk("->compat_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
@@ -571,6 +1076,27 @@ compat_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
 
     switch (cmd)
     {
+    case XEN_ARGO_OP_register_ring:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd =
+            guest_handle_cast(arg1, xen_argo_register_ring_t);
+        /* arg2: gfn_hnd, arg3: npage, arg4: flags */
+
+        if ( unlikely(arg3 > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        if ( unlikely(
+            !guest_handle_okay(guest_handle_cast(arg2, compat_pfn_t), arg3)) )
+            break;
+
+        rc = register_ring(currd, reg_hnd, arg2, arg3, arg4, true);
+        break;
+    }
+
+
     default:
         rc = -EOPNOTSUPP;
         break;
@@ -579,7 +1105,6 @@ compat_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
     argo_dprintk("<-compat_argo_op(%u)=%ld\n", cmd, rc);
 
     return rc;
-
 }
 #endif
 
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index 530bb82..667a8ba 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -33,6 +33,13 @@
 
 #define XEN_ARGO_DOMID_ANY       DOMID_INVALID
 
+/*
+ * The maximum size of an Argo ring is defined to be: 16MB
+ *  -- which is 0x1000000 bytes.
+ * A byte index into the ring is at most 24 bits.
+ */
+#define XEN_ARGO_MAX_RING_SIZE  (0x1000000ULL)
+
 /* Fixed-width type for "argo port" number. Nothing to do with evtchns. */
 typedef uint32_t xen_argo_port_t;
 
@@ -61,4 +68,67 @@ typedef struct xen_argo_ring
 #endif
 } xen_argo_ring_t;
 
+typedef struct xen_argo_register_ring
+{
+    xen_argo_port_t aport;
+    domid_t partner_id;
+    uint16_t pad;
+    uint32_t len;
+} xen_argo_register_ring_t;
+
+/* Messages on the ring are padded to a multiple of this size. */
+#define XEN_ARGO_MSG_SLOT_SIZE 0x10
+
+struct xen_argo_ring_message_header
+{
+    uint32_t len;
+    struct xen_argo_addr source;
+    uint32_t message_type;
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    uint8_t data[];
+#elif defined(__GNUC__)
+    uint8_t data[0];
+#endif
+};
+
+/*
+ * Hypercall operations
+ */
+
+/*
+ * XEN_ARGO_OP_register_ring
+ *
+ * Register a ring using the guest-supplied memory pages.
+ * Also used to reregister an existing ring (eg. after resume from hibernate).
+ *
+ * The first argument struct indicates the port number for the ring to register
+ * and the partner domain, if any, that is to be allowed to send to the ring.
+ * A wildcard (XEN_ARGO_DOMID_ANY) may be supplied instead of a partner domid,
+ * and if the hypervisor has wildcard sender rings enabled, this will allow
+ * any domain (XSM notwithstanding) to send to the ring.
+ *
+ * The second argument is an array of guest frame numbers and the third argument
+ * indicates the size of the array. This operation only supports 4K-sized pages.
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_register_ring_t)
+ * arg2: XEN_GUEST_HANDLE(xen_pfn_t)
+ * arg3: unsigned long npages
+ * arg4: unsigned long flags (32-bit value)
+ */
+#define XEN_ARGO_OP_register_ring     1
+
+/* Register op flags */
+/*
+ * Fail exist:
+ * If set, reject attempts to (re)register an existing established ring.
+ * If clear, reregistration occurs if the ring exists, with the new ring
+ * taking the place of the old, preserving tx_ptr if it remains valid.
+ */
+#define XEN_ARGO_REGISTER_FLAG_FAIL_EXIST  0x1
+
+#ifdef __XEN__
+/* Mask for all defined flags. */
+#define XEN_ARGO_REGISTER_FLAG_MASK XEN_ARGO_REGISTER_FLAG_FAIL_EXIST
+#endif
+
 #endif
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 16601d9..349fbad 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -31,7 +31,9 @@
 !      mc_physcpuinfo                  arch-x86/xen-mca.h
 ?      page_offline_action             arch-x86/xen-mca.h
 ?      argo_addr                       argo.h
+?      argo_register_ring              argo.h
 ?      argo_ring                       argo.h
+?      argo_ring_message_header        argo.h
 ?      evtchn_alloc_unbound            event_channel.h
 ?      evtchn_bind_interdomain         event_channel.h
 ?      evtchn_bind_ipi                 event_channel.h
-- 
2.7.4


