
Re: [Xen-devel] [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq



On Wed, Jan 16, 2019 at 10:48:52PM -0800, Christopher Clark wrote:
> On Tue, Jan 15, 2019 at 7:49 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
> >
> > On Tue, Jan 15, 2019 at 01:27:41AM -0800, Christopher Clark wrote:
> > > The sendv operation performs a synchronous send of the buffers
> > > contained in iovs to a remote domain's registered ring.
> > >
> > > It takes:
> > >  * A destination address (domid, port) for the ring to send to.
> > >    It performs a most-specific match lookup, to allow for wildcards.
> > >  * A source address, used to inform the destination of where to reply.
> > >  * The address of an array of iovs containing the data to send
> > >  * .. and the length of that array of iovs
> > >  * and a 32-bit message type, available to communicate message context
> > >    data (e.g. kernel-to-kernel, separate from the application data).
> > >
> > > If insufficient space exists in the destination ring, it will return
> > > -EAGAIN and Xen will notify the caller when sufficient space becomes
> > > available.
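
To make the calling convention concrete, here is roughly what the guest
side of sendv looks like. This is only a sketch: the struct and field
names follow my reading of the public header in this series,
HYPERVISOR_argo_op stands in for a Linux-style hypercall wrapper, and
wait_for_ring_space() is a made-up placeholder:

    xen_argo_iov_t iov;
    xen_argo_send_addr_t send_addr = {
        .src = { .aport = reply_port, .domain_id = XEN_ARGO_DOMID_ANY },
        .dst = { .aport = dst_port,   .domain_id = dst_domid },
    };
    int ret;

    set_xen_guest_handle(iov.iov_hnd, payload);   /* data to transmit */
    iov.iov_len = payload_len;

    ret = HYPERVISOR_argo_op(XEN_ARGO_OP_sendv, &send_addr, &iov,
                             1 /* niov */, message_type);
    if ( ret == -EAGAIN )
        wait_for_ring_space();  /* ring full: retry after the VIRQ fires */
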
> > >
> > > Accesses to the ring indices are appropriately atomic. The rings are
> > > mapped into Xen's private address space to write as needed and the
> > > mappings are retained for later use.
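
For "appropriately atomic", the sort of pattern involved is to snapshot
the guest-writable index once with read_atomic() and then validate the
snapshot before using it, along these lines (names illustrative):

    uint32_t rx_ptr = read_atomic(&ringp->rx_ptr);

    if ( rx_ptr >= ring_info->len )   /* guest-controlled: bound-check */
        return -EIO;
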
> > >
> > > Fixed-size types are used in some areas within this code where caution
> > > around avoiding integer overflow is important.
> > >
> > > Notifications are sent to guests via VIRQ and send_guest_global_virq is
> > > exposed in the change to enable argo to call it. VIRQ_ARGO_MESSAGE is
> > > claimed from the VIRQ previously reserved for this purpose (#11).
> > >
> > > The VIRQ notification method is used rather than sending events using
> > > evtchn functions directly because:
> > >
> > > * no current event channel type is an exact fit for the intended
> > >   behaviour. ECS_IPI is closest, but it disallows migration to
> > >   other VCPUs, which is not necessarily a requirement for Argo.
> > >
> > > * at the point of argo_init, allocation of an event channel is
> > >   complicated by none of the guest VCPUs being initialized yet,
> > >   and the event channel logic expects a valid event channel to
> > >   have a present VCPU.
> > >
> > > * at the point of signalling a notification, the VIRQ logic is already
> > >   defensive: if d->vcpu[0] is NULL, the notification is just silently
> > >   dropped, whereas the evtchn_send logic is not so defensive: vcpu[0]
> > >   must not be NULL, otherwise a null pointer dereference occurs.
> > >
> > > Using a VIRQ removes the need for the guest to query which event
> > > channel notifications will be delivered on. This is also likely to
> > > simplify establishing future L0/L1 nested hypervisor argo communication.
> > >
> > > Signed-off-by: Christopher Clark <christopher.clark6@xxxxxxxxxxxxxx>
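
To illustrate the point above about the guest not needing to query:
with a fixed global VIRQ a Linux guest can bind the notification with
the standard helper and nothing else. A minimal sketch, assuming the
usual VIRQ binding API (argo_interrupt is a hypothetical handler):

    /* Global VIRQs are delivered via vcpu 0; no port discovery needed. */
    irq = bind_virq_to_irqhandler(VIRQ_ARGO_MESSAGE, 0 /* cpu */,
                                  argo_interrupt, 0, "xen-argo", NULL);
    if ( irq < 0 )
        return irq;
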
> >
> > LGTM, one question below and one comment.
> >
> > > +static int
> > > +ringbuf_insert(const struct domain *d, struct argo_ring_info *ring_info,
> > > +               const struct argo_ring_id *src_id,
> > > +               XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd,
> > > +               unsigned long niov, uint32_t message_type,
> > > +               unsigned long *out_len)
> > > +{
> > > +    xen_argo_ring_t ring;
> > > +    struct xen_argo_ring_message_header mh = { };
> > > +    int32_t sp;
> > > +    int32_t ret;
> > > +    uint32_t len = 0;
> > > +    xen_argo_iov_t iovs[XEN_ARGO_MAXIOV];
> > > +    xen_argo_iov_t *piov;
> > > +    XEN_GUEST_HANDLE(uint8_t) NULL_hnd =
> > > +       guest_handle_from_param(guest_handle_from_ptr(NULL, uint8_t), uint8_t);
> > > +
> > > +    ASSERT(LOCKING_L3(d, ring_info));
> > > +
> > > +    ret = __copy_from_guest(iovs, iovs_hnd, niov) ? -EFAULT : 0;
> > > +    if ( ret )
> > > +        return ret;
> > > +
> > > +    /*
> > > +     * Obtain the total size of data to transmit -- sets the 'len' variable
> > > +     * -- and sanity check that the iovs conform to size and number limits.
> > > +     * Enforced below: no more than 'len' bytes of guest data
> > > +     * (plus the message header) will be sent in this operation.
> > > +     */
> > > +    ret = iov_count(iovs, niov, &len);
> > > +    if ( ret )
> > > +        return ret;
> > > +
> > > +    /*
> > > +     * Size bounds check against ring size and static maximum message limit.
> > > +     * The message must not fill the ring; there must be at least one slot
> > > +     * remaining so we can distinguish a full ring from an empty one.
> > > +     */
> >
> > NB: I think if you didn't wrap the ring indexes (so always increasing
> > them) you could always identify an empty ring from a full ring, and
> > you wouldn't require always having at least one empty slot, unless I'm
> > missing something.
> 
> I haven't yet worked it through to be sure on this one, since the
> rx_ptr is under guest control (so a guest could possibly force the
> tx_ptr to be incremented at some high rate?), but I can see that it
> might be possible, yes.

AFAICT this is part of the public protocol, so there's not much that
can be done to change it now, I guess?

Shared memory rings used by most of the PV devices don't wrap the ring
indexes to the size of the ring, and the consumer is always chasing
the producer index. This avoids losing one slot. Anyway, I thought it
was worth mentioning for reference.
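
To spell the scheme out, here is a minimal sketch of the free-running
index convention (along the lines of xen/include/public/io/ring.h, but
with illustrative names): the indices only ever increase and are masked
when addressing a slot, so prod == cons means empty, prod - cons ==
size means full, and every slot is usable:

    #include <stdbool.h>
    #include <stdint.h>

    #define RING_SIZE 64                        /* must be a power of two */
    #define RING_MASK(i) ((i) & (RING_SIZE - 1))

    struct ring {
        unsigned int prod, cons;                /* free-running indices */
        uint8_t buf[RING_SIZE];
    };

    static bool ring_empty(const struct ring *r)
    {
        return r->prod == r->cons;
    }

    static bool ring_full(const struct ring *r)
    {
        /* unsigned arithmetic stays correct across index wraparound */
        return r->prod - r->cons == RING_SIZE;
    }

    static bool ring_put(struct ring *r, uint8_t b)
    {
        if ( ring_full(r) )
            return false;
        r->buf[RING_MASK(r->prod)] = b;         /* mask only when indexing */
        r->prod++;
        return true;
    }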

Thanks, Roger.
