
Re: [Xen-devel] [PATCH] evtchn: clean last_vcpu_id on EVTCHNOP_reset to avoid crash



On 08/08/14 15:22, Vitaly Kuznetsov wrote:
> When EVTCHNOP_reset is performed, the last_vcpu_id attribute is not
> cleared by __evtchn_close(). If last_vcpu_id != 0 for a particular
> event channel and that event channel is used for event delivery
> (for another vcpu) before EVTCHNOP_init_control has been done for
> vcpu == last_vcpu_id, the following crash is observed:
> 
>  ...
>  (XEN) Xen call trace:
>  (XEN)    [<ffff82d080127785>] _spin_lock_irqsave+0x5/0x70
>  (XEN)    [<ffff82d0801097db>] evtchn_fifo_set_pending+0xdb/0x370
>  (XEN)    [<ffff82d080107146>] evtchn_send+0xd6/0x160
>  (XEN)    [<ffff82d080107df9>] do_event_channel_op+0x6a9/0x16c0
>  (XEN)    [<ffff82d0801ce800>] vmx_intr_assist+0x30/0x480
>  (XEN)    [<ffff82d080219e99>] syscall_enter+0xa9/0xae
> 
> This happens because lock_old_queue() does not check that the VCPU's
> control block exists, and after EVTCHNOP_reset they are all cleaned up.
> 
> I suggest we fix the issue twice: reset last_vcpu_id to 0 in __evtchn_close()
> and add an appropriate check to lock_old_queue(), as a lost event is much
> better than a hypervisor crash.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> ---
>  xen/common/event_channel.c | 3 +++
>  xen/common/event_fifo.c    | 9 +++++++++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> index a7becae..67b9d53 100644
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -578,6 +578,9 @@ static long __evtchn_close(struct domain *d1, int port1)
>      chn1->state          = ECS_FREE;
>      chn1->notify_vcpu_id = 0;
>  
> +    /* Reset last_vcpu_id to vcpu0 as control block can be freed */
> +    chn1->last_vcpu_id = 0;

This is broken if the event channel is closed and rebound while the
event is linked.

You can only safely clear chn->last_vcpu_id during evtchn_fifo_destroy().

You also need to clear last_priority.
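
Roughly, something like the sketch below. It uses simplified stand-in
structures rather than the real Xen ones: reset_domain, reset_evtchn and
reset_on_fifo_destroy() are hypothetical names, and resetting
last_priority to 0 is an assumption about the right initial value.

```c
#include <stddef.h>

/* Simplified stand-ins for illustration only, not the real Xen structures. */
struct reset_evtchn {
    unsigned int last_vcpu_id;   /* VCPU whose queue last held this event */
    unsigned int last_priority;  /* priority of the queue that last held it */
};

struct reset_domain {
    struct reset_evtchn *chn;    /* array of event channels */
    size_t nr_chn;               /* number of entries in chn */
};

/*
 * Hypothetical helper, meant to be called from the FIFO teardown path,
 * where no event can still be linked on an old queue. Both fields are
 * reset together so a later lookup never pairs a stale priority with
 * VCPU 0.
 */
void reset_on_fifo_destroy(struct reset_domain *d)
{
    size_t i;

    for ( i = 0; i < d->nr_chn; i++ )
    {
        d->chn[i].last_vcpu_id = 0;
        d->chn[i].last_priority = 0; /* assumed initial value */
    }
}
```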

> +
>      xsm_evtchn_close_post(chn1);
>  
>   out:
> diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
> index 51b4ff6..e4bef80 100644
> --- a/xen/common/event_fifo.c
> +++ b/xen/common/event_fifo.c
> @@ -61,6 +61,15 @@ static struct evtchn_fifo_queue *lock_old_queue(const struct domain *d,
>      for ( try = 0; try < 3; try++ )
>      {
>          v = d->vcpu[evtchn->last_vcpu_id];
> +
> +        if ( !v->evtchn_fifo )
> +        {
> +            gdprintk(XENLOG_ERR,
> +                     "domain %d vcpu %d has no control block!\n",
> +                     d->domain_id, v->vcpu_id);
> +            return NULL;
> +        }

I think this check needs to be in evtchn_fifo_init() to prevent the
event from being bound to a VCPU that does not have a control block.
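
As a rough illustration of refusing the bind up front, here is a sketch
with simplified stand-in types. bind_vcpu, bind_evtchn and
bind_to_vcpu() are hypothetical names, and the error codes are
assumptions, not what the real code would necessarily return.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for illustration only, not the real Xen structures. */
struct bind_vcpu {
    void *evtchn_fifo;           /* NULL until a control block is set up */
};

struct bind_evtchn {
    unsigned int notify_vcpu_id; /* VCPU that receives this event */
};

/*
 * Hypothetical bind-time check: reject the request while the target
 * VCPU has no control block, so the delivery path never has to
 * dereference a missing one. The channel is left untouched on failure.
 */
int bind_to_vcpu(const struct bind_vcpu *vcpus, size_t nr_vcpus,
                 struct bind_evtchn *chn, unsigned int vcpu_id)
{
    if ( vcpu_id >= nr_vcpus )
        return -EINVAL;          /* assumed error code */

    if ( vcpus[vcpu_id].evtchn_fifo == NULL )
        return -ENOENT;          /* assumed error code */

    chn->notify_vcpu_id = vcpu_id;
    return 0;
}
```

Failing at bind time means the guest sees an error immediately instead
of silently losing an event later.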

> +
>          old_q = &v->evtchn_fifo->queue[evtchn->last_priority];
>  
>          spin_lock_irqsave(&old_q->lock, *flags);

David
