Xen project Mailing List

Re: [Xen-devel] [PATCH] evtchn: clean last_vcpu_id on EVTCHNOP_reset to avoid crash

To: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: David Vrabel <david.vrabel@xxxxxxxxxx>

Date: Fri, 8 Aug 2014 16:05:46 +0100

Delivery-date: Fri, 08 Aug 2014 15:05:59 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 08/08/14 15:22, Vitaly Kuznetsov wrote: > When EVTCHNOP_reset is being performed last_vcpu_id attribute is not being > cleaned by __evtchn_close(). In case last_vcpu_id != 0 for a particular > event channel and this event channel is going to be used for event delivery > (for another vcpu) before EVTCHNOP_init_control for vcpu == last_vcpu_id > was done the following crash is observed: > > ... > (XEN) Xen call trace: > (XEN) [<ffff82d080127785>] _spin_lock_irqsave+0x5/0x70 > (XEN) [<ffff82d0801097db>] evtchn_fifo_set_pending+0xdb/0x370 > (XEN) [<ffff82d080107146>] evtchn_send+0xd6/0x160 > (XEN) [<ffff82d080107df9>] do_event_channel_op+0x6a9/0x16c0 > (XEN) [<ffff82d0801ce800>] vmx_intr_assist+0x30/0x480 > (XEN) [<ffff82d080219e99>] syscall_enter+0xa9/0xae > > This happens because lock_old_queue() does not check VCPU's control > block existence and after EVTCHNOP_reset they are all cleaned. > > I suggest we fix the issue twice: reset last_vcpu_id to 0 in __evtchn_close() > and add appropriate check to lock_old_queue() as lost event is much better > than hypervisor crash. > > Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> > --- > xen/common/event_channel.c | 3 +++ > xen/common/event_fifo.c | 9 +++++++++ > 2 files changed, 12 insertions(+) > > diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c > index a7becae..67b9d53 100644 > --- a/xen/common/event_channel.c > +++ b/xen/common/event_channel.c > @@ -578,6 +578,9 @@ static long __evtchn_close(struct domain *d1, int port1) > chn1->state = ECS_FREE; > chn1->notify_vcpu_id = 0; > > + /* Reset last_vcpu_id to vcpu0 as control block can be freed */ > + chn1->last_vcpu_id = 0; This is broken if the event channel is closed and rebound while the event is linked. You can only safely clear chn->last_vcpu_id during evtchn_fifo_destroy(). You also need to clear last_priority. > + > xsm_evtchn_close_post(chn1); > > out: > diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c > index 51b4ff6..e4bef80 100644 > --- a/xen/common/event_fifo.c > +++ b/xen/common/event_fifo.c > @@ -61,6 +61,15 @@ static struct evtchn_fifo_queue *lock_old_queue(const > struct domain *d, > for ( try = 0; try < 3; try++ ) > { > v = d->vcpu[evtchn->last_vcpu_id]; > + > + if ( !v->evtchn_fifo ) > + { > + gdprintk(XENLOG_ERR, > + "domain %d vcpu %d has no control block!\n", > + d->domain_id, v->vcpu_id); > + return NULL; > + } I think this check needs to be in evtchn_fifo_init() to prevent the event from being bound to VCPU that does not have a control block. > + > old_q = &v->evtchn_fifo->queue[evtchn->last_priority]; > > spin_lock_irqsave(&old_q->lock, *flags); David _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.