[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [PATCH v2] evtchn: clean last_vcpu_id on EVTCHNOP_reset to avoid crash



When EVTCHNOP_reset is being performed last_vcpu_id attribute is not being
cleaned by __evtchn_close(). In case last_vcpu_id != 0 for a particular
event channel and this event channel is going to be used for event delivery
(for another vcpu) before EVTCHNOP_init_control for vcpu == last_vcpu_id
was done the following crash is observed:

 ...
 (XEN) Xen call trace:
 (XEN)    [<ffff82d080127785>] _spin_lock_irqsave+0x5/0x70
 (XEN)    [<ffff82d0801097db>] evtchn_fifo_set_pending+0xdb/0x370
 (XEN)    [<ffff82d080107146>] evtchn_send+0xd6/0x160
 (XEN)    [<ffff82d080107df9>] do_event_channel_op+0x6a9/0x16c0
 (XEN)    [<ffff82d0801ce800>] vmx_intr_assist+0x30/0x480
 (XEN)    [<ffff82d080219e99>] syscall_enter+0xa9/0xae

This happens because lock_old_queue() does not check VCPU's control
block existence for last_vcpu_id and after EVTCHNOP_reset they are all cleaned.

I suggest we fix the issue twice: reset last_vcpu_id to 0 in 
evtchn_fifo_destroy()
and check that evtchn->last_vcpu_id has its control block initialized in
evtchn_fifo_init() resetting to notify_vcpu_id in case it is not.

Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
---
Changes from v1:
- Make use of '%pv' when printing errors, use printk instead of gdprintk
  [Jan Beulich]
- Reset last_vcpu_id and last_priority in evtchn_fifo_destroy() to avoid
  breakage when the event channel is closed and rebound while the
  event is linked [David Vrabel]
- Check last_vcpu_id in evtchn_fifo_init() instead of lock_old_queue() to
  catch the issue earlier. It would be nice to check notify_vcpu_id here
  as well but unfortunatelly current linux kernel sets up timer VIRQ for
  all secondary VCPUs before calling EVTCHNOP_init_control for them but
  after switching to FIFO-base ABI (EVTCHNOP_init_control for VCPU0)
  [David Vrabel]
---
 xen/common/event_fifo.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
index 51b4ff6..7d506bd 100644
--- a/xen/common/event_fifo.c
+++ b/xen/common/event_fifo.c
@@ -37,10 +37,24 @@ static inline event_word_t 
*evtchn_fifo_word_from_port(struct domain *d,
 static void evtchn_fifo_init(struct domain *d, struct evtchn *evtchn)
 {
     event_word_t *word;
+    struct vcpu *v;
 
     evtchn->priority = EVTCHN_FIFO_PRIORITY_DEFAULT;
 
     /*
+     * Check that and evtchn->last_vcpu_id has its control block initialized
+     * and reset to notify_vcpu_id in case it is not.
+     */
+    v = d->vcpu[evtchn->last_vcpu_id];
+    if ( !v->evtchn_fifo )
+    {
+        printk("%pv: event channel %d has last_vcpu_id=%d with uninitialized "
+               "control block, reset to VCPU %d !\n", v, evtchn->port,
+               evtchn->last_vcpu_id, evtchn->notify_vcpu_id);
+        evtchn->last_vcpu_id = evtchn->notify_vcpu_id;
+    }
+
+    /*
      * If this event is still linked, the first event may be delivered
      * on the wrong VCPU or with an unexpected priority.
      */
@@ -588,10 +602,24 @@ int evtchn_fifo_expand_array(const struct 
evtchn_expand_array *expand_array)
 void evtchn_fifo_destroy(struct domain *d)
 {
     struct vcpu *v;
+    struct evtchn *chn;
+    int i;
 
     for_each_vcpu( d, v )
         cleanup_control_block(v);
     cleanup_event_array(d);
+
+    /*
+     * Reset last_vcpu_id and last_priority as __evtchn_close() doesn't do it
+     * and channels can be reused again after EVTCHNOP_init_control (e.g. in
+     * kexec case).
+     */
+    for ( i = 0; port_is_valid(d, i); i++ )
+    {
+        chn = evtchn_from_port(d, i);
+        chn->last_vcpu_id = 0;
+        chn->last_priority = 0;
+    }
 }
 
 /*
-- 
1.9.3


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.