
Re: [Xen-devel] [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU.



On Tue, Jan 13, 2015 at 10:20:00AM +0000, Jan Beulich wrote:
> >>> On 12.01.15 at 17:45, <konrad.wilk@xxxxxxxxxx> wrote:
> > There is a race when we clear STATE_SCHED in the softirq
> > - which allows 'raise_softirq_for' to progress and
> > schedule another dpci. During that time the other CPU could
> > receive an interrupt, call 'raise_softirq_for', and put
> > the dpci on its per-cpu list. There would then be two 'dpci_softirq'
> > instances running at the same time (on different CPUs) while the
> > dpci state is STATE_RUN (and STATE_SCHED is cleared). This
> > ends up hitting:
> > 
> >  if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
> >     BUG()
> > 
> > Instead of that, put the dpci back on the per-cpu list to deal
> > with it later.
> > 
> > The reason we can hit this is when the interrupt
> > affinity is set over multiple CPUs.
> > 
> > Another potential fix would be to add a guard in 'raise_softirq_for'
> > to check for the 'STATE_RUN' bit being set and not schedule the
> > dpci until that bit has been cleared.
> 
> I indeed think this should be investigated, because it would make
> explicit what ...
> 
> > --- a/xen/drivers/passthrough/io.c
> > +++ b/xen/drivers/passthrough/io.c
> > @@ -804,7 +804,17 @@ static void dpci_softirq(void)
> >          d = pirq_dpci->dom;
> >          smp_mb(); /* 'd' MUST be saved before we set/clear the bits. */
> >          if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
> > -            BUG();
> > +        {
> > +            unsigned long flags;
> > +
> > +            /* Put back on the list and retry. */
> > +            local_irq_save(flags);
> > +            list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list));
> > +            local_irq_restore(flags);
> > +
> > +            raise_softirq(HVM_DPCI_SOFTIRQ);
> > +            continue;
> > +        }
> 
> ... this does implicitly - spin until the bad condition clears.

Here is a patch that does this. I don't yet have a setup to test
the failing scenario (working on that). I removed the change in
the softirq handler because with this patch I cannot see a way it
would ever get there (i.e. hit the BUG in the softirq).

From e53185762ce184e835121ab4fbf7897568c0cdc4 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Mon, 12 Jan 2015 11:35:03 -0500
Subject: [PATCH] dpci: Don't schedule the dpci when it is in STATE_RUN on any
 per-cpu list.

There is a race when we clear STATE_SCHED in the softirq
- which allows 'raise_softirq_for' to progress and
schedule another dpci. During that time the other CPU could
receive an interrupt, call 'raise_softirq_for', and put
the dpci on its per-cpu list. There would then be two 'dpci_softirq'
instances running at the same time (on different CPUs) while the
dpci state is STATE_RUN (and STATE_SCHED is cleared). This
ends up hitting:

 if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
        BUG()

The reason we can hit this is when the interrupt
affinity is set over multiple CPUs.

Instead of the BUG() we could put the dpci back on the per-cpu
list to deal with it later (when the softirq is raised again).
However, putting the 'dpci' back on the per-cpu list basically
means spinning until the bad condition clears, which is not
nice.

Hence this patch adds a guard in 'raise_softirq_for' to
check for the 'STATE_RUN' bit being set and not schedule the dpci
until that bit has been cleared.

We also move the insertion of the dpci onto the per-cpu list
into the case statement in 'raise_softirq_for' to make it
easier to read.

Reported-by: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>
Reported-by: Malcolm Crossley <malcolm.crossley@xxxxxxxxxx>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
---
 xen/drivers/passthrough/io.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index ae050df..802127a 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -64,16 +64,24 @@ static void raise_softirq_for(struct hvm_pirq_dpci *pirq_dpci)
 {
     unsigned long flags;
 
-    if ( test_and_set_bit(STATE_SCHED, &pirq_dpci->state) )
+    switch ( cmpxchg(&pirq_dpci->state, 0, 1 << STATE_SCHED) )
+    {
+    case (1 << STATE_SCHED):
+    case (1 << STATE_RUN):
+    case (1 << STATE_RUN) | (1 << STATE_SCHED):
         return;
+    case 0:
+        get_knownalive_domain(pirq_dpci->dom);
 
-    get_knownalive_domain(pirq_dpci->dom);
+        local_irq_save(flags);
+        list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list));
+        local_irq_restore(flags);
 
-    local_irq_save(flags);
-    list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list));
-    local_irq_restore(flags);
-
-    raise_softirq(HVM_DPCI_SOFTIRQ);
+        raise_softirq(HVM_DPCI_SOFTIRQ);
+        break;
+    default:
+        BUG();
+    }
 }
 
 /*
-- 
2.1.0
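
To make explicit what the cmpxchg guard above is supposed to guarantee,
here is a minimal standalone sketch - NOT Xen code; 'fake_dpci' and
'try_schedule' are made-up names, and C11 atomics stand in for Xen's
cmpxchg(). The point is that a dpci is only queued on a per-cpu list
when its state is fully idle, so re-raising it while STATE_SCHED or
STATE_RUN is set becomes a no-op:

/*
 * Standalone userspace model of the cmpxchg guard -- NOT Xen code.
 * 'fake_dpci' and 'try_schedule' are made-up names, and the C11
 * atomics below stand in for Xen's cmpxchg().
 */
#include <stdio.h>
#include <stdatomic.h>

enum {
    STATE_SCHED,    /* bit 0: queued on a per-cpu list            */
    STATE_RUN,      /* bit 1: being serviced by dpci_softirq      */
};

struct fake_dpci {
    _Atomic unsigned long state;
};

/* Schedule only on a clean idle -> scheduled transition. */
static int try_schedule(struct fake_dpci *d)
{
    unsigned long expected = 0;

    /* Mirrors cmpxchg(&pirq_dpci->state, 0, 1 << STATE_SCHED). */
    return atomic_compare_exchange_strong(&d->state, &expected,
                                          1UL << STATE_SCHED);
}

int main(void)
{
    struct fake_dpci d = { .state = 0 };

    printf("first raise:  %d\n", try_schedule(&d));  /* 1: scheduled      */
    printf("second raise: %d\n", try_schedule(&d));  /* 0: already queued */

    /* dpci_softirq sets STATE_RUN and clears STATE_SCHED; until the  */
    /* state drops back to 0, further raises stay no-ops.             */
    atomic_store(&d.state, 1UL << STATE_RUN);
    printf("while RUN:    %d\n", try_schedule(&d));  /* 0: still busy     */

    atomic_store(&d.state, 0);
    printf("after clear:  %d\n", try_schedule(&d));  /* 1: scheduled      */
    return 0;
}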

> 
> Additionally I think it should be considered whether the bitmap
> approach of interpreting ->state is the right one, and we don't
> instead want a clean 3-state (idle, sched, run) model.

Could you elaborate a bit more please? As in three different unsigned int
(or bool_t) values that track which state we are in?
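
For reference, here is an untested sketch of one reading of that
suggestion - the names below are made up, not existing Xen code: a
single explicit state variable with exactly three values, where every
transition goes through a compare-and-exchange check.

/*
 * Untested sketch of an explicit 3-state (idle, sched, run) model --
 * the names here are made up, not existing Xen code.
 */
#include <stdatomic.h>

enum dpci_state {
    DPCI_STATE_IDLE,    /* not queued, not running                   */
    DPCI_STATE_SCHED,   /* queued on a per-cpu list, waiting to run  */
    DPCI_STATE_RUN,     /* currently being serviced by dpci_softirq  */
};

struct dpci_sketch {
    _Atomic int state;
};

/* raise_softirq_for(): queue only on a clean IDLE -> SCHED transition. */
int sketch_raise(struct dpci_sketch *d)
{
    int expected = DPCI_STATE_IDLE;

    return atomic_compare_exchange_strong(&d->state, &expected,
                                          DPCI_STATE_SCHED);
}

/* dpci_softirq(): SCHED -> RUN before servicing the dpci ...          */
int sketch_start_run(struct dpci_sketch *d)
{
    int expected = DPCI_STATE_SCHED;

    return atomic_compare_exchange_strong(&d->state, &expected,
                                          DPCI_STATE_RUN);
}

/* ... and RUN -> IDLE once it is done, re-enabling scheduling.        */
void sketch_finish_run(struct dpci_sketch *d)
{
    atomic_store(&d->state, DPCI_STATE_IDLE);
}
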
> 
> Jan
> 
