
Re: [Xen-devel] Xen 4.5 random freeze question



On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> On Wed, Nov 19, 2014 at 6:01 PM, Andrii Tseglytskyi
> <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
> > On Wed, Nov 19, 2014 at 5:41 PM, Stefano Stabellini
> > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> Hi Stefano,
> >>>
> >>> On Wed, Nov 19, 2014 at 4:52 PM, Stefano Stabellini
> >>> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >>> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> Hi Stefano,
> >>> >>
> >>> >> > >      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> >> > > -        GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> >> > > +        GICH[GICH_HCR] |= GICH_HCR_NPIE;
> >>> >> > >      else
> >>> >> > > -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> > > +        GICH[GICH_HCR] &= ~GICH_HCR_NPIE;
> >>> >> > >
> >>> >> > >  }
> >>> >> >
> >>> >> > Yes, exactly
> >>> >>
> >>> >> I tried it; the hang still occurs with this change.
> >>> >
> >>> > We need to figure out why during the hang you still have all the LRs
> >>> > busy even if you are getting maintenance interrupts that should cause
> >>> > them to be cleared.
> >>> >
> >>>
> >>> I see that I have free LRs during the maintenance interrupt:
> >>>
> >>> (XEN) gic.c:871:d0v0 maintenance interrupt
> >>> (XEN) GICH_LRs (vcpu 0) mask=0
> >>> (XEN)    HW_LR[0]=9a015856
> >>> (XEN)    HW_LR[1]=0
> >>> (XEN)    HW_LR[2]=0
> >>> (XEN)    HW_LR[3]=0
> >>> (XEN) Inflight irq=86 lr=0
> >>> (XEN) Inflight irq=2 lr=255
> >>> (XEN) Pending irq=2
> >>>
> >>> But I see that once the hang occurs, maintenance interrupts are generated
> >>> continuously. The platform keeps printing the same log until reboot.
> >>
> >> Exactly the same log? As in the one above you just pasted?
> >> That is very very suspicious.
> >
> > Yes, exactly the same log. It looks like this means the LRs are flushed
> > correctly.
> >
> >>
> >> I am thinking that we are not handling GICH_HCR_UIE correctly and
> >> something we do in Xen, maybe writing to an LR register, might trigger a
> >> new maintenance interrupt immediately causing an infinite loop.
> >>
> >
> > Yes, this is what I'm thinking too. Taking into account all the collected
> > debug info, it looks like a maintenance interrupt occurs once the LRs
> > are overloaded with SGIs. It is then not handled properly and occurs
> > again and again, so the platform hangs inside its handler.
> >
> >> Could you please try this patch? It disables GICH_HCR_UIE immediately on
> >> hypervisor entry.
> >>
> >
> > Now trying.
> >
> >>
> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >> index 4d2a92d..6ae8dc4 100644
> >> --- a/xen/arch/arm/gic.c
> >> +++ b/xen/arch/arm/gic.c
> >> @@ -701,6 +701,8 @@ void gic_clear_lrs(struct vcpu *v)
> >>      if ( is_idle_vcpu(v) )
> >>          return;
> >>
> >> +    GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >> +
> >>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>
> >>      while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
> >> @@ -821,12 +823,8 @@ void gic_inject(void)
> >>
> >>      gic_restore_pending_irqs(current);
> >>
> >> -
> >>      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>          GICH[GICH_HCR] |= GICH_HCR_UIE;
> >> -    else
> >> -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >> -
> >>  }
> >>
> >>  static void do_sgi(struct cpu_user_regs *regs, int othercpu, enum gic_sgi sgi)
> >
> 
> Heh, I don't see hangs with this patch :) But I also see that the
> maintenance interrupt doesn't occur (and no hang as a result).
> Stefano, is this expected?

No maintenance interrupts at all? That's strange. You should be
receiving them when LRs are full and you still have interrupts pending
to be added to them.
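
To check whether the LRs really are full, it can help to decode the raw
HW_LR values from the log quoted above. The following is only a sketch,
assuming the GICv2 GICH_LR field layout; struct lr_fields and decode_lr()
are hypothetical names used for illustration and are not part of gic.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded fields of one GICv2 GICH_LR value (hypothetical helper type). */
struct lr_fields {
    bool hw;               /* bit 31: entry is backed by a physical interrupt */
    unsigned int state;    /* bits 29:28: 0 invalid, 1 pending, 2 active, 3 both */
    unsigned int priority; /* bits 27:23 */
    unsigned int phys_id;  /* bits 19:10, only meaningful when hw is set */
    unsigned int virt_id;  /* bits 9:0 */
};

/* Split a raw list register value into its fields. */
static struct lr_fields decode_lr(uint32_t lr)
{
    struct lr_fields f;

    f.hw       = ((lr >> 31) & 0x1u) != 0;
    f.state    = (lr >> 28) & 0x3u;
    f.priority = (lr >> 23) & 0x1fu;
    f.phys_id  = (lr >> 10) & 0x3ffu;
    f.virt_id  = lr & 0x3ffu;

    return f;
}
```

Under that assumption, HW_LR[0]=9a015856 decodes to a hardware interrupt in
the pending state with virtual and physical ID 86, which agrees with the
"Inflight irq=86 lr=0" line, while the three zero LRs decode as invalid
(free) entries.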

You could add another printk here to see if you should be receiving
them:

     if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
+    {
+        gdprintk(XENLOG_DEBUG, "requesting maintenance interrupt\n");
         GICH[GICH_HCR] |= GICH_HCR_UIE;
-    else
-        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
-
+    }
 }
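
For reference, the condition guarding the GICH_HCR_UIE request can be
modelled outside Xen. This is a minimal sketch, not the gic.c code: NR_LRS,
model_lr_all_full() and model_should_request_uie() are hypothetical names,
and the four-LR count is an assumption taken from the four HW_LR entries in
the log quoted earlier.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: four list registers, matching HW_LR[0..3] in the quoted log. */
#define NR_LRS 4u

/* Hypothetical stand-in for lr_all_full(): true when every LR bit is set. */
static bool model_lr_all_full(uint32_t lr_mask)
{
    const uint32_t full = (1u << NR_LRS) - 1u;

    return (lr_mask & full) == full;
}

/*
 * Hypothetical model of the check in gic_inject(): request a maintenance
 * interrupt only when interrupts are still queued in lr_pending AND no
 * list register is free to take them.
 */
static bool model_should_request_uie(unsigned int nr_pending, uint32_t lr_mask)
{
    return nr_pending > 0 && model_lr_all_full(lr_mask);
}
```

In this model, one pending interrupt with mask=0 (the values shown in the
quoted log) does not satisfy the condition, so no new request would be made
from this path; the added printk should show whether the real code agrees.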


> >
> >
> > --
> >
> > Andrii Tseglytskyi | Embedded Dev
> > GlobalLogic
> > www.globallogic.com

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

