xen-devel

[Xen-devel] RE: Recurring OOPS in latest -unstable kernel

To: "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>, "Kip Macy" <kip.macy@xxxxxxxxx>
Subject: [Xen-devel] RE: Recurring OOPS in latest -unstable kernel
From: "Ian Pratt" <m+Ian.Pratt@xxxxxxxxxxxx>
Date: Sun, 3 Jul 2005 20:36:29 +0100
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Ian Pratt <Ian.Pratt@xxxxxxxxxxxx>
Delivery-date: Sun, 03 Jul 2005 19:35:20 +0000
 

> -----Original Message-----

> > I hit the following oops a couple of times a day - it seems to 
> > correspond to tearing down a vif:
> 
> Are you actually trying to tear down a vif when the crash 
> occurs, or is its refcnt falling to zero because of a bug?
> 
> We've had this bug report at least once before, but I 
> couldn't find any obvious problem from reading through the 
> backtrace...

This sounds rather like the bug that's being seen with the ported SuSE
kernel. Appended is a summary of the info we have on it.

Ian


The problem looks really obscure to me: a request seems to be routed to
the wrong netback (vifX.0) device, the refcount drops to 0, and then we
oops. (The normal oops path is the BUG() at line 101 of
netback/interface.c; I patched the kernel to get a backtrace at the
place where we schedule the work.)
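
To make the suspected failure mode concrete, here is a minimal user-space
sketch. It is not the netback code: struct vif, vif_get(), vif_put() and
simulate_misrouted_completion() are hypothetical names made up for
illustration. It only shows how a single completion credited to the wrong
interface can leave that interface with one more put than get, so its
count reaches zero while it is still connected.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a backend interface; not the real driver type. */
struct vif {
    int id;
    int refcnt;     /* 1 while connected, plus one per in-flight request */
    int torn_down;  /* set once refcnt reaches zero */
};

/* Take a reference, e.g. when a request is issued on this interface. */
static void vif_get(struct vif *v)
{
    v->refcnt++;
}

/* Drop a reference; reaching zero stands in for scheduling teardown. */
static void vif_put(struct vif *v)
{
    if (--v->refcnt == 0)
        v->torn_down = 1;
}

/*
 * Issue a request on interface a, then (wrongly) complete it on
 * interface b. Returns 1 if b was torn down while a still holds the
 * leaked reference.
 */
static int simulate_misrouted_completion(void)
{
    struct vif a = { 1, 1, 0 };
    struct vif b = { 2, 1, 0 };

    vif_get(&a);  /* request issued on a: a.refcnt is now 2 */
    vif_put(&b);  /* completion misrouted to b: b.refcnt is now 0 */

    return b.torn_down && !a.torn_down && a.refcnt == 2;
}
```

If something like this is happening, the interface that oopses is the
victim rather than the culprit, which would fit a backtrace that shows
nothing obviously wrong at the point of the crash.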

The same code (in netback) works in 2.6.9rc2/2.6.11.x, so something
screws up the ringbuffers -- should we start reviewing the path down
from hypervisor_callback?

Something strange seems to happen there with ringbuffer assignment to
interfaces and I guess we need to review the upcall path.
Somewhere, we may clobber an argument, possibly involving CONFIG_REGPARM
...
I don't know the code well enough to see it without adding a lot of
instrumentation.

