WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-users

Re: [Xen-users] Nasty kernel panic

To: Steven Timm <timm@xxxxxxxx>
Subject: Re: [Xen-users] Nasty kernel panic
From: Tim Post <echo@xxxxxxxxxxxx>
Date: Fri, 29 Aug 2008 10:38:35 +0800
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Delivery-date: Thu, 28 Aug 2008 19:39:27 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <Pine.LNX.4.64.0808281643210.7510@xxxxxxxxxxxxxxxxx>
Organization: echoreply.us
References: <Pine.LNX.4.64.0808281643210.7510@xxxxxxxxxxxxxxxxx>
Reply-to: echo@xxxxxxxxxxxx
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
Hi Steve,

On Thu, 2008-08-28 at 16:52 -0500, Steven Timm wrote:
> I have seen the following kernel panic 5 times today on
> three different machines, two of which had been stable
> for months and one of which is a brand new install.

[snip]

> <Aug/28 12:21 pm> [<ffffffff88107a79>] :e1000:e1000_clean_rx_irq+0x430/0x4d5
> <Aug/28 12:21 pm> [<ffffffff881074ec>] :e1000:e1000_clean+0x82/0x160
> <Aug/28 12:21 pm> [<ffffffff80395f51>] net_rx_action+0xe7/0x254
> <Aug/28 12:21 pm> [<ffffffff80233d97>] __do_softirq+0x7b/0x10d
> <Aug/28 12:21 pm> [<ffffffff8020b094>] call_softirq+0x1c/0x28
> <Aug/28 12:21 pm> [<ffffffff8020cdfd>] do_softirq+0x62/0xd9
> <Aug/28 12:21 pm> [<ffffffff8020cc9c>] do_IRQ+0x68/0x71
> <Aug/28 12:21 pm> [<ffffffff8034b347>] evtchn_do_upcall+0xee/0x165
> <Aug/28 12:21 pm> [<ffffffff8020abca>] do_hypervisor_callback+0x1e/0x2c
> <Aug/28 12:21 pm> <EOI>
> 
> <Aug/28 12:21 pm>Code: 41 8b 85 f4 00 00 00 4d 85 ed 4d 89 ec 89 44 24 0c 0f 84 36
> <Aug/28 12:21 pm>RIP  [<ffffffff88256375>] :ipv6:rt6_select+0x38/0x1f4
> <Aug/28 12:21 pm> RSP <ffffffff80526b00>
> <Aug/28 12:21 pm>CR2: 00000000000000f4
> <Aug/28 12:21 pm> <0>Kernel panic - not syncing: Aiee, killing interrupt handler

It looks like the e1000 NIC might be getting ejected. From what I gather in
your message, the only thing that changed is that you are now putting a much
higher I/O demand on the drives (rsyncing everything), which by extension
increases the demand on the NIC.
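
To see at a glance which modules show up in a trace like the one quoted above, the frames can be pulled apart mechanically. What follows is only a rough sketch: the names `FRAME_RE`, `parse_frames` and `modules_in_trace` are made up for illustration, the regular expression assumes the `[<address>] :module:symbol+0xoff/0xlen` layout seen in this particular trace, and frames without a module prefix are labelled "kernel" as a convention of this sketch.

```python
import re

# Matches frames such as
#   [<ffffffff88107a79>] :e1000:e1000_clean_rx_irq+0x430/0x4d5
#   [<ffffffff80395f51>] net_rx_action+0xe7/0x254
# The ":module:" prefix is optional; built-in kernel symbols have none.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?::(\w+):)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# A few lines copied from the quoted trace (wrapped lines rejoined).
SAMPLE = """\
<Aug/28 12:21 pm> [<ffffffff88107a79>] :e1000:e1000_clean_rx_irq+0x430/0x4d5
<Aug/28 12:21 pm> [<ffffffff881074ec>] :e1000:e1000_clean+0x82/0x160
<Aug/28 12:21 pm> [<ffffffff80395f51>] net_rx_action+0xe7/0x254
<Aug/28 12:21 pm>RIP  [<ffffffff88256375>] :ipv6:rt6_select+0x38/0x1f4
<Aug/28 12:21 pm> RSP <ffffffff80526b00>
"""


def parse_frames(text):
    """Return (module, symbol) pairs in the order they appear in the text."""
    frames = []
    for match in FRAME_RE.finditer(text):
        module, symbol = match.groups()
        frames.append((module or "kernel", symbol))
    return frames


def modules_in_trace(text):
    """Return the distinct module names, keeping first-seen order."""
    seen = []
    for module, _symbol in parse_frames(text):
        if module not in seen:
            seen.append(module)
    return seen


if __name__ == "__main__":
    for module, symbol in parse_frames(SAMPLE):
        print(f"{module:8s} {symbol}")
```

Fed the whole log instead of `SAMPLE`, this lists every driver that appears on the stack alongside the module named on the RIP line, which makes it easier to compare panics across the three machines.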

If the e1000 NIC is the one enslaved to the bridge, it could be the cleanup
that makes it freak out when a guest stops. If it's ejected uncleanly, the
PID next in line with pending I/O for the device will likely be
identified as the culprit.

I had a very similar problem with a buggy Areca driver on dom-0 a couple
of years ago.

Can you post a link to your kernel's .config, or perhaps try the latest
stable version of that module from:

http://sourceforge.net/project/showfiles.php?group_id=42302

As for IPv6, if it's being set up you'll see it in /etc/sysconfig
or /etc/network (depending on the distro) pretty clearly. However, that
shouldn't make a difference; it should work either way.
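
Besides the config files, the running system can be checked as well. The sketch below assumes a Linux host where the kernel exposes configured IPv6 addresses in /proc/net/if_inet6 (one line per address: 32 hex digits, then interface index, prefix length, scope and flags in hex, then the interface name). The file is normally absent when IPv6 support isn't loaded, and the sketch treats that case as "no addresses". The function names are hypothetical.

```python
import ipaddress


def parse_if_inet6(text):
    """Parse /proc/net/if_inet6 content into (ifname, address, prefixlen)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or unexpected lines
        addr_hex, _ifindex, prefix_hex, _scope, _flags, ifname = fields
        address = ipaddress.IPv6Address(int(addr_hex, 16))
        entries.append((ifname, str(address), int(prefix_hex, 16)))
    return entries


def read_if_inet6(path="/proc/net/if_inet6"):
    """Return the parsed entries, or an empty list if the file is missing."""
    try:
        with open(path) as handle:
            return parse_if_inet6(handle.read())
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    entries = read_if_inet6()
    if not entries:
        print("no IPv6 addresses configured (or IPv6 not loaded)")
    for ifname, address, prefixlen in entries:
        print(f"{ifname}: {address}/{prefixlen}")
```

If this prints addresses on the bridge or on the e1000 interface, IPv6 is active on dom-0 regardless of what the distro config files say.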

Hope this helps :)


Cheers!
--Tim

-- 
Monkey + Typewriter = Echoreply ( http://echoreply.us )

