
Re: [Xen-devel] Network stops responding after some time



Hi Kirk,

Your problem seems to be identical to mine. Is your kernel x86-64 as well? I
was wondering whether this problem also shows up on 32-bit, but I never got
around to trying it myself.

Hope we can find a fix for this.

Adnan

----- Original Message -----
From: Kirk Allan <kallan@xxxxxxxxxx>
To: xen-devel@xxxxxxxxxxxxxxxxxxx
Cc: Ian.Pratt@xxxxxxxxxxxx, Keir Fraser <Keir.Fraser@xxxxxxxxxxxx>, 
adnan@xxxxxxxxxxxxxxxxxxx
Sent: Tue,  8 Aug 2006 18:04:33 -0500
Subject: Re: [Xen-devel] Network stops responding after some time


> I am seeing the same dom0 problem on a box that also has an integrated nForce3
> Ethernet adapter.  The box started out with SLES 10, but I removed the
> SUSE-supplied Xen components and am using xen-unstable changeset 10982.
> 
> I am not starting xend, so bridging is not involved.  To eliminate the LAN
> driver issue, I used an Intel E100 adapter.
> 
> It seems that networking goes along OK as long as traffic is light.  To
> trigger the problem, I download a CD ISO image.  Downloading an ISO causes
> the failure every time.  The download progress varies from not even getting
> started to 426 MB out of the 676 MB.  When the download progress stops,
> checking /proc/interrupts shows that interrupts for the LAN adapter have
> stopped.
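
A minimal sketch of that check in Python, for anyone who wants to confirm the
stall without eyeballing /proc/interrupts: it compares two snapshots and lists
the IRQs whose counts did not advance. The function names and the sample
snapshots are made up for illustration; only the IRQ 19 count of 317059 comes
from the report above.

```python
def parse_interrupts(text):
    """Return {irq_label: count summed over all CPUs} from /proc/interrupts text."""
    lines = text.splitlines()
    if not lines:
        return {}
    # The header row names the CPUs; each later row has up to one count per CPU.
    n_cpus = len(lines[0].split())
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = []
        for field in rest.split()[:n_cpus]:
            if not field.isdigit():
                break  # reached the controller/device description
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def stalled_irqs(before, after):
    """IRQs that have fired at least once but did not advance between snapshots."""
    return sorted(
        irq for irq, count in after.items()
        if count > 0 and before.get(irq) == count
    )


# Illustrative single-CPU snapshots (hypothetical apart from the IRQ 19 count).
BEFORE = """\
           CPU0
  0:     100000    IO-APIC-edge  timer
 19:     317059   IO-APIC-level  eth0
"""
AFTER = """\
           CPU0
  0:     100500    IO-APIC-edge  timer
 19:     317059   IO-APIC-level  eth0
"""

print(stalled_irqs(parse_interrupts(BEFORE), parse_interrupts(AFTER)))  # ['19']
```

On a live box the two strings would come from reading /proc/interrupts twice,
a few seconds apart, while the download is running. An IRQ that is simply idle
will also show up, so the result only means something for a device that should
be busy.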
> 
> As suggested, I tried the 'ioapic_ack=old' Xen boot parameter.  It didn't
> help.
> 
> I don't believe it is a LAN driver problem because doing a 'rmmod e100' and an
> 'insmod e100.ko' does not get things going again.
> 
> Attached are various log files that will hopefully be of some use.
> 
> xendebug - contains the serial output as xen is booting up.  It also contains
> the dump_irqs from the 'i' and the print_IO_APIC_keyhandler from the 'z' after
> doing the <ctrl a> <ctrl a> <ctrl a> from the debug terminal.  I added counters
> to mask_and_ack_level_ioapic_vector and end_level_ioapic_vector.  The
> interesting thing here is that the e100 is using int 19, and
> mask_and_ack_level_ioapic_vector, end_level_ioapic_vector, and int 19 from
> /proc/interrupts all have the same value (317059) when interrupts stop.  Also
> the irr is set to 1 when interrupts stop.
> 
> cpuinfo - contents of /proc/cpuinfo
> 
> hwinfo - results of doing 'hwinfo'
> 
> dmesg_native - dmesg from a native kernel boot.
> 
> dmesg - dmesg from booting the xen kernel.  After interrupts stop, there were
> no new messages.
> 
> messages - the /var/log/messages for the xen booting kernel.  It also contains
> the 'rmmod e100' and the 'insmod e100' messages after interrupts stopped for
> the e100.
> 
> ifconfig - the results of doing ifconfig after interrupts stop, after 'rmmod
> e100', and after 'insmod e100' all catted together.
> 
> ints - the results of /proc/interrupts on a native boot, after a xen kernel
> boot, and after interrupts stop on the xen kernel.  Native uses int 201 for
> the e100 and xen uses int 19.
> 
> Any help on this issue is greatly appreciated.
> 
> Thanks,
> Kirk
> 
> >>> On Wed, Jul 26, 2006 at  4:07 AM, in message
> <00812518ddfbf4c535e7a4d25d8bbab3@xxxxxxxxxxxx>, Keir Fraser
> <Keir.Fraser@xxxxxxxxxxxx> wrote: 
> 
> > On 24 Jul 2006, at 19:49, Adnan Khaleel wrote:
> > 
> >> I need help in trying to understand why the ethernet driver has locked 
> >> up and how I can go about outputting debug messages. I see in 
> >> /proc/interrupts that the interrupt count of eth0 just stops 
> >> incrementing. I've tried different (3Com, Realtek 8169, Realtek 8139) 
> >> based network cards and this happens with all. I'm not sure what is so 
> >> unique about my system that might be causing this to lock up; it's a 
> >> regular Nforce3 based system with 512MB ram. This problem does not 
> >> happen if I'm not using a Xen enabled kernel. This is entirely 
> >> happening in dom0 and there aren't any user domains, so it's not a 
> >> bridging issue. I've also disabled the Xen backend drivers (netbk.ko 
> >> and netloop.ko), so it's talking to the network chip directly.
> >>
> >> Any help? Please?
> > 
> > Firstly, you need to repro on the kernel from our xen-unstable 
> > repository, which is based on kernel.org Linux 2.6.16. Then build the 
> > same kernel for native i386 and get boot output. Send the unified diff 
> > (diff -u) of the two boot outputs. You may need to tweak the 
> > configuration of the native build to get the output similar to that of 
> > the Xen-based kernel -- we can tell you how to do that when we see your 
> > initial diff.
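
For reference, plain 'diff -u' on the two saved boot logs does the job. The
sketch below shows the same comparison in Python with the standard difflib
module, with one addition that is my own assumption and not part of Keir's
request: it strips leading printk timestamps, if the kernel emits them, so
they do not show up as differences on every line. The function names are
hypothetical, the default file labels are borrowed from the attachment names
listed above, and the two sample logs are made up.

```python
import difflib
import re

# Matches a leading printk timestamp such as "[   12.345678] ", if present.
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def normalize(log_text):
    """Split a boot log into lines and drop any leading timestamps."""
    return [_TIMESTAMP.sub("", line) for line in log_text.splitlines()]


def boot_diff(native_log, xen_log, native_name="dmesg_native", xen_name="dmesg"):
    """Return a unified diff, in 'diff -u' format, of two boot logs."""
    return "\n".join(
        difflib.unified_diff(
            normalize(native_log),
            normalize(xen_log),
            fromfile=native_name,
            tofile=xen_name,
            lineterm="",
        )
    )


# Hypothetical two-line logs, only to show the shape of the output.
native = "[    0.000000] common line\nnative-only line\n"
xen = "[    0.100000] common line\nxen-only line\n"
print(boot_diff(native, xen))
```

With real logs, the two arguments would be the contents of the saved native
and Xen dmesg files.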
> > 
> > Another thing worth trying is 'ioapic_ack=old' as a Xen boot parameter 
> > in your bootloader configuration file. It probably won't help, but 
> > worth a try.
> > 
> >   --  Keir
> > 
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-devel
> 
> 
> 



