WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

AW: [Xen-devel] [pv_ops] e1000e: "Detected Tx Unit Hang"

To: "'Jeremy Fitzhardinge'" <jeremy@xxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: AW: [Xen-devel] [pv_ops] e1000e: "Detected Tx Unit Hang"
From: "Heiko Wundram" <modelnine@xxxxxxxxxxxxx>
Date: Fri, 21 May 2010 01:21:29 +0200
Cc: 'Stefan Kuhne' <stefan.kuhne@xxxxxxx>
Delivery-date: Thu, 20 May 2010 16:22:23 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4BF5BF3E.2010708@xxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: modelnine.org
References: <4BF5AD97.4000907@xxxxxxxxxxxxx> <4BF5B547.60700@xxxxxxxx> <4BF5BE8A.9060509@xxxxxxxxxxxxx> <4BF5BF3E.2010708@xxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acr4cHSdMyZEfuX5QHWTyR+NLIEM4gAAl3UQ

I'm pretty sure the problem you're seeing is caused by broken firmware on
the specific chipset used for this Intel network card, not by the Xen/pv_ops
kernel. I've had the same problem under high load on the "semi-old"
Supermicro boxes I administer.

There's an Intel utility that patches this firmware issue (i.e., in the
network controller EEPROM), but it no longer seems to be available online:
the last time I looked, I couldn't find it on the Intel site, where it was
prominently featured when I first looked for it.

I'll try to retrieve it from the last machine I applied the patch to, but
I'll only be able to do that some time during the (European) day tomorrow.

--- Heiko.


-----Original Message-----
From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Jeremy
Fitzhardinge
Sent: Friday, 21 May 2010 01:01
To: xen-devel@xxxxxxxxxxxxxxxxxxx
Cc: Stefan Kuhne
Subject: Re: [Xen-devel] [pv_ops] e1000e: "Detected Tx Unit Hang"

On 05/20/2010 03:58 PM, Stefan Kuhne wrote:
> On 21.05.2010 00:18, Jeremy Fitzhardinge wrote:
>
> Hello Jeremy,
>
>   
>> e1000e works fine for me.  However, I did have problems with my Ibex
>> Peak-based system and the integrated ethernet devices; they would drop
>> off the PCIe bus (lspci -vx would show all 0xff for the config space),
>> which turned out to be some problem with ASPM (PCIe Active State Power
>> Management).  Could this be what you're seeing?
>>
>>     
> my "lspci -vx" output:
>
> 02:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet Controller (Copper)
>         Subsystem: FIRST INTERNATIONAL Computer Inc Unknown device 4720
>         Flags: bus master, fast devsel, latency 0, IRQ 409
>         Memory at d0000000 (32-bit, non-prefetchable) [size=128K]
>         I/O ports at 2000 [size=32]
>         Capabilities: [c8] Power Management version 2
>         Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
>         Capabilities: [e0] Express Endpoint IRQ 0
>         Capabilities: [100] Advanced Error Reporting
>         Capabilities: [140] Device Serial Number c6-a9-09-ff-ff-0b-14-00
> 00: 86 80 8c 10 07 05 10 00 00 00 00 02 10 00 00 00
> 10: 00 00 00 d0 00 00 00 00 01 20 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 09 15 20 47
> 30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00
>
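The "all 0xff" symptom mentioned above can be checked mechanically against a hex dump like this one. The following is a minimal Python sketch, assuming the dump has the "offset: byte byte ..." layout that lspci -vx prints; the helper names (parse_config_dump, device_fell_off_bus, vendor_device) are made up for illustration and are not part of any tool.

```python
# Sketch: sanity-check a PCI config-space hex dump as printed by "lspci -vx".
# Helper names are hypothetical; only the dump itself comes from the thread.

def parse_config_dump(text):
    """Collect the bytes from lines of the form 'offset: xx xx xx ...'."""
    data = bytearray()
    for line in text.splitlines():
        line = line.lstrip("> ").strip()      # tolerate mail quote prefixes
        offset, sep, rest = line.partition(":")
        if not sep:
            continue
        try:
            int(offset, 16)                   # offset column must be hex
            row = bytes(int(tok, 16) for tok in rest.split())
        except ValueError:
            continue                          # not a hex-dump line
        data.extend(row)
    return bytes(data)

def device_fell_off_bus(cfg):
    """True if every byte reads back as 0xff, i.e. the device is not answering."""
    return len(cfg) > 0 and all(b == 0xFF for b in cfg)

def vendor_device(cfg):
    """Vendor and device ID are little-endian 16-bit words at offsets 0 and 2."""
    return cfg[0] | (cfg[1] << 8), cfg[2] | (cfg[3] << 8)

DUMP = """\
00: 86 80 8c 10 07 05 10 00 00 00 00 02 10 00 00 00
10: 00 00 00 d0 00 00 00 00 01 20 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 09 15 20 47
30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00
"""

cfg = parse_config_dump(DUMP)
print("fell off bus:", device_fell_off_bus(cfg))
print("vendor:device = %04x:%04x" % vendor_device(cfg))
```

For the dump quoted here the check comes back negative and the IDs decode to 8086:108c, so the device was still answering config reads when the dump was taken.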
> and the complete dmesg output:
> [ 9620.997466] 0000:02:00.0: peth0: Detected Tx Unit Hang:
> [ 9620.997469]   TDH                  <fc>
> [ 9620.997471]   TDT                  <1f>
> [ 9620.997473]   next_to_use          <1f>
> [ 9620.997475]   next_to_clean        <fc>
> [ 9620.997477] buffer_info[next_to_clean]:
> [ 9620.997479]   time_stamp           <8e2ec3>
> [ 9620.997481]   next_to_watch        <fc>
> [ 9620.997483]   jiffies              <8e3a25>
> [ 9620.997485]   next_to_watch.status <0>
> [ 9622.997490] 0000:02:00.0: peth0: Detected Tx Unit Hang:
> [ 9622.997496]   TDH                  <fc>
> [ 9622.997500]   TDT                  <1f>
> [ 9622.997503]   next_to_use          <1f>
> [ 9622.997507]   next_to_clean        <fc>
> [ 9622.997511] buffer_info[next_to_clean]:
> [ 9622.997515]   time_stamp           <8e2ec3>
> [ 9622.997519]   next_to_watch        <fc>
> [ 9622.997522]   jiffies              <8e41f5>
> [ 9622.997526]   next_to_watch.status <0>
> [ 9624.997536] 0000:02:00.0: peth0: Detected Tx Unit Hang:
> [ 9624.997541]   TDH                  <fc>
> [ 9624.997545]   TDT                  <1f>
> [ 9624.997549]   next_to_use          <1f>
> [ 9624.997553]   next_to_clean        <fc>
> [ 9624.997557] buffer_info[next_to_clean]:
> [ 9624.997561]   time_stamp           <8e2ec3>
> [ 9624.997565]   next_to_watch        <fc>
> [ 9624.997568]   jiffies              <8e49c5>
> [ 9624.997572]   next_to_watch.status <0>
> [ 9626.065848] eth0: port 1(peth0) entering disabled state
> [ 9629.910292] e1000e: peth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> [ 9629.910854] eth0: port 1(peth0) entering forwarding state
>   
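The hex fields in these hang reports can be turned into more readable numbers. Below is a rough Python sketch, not anything taken from the driver: it parses the report fields, estimates the kernel tick rate from two consecutive reports, and computes how long the descriptor at next_to_clean has been waiting. The function names are invented for illustration, and the ring size of 256 descriptors is an assumption (the log does not state it).

```python
import re

# Sketch: decode "Detected Tx Unit Hang" reports from a dmesg excerpt.
# All helper names are hypothetical; the sample lines are from the log above.

STAMP = re.compile(r"\[\s*(\d+\.\d+)\]")
FIELD = re.compile(r"(\w[\w.]*)\s+<([0-9a-fA-F]+)>")

def parse_hang_reports(log):
    """Return one dict per hang report, with the hex fields as integers."""
    reports = []
    for line in log.splitlines():
        if "Detected Tx Unit Hang" in line:
            stamp = STAMP.search(line)
            reports.append({"dmesg_time": float(stamp.group(1)) if stamp else None})
            continue
        field = FIELD.search(line)
        if field and reports:
            reports[-1][field.group(1)] = int(field.group(2), 16)
    return reports

def estimate_hz(first, second):
    """Jiffies elapsed per second of dmesg time between two reports."""
    ticks = second["jiffies"] - first["jiffies"]
    seconds = second["dmesg_time"] - first["dmesg_time"]
    return round(ticks / seconds)

def pending_descriptors(report, ring_size=256):
    """Descriptors between head (TDH) and tail (TDT); ring_size is ASSUMED."""
    return (report["TDT"] - report["TDH"]) % ring_size

LOG = """\
[ 9620.997466] 0000:02:00.0: peth0: Detected Tx Unit Hang:
[ 9620.997469]   TDH                  <fc>
[ 9620.997471]   TDT                  <1f>
[ 9620.997473]   next_to_use          <1f>
[ 9620.997475]   next_to_clean        <fc>
[ 9620.997477] buffer_info[next_to_clean]:
[ 9620.997479]   time_stamp           <8e2ec3>
[ 9620.997481]   next_to_watch        <fc>
[ 9620.997483]   jiffies              <8e3a25>
[ 9620.997485]   next_to_watch.status <0>
[ 9622.997490] 0000:02:00.0: peth0: Detected Tx Unit Hang:
[ 9622.997496]   TDH                  <fc>
[ 9622.997500]   TDT                  <1f>
[ 9622.997503]   next_to_use          <1f>
[ 9622.997507]   next_to_clean        <fc>
[ 9622.997511] buffer_info[next_to_clean]:
[ 9622.997515]   time_stamp           <8e2ec3>
[ 9622.997519]   next_to_watch        <fc>
[ 9622.997522]   jiffies              <8e41f5>
[ 9622.997526]   next_to_watch.status <0>
"""

reports = parse_hang_reports(LOG)
hz = estimate_hz(reports[0], reports[1])
for r in reports:
    waited = (r["jiffies"] - r["time_stamp"]) / hz
    print("t=%.3f  pending=%d  waiting %.3f s"
          % (r["dmesg_time"], pending_descriptors(r), waited))
```

Read this way, the reports show TDH, TDT and time_stamp staying identical from one report to the next while jiffies advances by about 2000 every two seconds, so the same descriptor is still waiting each time the check fires. The pending count is only meaningful if the ring really has 256 entries.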


OK, definitely different problem.  Does it happen immediately, or after
a while?  Under load?  Can you provide the full boot output, and cat
/proc/interrupts?

Thanks,
    J
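
One way to make the /proc/interrupts request above easier to act on is to pull out just the rows for the interface in question and compare two snapshots taken a few seconds apart. The Python sketch below assumes the usual layout (a header row of CPU names, then one row per IRQ with per-CPU counts followed by the chip and device names). The function name irq_counts is invented, and the sample text is made up for illustration; it is not output from the machine in this thread.

```python
# Sketch: extract per-CPU interrupt counts for one device from the text of
# /proc/interrupts. Function name and sample data are hypothetical.

def irq_counts(text, device):
    """Map IRQ label -> list of per-CPU counts for rows naming `device`."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())              # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        fields = line.split()
        # Names after the counts and chip column; shared IRQs use commas.
        names = [f.rstrip(",") for f in fields[ncpu + 1:]]
        if device not in names:
            continue
        result[fields[0].rstrip(":")] = [int(n) for n in fields[1:ncpu + 1]]
    return result

# Made-up snapshots, NOT taken from the reporter's system.
BEFORE = """\
           CPU0       CPU1
  1:          2          0   Phys-irq  i8042
409:      51234          0   Phys-irq  peth0
NMI:          0          0   Non-maskable interrupts
"""
AFTER = BEFORE.replace("51234", "51234")      # identical: counter did not move

before = irq_counts(BEFORE, "peth0")
after = irq_counts(AFTER, "peth0")
for irq in before:
    delta = [b - a for a, b in zip(before[irq], after[irq])]
    print("IRQ %s delta per CPU: %s" % (irq, delta))
```

On a live system the two snapshots would come from reading /proc/interrupts twice; a counter that stops advancing while traffic is queued would suggest interrupts are not being delivered, which is a different failure from one where the counter keeps moving.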

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

