
RE: [Xen-devel] x86_64 eth0 e1000_clean_tx_irq tx hang

  • To: "Chris Wright" <chrisw@xxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "Ian Pratt" <m+Ian.Pratt@xxxxxxxxxxxx>
  • Date: Wed, 8 Feb 2006 20:01:08 -0000
  • Delivery-date: Wed, 08 Feb 2006 20:12:17 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-index: AcYsy9zWyjVQ8LGqSPyeatkMloKPqQADYgTw
  • Thread-topic: [Xen-devel] x86_64 eth0 e1000_clean_tx_irq tx hang

> This is against current x86_64 defconfig build:
> e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
>   Tx Queue             <0>
>   TDH                  <2b>
>   TDT                  <31>
>   next_to_use          <31>
>   next_to_clean        <2b>
> buffer_info[next_to_clean]
>   time_stamp           <10004d5f2>
>   next_to_watch        <2d>
>   jiffies              <10004d7ce>
>   next_to_watch.status <0>
> ... repeat until eventually ...
> NETDEV WATCHDOG: eth0: transmit timed out
> this is on a simple scp to dom0 from an external box.  after a bit the
> watchdog resets, and ping works, only to repeat itself when I 
> try to scp again

Yep, this is the bug I warned y'all about at the summit, but you asked
for the code to be checked in anyway... 

A bug shared is a bug fixed quicker? :-)

For us, this only manifests on x86_64, and arrived with the subarch xen
version of 2.6.12. Extensive inspection of the arch->subarch conversion
suggests that nothing should have changed, so this is likely a latent
bug being triggered by slight timing changes.

It sounds like it's rather easier for you to trigger than it was for us
-- we had to run xm-test several times to get it to happen. Happy
hunting, and good luck :-)
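
For anyone picking this up, the numbers in the quoted dump can be
decoded with a few lines of arithmetic. The sketch below is only an
illustration: the ring size of 256 is an assumption (it is not in the
report), HZ is left as a parameter because the report does not state
it, and the helper names are made up for this example rather than
taken from the e1000 driver.

```python
# Values copied from the quoted hang report (all hexadecimal).
TDH = 0x2B                # hardware head register
TDT = 0x31                # hardware tail register
NEXT_TO_CLEAN = 0x2B      # driver's view of the head
NEXT_TO_USE = 0x31        # driver's view of the tail
TIME_STAMP = 0x10004D5F2  # jiffies when the stuck buffer was queued
JIFFIES = 0x10004D7CE     # jiffies when the hang was reported
NEXT_TO_WATCH_STATUS = 0x0

RING_SIZE = 256           # ASSUMPTION: not stated in the report


def pending_descriptors(head, tail, ring_size=RING_SIZE):
    """Descriptors queued but not yet completed, allowing for wraparound."""
    return (tail - head) % ring_size


def stalled_jiffies(now, stamp):
    """How long the oldest buffer has been waiting, in jiffies."""
    return now - stamp


def stalled_seconds(now, stamp, hz):
    """Same interval in seconds; hz must be supplied by the caller."""
    return (now - stamp) / hz


outstanding = pending_descriptors(TDH, TDT)
age = stalled_jiffies(JIFFIES, TIME_STAMP)
views_agree = (TDH == NEXT_TO_CLEAN) and (TDT == NEXT_TO_USE)

print(f"outstanding descriptors: {outstanding}")
print(f"oldest buffer age: {age} jiffies ({age:#x})")
print(f"driver and hardware ring views agree: {views_agree}")
print(f"next_to_watch status is zero: {NEXT_TO_WATCH_STATUS == 0}")
```

Read this way, the driver and the hardware registers agree on where
the ring stands, a handful of descriptors are outstanding, and the
status of the descriptor being watched is still zero, i.e. nothing has
been written back for it. Converting the age to seconds needs the HZ
value of the kernel that produced the log.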

