
Re: [Xen-devel] network hang trigger



After some investigation, I seem to have pinned down the cause as
skbuff_head_cache overflow. Run 'cat /proc/slabinfo | grep skbuff'; the
first numeric column is the number of active skbuff_head objects. On my
machine, it looks like this:

(1) after a normal 'ping dom0', skbuff_head_cache active is 210
(2) after 'ping -s 3000 dom0' and the network hang, it shoots up to 254
(3) after a further 'ping -s 3000 dom0', it shoots up to 340. I get a few
replies, then the entire network hangs.
(4) after waiting several minutes, skbuff_head_cache gets
garbage-collected and drops down to 120. The network recovers.
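
For anyone who wants to track the counter without eyeballing the grep
output, here is a minimal sketch in Python. It assumes the 2.6-style
/proc/slabinfo layout, where each data line starts with the cache name
followed by active_objs. The function names are mine, not anything from
the Xen tree, and reading /proc/slabinfo may need root on some systems.

```python
def active_objects(slabinfo_text, cache_name="skbuff_head_cache"):
    """Return active_objs for cache_name, or None if the cache is not listed.

    Assumes data lines of the form: name active_objs num_objs objsize ...
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Header lines start with "slabinfo" or "#", so they never match.
        if len(fields) >= 2 and fields[0] == cache_name:
            return int(fields[1])
    return None


def read_active_objects(path="/proc/slabinfo", cache_name="skbuff_head_cache"):
    """Read the slab file at path and return active_objs for cache_name."""
    with open(path) as f:
        return active_objects(f.read(), cache_name)
```

Calling read_active_objects() before and after each ping run should give
the same kind of numbers as in the list above.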

This problem also occurs on Linux, FreeBSD, and NetBSD kernels with faulty
device drivers. It's likely that the network frontend driver is not
freeing skbuffs properly. The skbuff cache keeps growing until it reaches
some threshold, at which point it is gc'ed by force.

I'm taking a look at the netback source code.

-- Bin Ren

On Wed, 15 Sep 2004 09:29:46 -0600, Charles Coffing <ccoffing@xxxxxxxxxx> wrote:
> I was able to reproduce the hang easily.  "ping -s 1473" worked, but
> "ping -s 1499" hung.  While it was hung, I tried pinging the other
> direction and that hung too.
> 
> My setup:
> Pinging from DOMU to DOM0
> Changeset 1.1307
> DOM0: 2.6.8.1, Stock configuration
> DOMU: 2.6.8.1, Stock, except writable pagetables are disabled
> 
> ccoffing2:~ # ping -s 1499 137.65.171.60
> PING 137.65.171.60 (137.65.171.60) 1499(1527) bytes of data.
> 1507 bytes from 137.65.171.60: icmp_seq=1 ttl=64 time=0.455 ms
> ping: sendmsg: No buffer space available
> ping: sendmsg: No buffer space available
> ping: sendmsg: No buffer space available
> ping: sendmsg: No buffer space available
> ping: sendmsg: No buffer space available
> ping: sendmsg: No buffer space available
> ping: sendmsg: No buffer space available
> From 137.65.171.60: icmp_seq=2 Frag reassembly time exceeded
> From 137.65.171.60 icmp_seq=2 Frag reassembly time exceeded
> 1507 bytes from 137.65.171.60: icmp_seq=3 ttl=64 time=28980 ms
> 1507 bytes from 137.65.171.60: icmp_seq=4 ttl=64 time=27980 ms
> 1507 bytes from 137.65.171.60: icmp_seq=5 ttl=64 time=26980 ms
> 
> --- 137.65.171.60 ping statistics ---
> 15 packets transmitted, 4 received, +2 errors, 73% packet loss, time 33213ms
> rtt min/avg/max/mdev = 0.455/20985.849/28980.992/12136.540 ms, pipe 11


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/xen-devel


 

