
RE: [Xen-devel] kernel oops/IRQ exception when networking between many domUs



On Wednesday, 08.06.2005, at 16:01 +0100, Ian Pratt wrote:
> You might want to compile the kernel with CONFIG_FRAME_POINTER to get more 
> accurate call traces. 
> 
> I'd also add some printk's to the 'unusual' paths in netback.c:net_rx_action, 
> such as the memory squeeze path.

? (I have no clue about kernel programming; I'm only good at panicking
kernels ;) 
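
For concreteness, the kind of printk Ian means would go into the
memory-squeeze branch of net_rx_action() in netback.c. The following is
only a paraphrased sketch under assumptions (alloc_mfn(), rx_queue and
the requeue logic are placeholders, not the real netback code), but it
shows the pattern of a rate-limited diagnostic:

  /* Sketch only -- not the actual netback source. */
  #include <linux/kernel.h>
  #include <linux/net.h>          /* net_ratelimit() */
  #include <linux/skbuff.h>

  extern unsigned long alloc_mfn(void);   /* assumed page allocator */
  static struct sk_buff_head rx_queue;    /* assumed pending-rx queue */

  static void net_rx_action(unsigned long unused)
  {
          struct sk_buff *skb;

          while ((skb = skb_dequeue(&rx_queue)) != NULL) {
                  if (alloc_mfn() == 0) {
                          /* Memory squeeze: log it (rate-limited) so the
                           * serial console shows whether this path runs
                           * right before the oops. */
                          if (net_ratelimit())
                                  printk(KERN_WARNING
                                         "netback: rx memory squeeze, "
                                         "%u skbs still queued\n",
                                         skb_queue_len(&rx_queue));
                          skb_queue_head(&rx_queue, skb);
                          break;
                  }
                  /* ... normal receive processing ... */
          }
  }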

> It's also worth building Xen with debug=y and connecting a serial line.

That'll be some work ... I'll do so if there is interest in this bug and
if somebody else can confirm that "ping -f -b $BROADCAST_IP" on xen-br0
kills dom0 when there are enough domUs.

/nils.

> Best,
> Ian
> > -----Original Message-----
> > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx 
> > [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of 
> > Nils Toedtmann
> > Sent: 08 June 2005 15:40
> > To: Birger Tödtmann
> > Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: Re: [Xen-devel] kernel oops/IRQ exception when 
> > networking between many domUs
> > 
> > On Wednesday, 08.06.2005, at 14:34 +0200, Nils Toedtmann wrote:
> > [...] 
> > > OK, I reproduced the dom0 kernel panic in a simpler situation:
> > > 
> > > * create some domUs, each having one interface in the same subnet
> > > * bridge all the interfaces together (dom0 not having an IP on that
> > >   bridge)
> > > * trigger as much unicast traffic as you want (like unicast flood
> > >   pings): no problem.
> > > * Now trigger some broadcast traffic between the domUs:
> > > 
> > >     ping -i 0,1 -b 192.168.0.255
> > > 
> > >   BOOOM.
> > > 
> > > 
> > > Instead, you may down all vifs first, start the flood broadcast
> > > ping in the first domU, and bring up one vif after the other
> > > (wait >15 sec each time until the bridge puts the added port into
> > > forwarding state). After bringing up 10-15 vifs, dom0 panics.
> > > 
> > > I could _not_ reproduce this with massive unicast traffic. The
> > > problem disappears if I set "net.ipv4.icmp_echo_ignore_broadcasts=1"
> > > in all domains. Maybe the problem arises if too many domUs answer
> > > broadcasts at the same time (collisions?).
> > 
> > 
> > More testing: again doing a 
> > 
> >   [root@domUtest01 ~]# ping -f -b 192.168.0.255
> > 
> > into the bridged vif-subnet. With all domains having 
> > "net.ipv4.icmp_echo_ignore_broadcasts=1" (so no one answers 
> > the pings) everything is fine. When I switch the pinging 
> > domUtest01 itself (and _only_ that domain) to 
> > "net.ipv4.icmp_echo_ignore_broadcasts=0", dom0 immediately 
> > panics (if there are 15-20 domUs in that bridged subnet).
> > 
> > Another test: putting dom0's vif0.0 on the bridge too and 
> > pinging from dom0. In that case, all domains needed to have 
> > "net.ipv4.icmp_echo_ignore_broadcasts=0" before I got my oops.
> > 
> > The oopses happen in different places; not all of them contain 
> > "net_rx_action" (all are "Fatal exception in interrupt"). These "dumps"
> > may contain typos because I copied them from the monitor by hand:
> > 
> >   [...]
> >   error_code
> >   kfree_skbmem
> >   __kfree_skb
> >   net_rx_action
> >   tasklet_action
> >   __do_softirq
> >   soft_irq
> >   irq_exit
> >   do_IRQ
> >   evtchn_do_upcall
> >   hypervisor_callback
> >   __wake_up
> >   sock_def_readable
> >   unix_stream_sendmsg
> >   sys_sendto
> >   sys_send
> >   sys_socketcall
> >   syscall_call
> > 
> > or
> > 
> >   [...]
> >   error_code
> >   tasklet_action
> >   __do_softirq
> >   soft_irq
> >   irq_exit
> >   do_IRQ
> >   evtchn_do_upcall
> >   hypervisor_callback
> > 
> > or
> > 
> >   [...]
> >   error_code
> >   tasklet_action
> >   __do_softirq
> >   soft_irq
> >   evtchn_do_upcall
> >   hypervisor_callback
> >   cpu_idle
> >   start_kernel
> > 
> > or
> > 
> >   [...]
> >   error_code
> >   kfree_skbmem
> >   __kfree_skb
> >   net_rx_action
> >   tasklet_action
> >   __do_softirq
> >   soft_irq
> >   irq_exit
> >   do_IRQ
> >   evtchn_do_upcall
> >   hypervisor_callback
> >   __mmx_memcpy
> >   memcpy
> >   dup_task_struct
> >   copy_process
> >   do_fork
> >   sys_clone
> >   syscall_call
> > 
> > or
> > 
> >   [...]
> >   error_code
> >   kfree_skbmem
> >   __kfree_skb
> >   net_rx_action
> >   tasklet_action
> >   __do_softirq
> >   soft_irq
> >   irq_exit
> >   do_IRQ
> >   evtchn_do_upcall
> >   hypervisor_callback
> >   __wake_up
> >   sock_def_readable
> >   unix_stream_sendmsg
> >   sys_sendto
> >   sys_send
> >   sys_socketcall
> >   syscall_call
> > 
> > or
> > 
> >   [...]
> >   error_code
> >   kfree_skbmem
> >   __kfree_skb
> >   net_rx_action
> >   tasklet_action
> >   __do_softirq
> >   do_softirq
> >   local_bh_enable
> >   dev_queue_xmit
> >   nf_hook_slow
> >   ip_finish_output
> >   dst_output
> >   ip_push_pending_frames
> >   raw_sendmsg
> >   sock_sendmsg
> >   sys_sendmsg
> >   sys_socketcall
> >   syscall_call
> > 
> > and more ...
> > 
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-devel
> > 


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

