WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-users

Re: [Xen-users] Network Issues on Migration

On Fri, Jan 09, 2009 at 02:17:34PM -0500, Wendell Dingus wrote:
> 
> I've read and experimented extensively and being in desperate need of 
> "finishing" this setup and getting it deployed live, would like to see if 
> anyone has any suggestions on the last hangup we seem to have. 
> 
> Two SuperMicro 1U servers with dual quad-core CPUs and 16GB RAM each. CentOS 
> 5.2 x86_64 and its xen implementation. The only thing non "stock" CentOS at 
> this point are the Intel IGB drivers. The RHEL/CentOS drivers for Intel IGB 
> appear to have a bug with DHCP over a bridged interface which the latest 
> drivers downloaded straight from Intel cured for us. 
> 
> Anyway, both are attached to shared FC storage and are doing RHCS with both 
> IP and disk-based quorum. CLVMD with a shared VG for creating LV's in as 
> containers for VMs. That part is all working very well. 
> 
> Each DOM0 has 2 physical NICs and both are bridged. Additionally we added a 
> virbr0 as a bridged per-DOM0 local network as well. 
> 
> When any VM boots up it can ping and traceroute on any of its respective 
> networks perfectly. Inbound/outbound data flow of any kind appears perfect as 
> well. Once a VM is migrated or live-migrated to the other DOM0 though the 
> ability to ping or traceroute ceases. Sessions via ssh or httpd either 
> inbound or outbound continue to work fine though. 
> 
> When a VM boots I see this in dmesg: 
> netfront: Initialising virtual ethernet driver. 
> netfront: device eth0 has flipping receive path. 
> 
> I read something about a CRC problem and had each of them do "ethtool -K 
> eth{n} tx off" but don't think that was necessary in this instance, I've 
> never seen any error messages about CRC errors. The described problem and 
> solution I followed was not heavily detailed and it was just an attempt to 
> see if that helped with the problem. 
> 
> The following was added to the end of /etc/sysctl.conf on both DOM0's only 
> (per the excellent wiki article): 
> net.ipv4.icmp_echo_ignore_broadcasts = 1 
> net.ipv4.conf.all.accept_redirects = 0 
> net.ipv4.conf.all.send_redirects = 0 
> 
> The other oddity about this is that a VM started on server1 and live migrated 
> to server2, a running ping only pauses a short while then picks right back up 
> and continues to be successful. Migrating it back to server1 or initially 
> starting a VM on server2 and migrating it to server1 is where the ping 
> "stuck" issue comes into play. We were very careful and documented well as we 
> installed both boxes, in an attempt to keep them as identical as possible. I 
> fear this behavior proves that's not the case though, ugh... 
> 
> After migrating from 2 to 1 and then trying a ping (and waiting a good long 
> while before ctrl-c'ing this): 
> PING 192.168.77.1 (192.168.77.1) 56(84) bytes of data. 
> 64 bytes from 192.168.77.1: icmp_seq=1 ttl=64 time=0.000 ms 
> 
> --- 192.168.77.1 ping statistics --- 
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms 
> rtt min/avg/max/mdev = 0.000/0.000/0.000/0.000 ms 
> 
> Very strange... Additionally a "service network restart" at this point 
> results in all interfaces going down, loopback being reinitialized and then 
> it hangs on trying to bring up eth0. I can ctrl-c it three times as it pauses 
> on each interface, then "ifconfig" and see all the IPs are still there. Still 
> can't ping but can "telnet google.com 80" for instance. Odd... 
> 
> So anyway, any pointers or suggestions you might have, would be greatly 
> appreciated... 
> 

https://www.redhat.com/archives/rhelv5-announce/2008-October/msg00000.html

Some entries from the RHEL 5.3 beta changelog:

+ Timer problems after migration were fixed
+ Lengthy network outage after migrations was fixed

Dunno if that's what you're seeing..
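
Since the migration only misbehaves in one direction and you suspect the two dom0s are not quite identical, it may also be worth diffing their settings directly. Below is a minimal sketch, assuming the output of "sysctl -a" from each dom0 has been saved to a text file in the usual "key = value" form. The function names are made up for illustration, and this is just one way to spot a difference, not a known fix for the problem.

```python
def parse_settings(text):
    """Parse 'key = value' lines (as printed by `sysctl -a`) into a dict."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank lines and anything that is not a setting
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(text_a, text_b):
    """Return {key: (value_a, value_b)} for every key whose value differs.

    A key present on only one side is reported with None for the other side.
    """
    a = parse_settings(text_a)
    b = parse_settings(text_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }
```

To use it, read the two saved files and pass their contents to diff_settings(); an empty result means the dumps agree. Some differences (hostnames, counters and the like) are expected. The same approach should work for any other tool that prints one setting per line, as long as the parser is adjusted to that tool's separator.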

-- Pasi

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
