I had a similar problem and got a kind response from Matej, but I was
busy and couldn't try what he suggested. I haven't yet tried the latest
Xen 4/dom0 checked out with a revision control tool, but I suspect that
would be the next logical step. I'd appreciate it if you shared your
findings on this matter. I have attached my previous post.
Regards,
-- Mahdavian
Sauro Saltini wrote:
Hi everybody.
I've got a strange problem with PV networking on Linux.
My current configuration is :
- Xen 4.0.0
- DOM0 kernel 2.6.31.13
- DOMU kernels either 2.6.31.13 or 2.6.36-rc3 (vanilla)
One of my DomUs is intended to act as a router/firewall for all the
other ones.
I have configured 2 distinct bridges in Dom0:
br0 - connects the firewall DomU's "external" NIC to the external-facing
host NIC (physical)
br1 - connects the "internal" virtual NIC of the firewall DomU (eth1) with
the other DomUs' virtual NICs.
Each "guest" DomU has the firewall DomU's "internal" address defined
as its default gateway. For now the firewall acts simply as a NAT gateway,
with ip_forward active and a single NAT rule to SNAT outgoing packets
to its own external IP.
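
For readers reproducing this setup, the NAT gateway described above might be
sketched roughly as follows. This is only an illustration, not the poster's
actual configuration: the function name is hypothetical, and the interface
name eth0 and the address 203.0.113.10 are placeholders. The function prints
the commands instead of running them, so it is safe to try anywhere.

```shell
#!/bin/sh
# Print (do not run) the commands for a minimal NAT gateway.
# $1 = external interface, $2 = external IP; both are placeholder values
# chosen for illustration, not taken from the original post.
nat_gateway_commands() {
    ext_if=$1
    ext_ip=$2
    # Allow the kernel to forward IPv4 packets between its interfaces.
    echo "sysctl -w net.ipv4.ip_forward=1"
    # Rewrite the source address of packets leaving via the external NIC.
    echo "iptables -t nat -A POSTROUTING -o $ext_if -j SNAT --to-source $ext_ip"
}

nat_gateway_commands eth0 203.0.113.10
```

Piping the output to sh as root on the firewall DomU would apply the two
settings.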
I first installed the fw DomU as an HVM domain (NICs = ioemu) with
Slackware 13.0 and tested the whole thing by connecting from one of the
other DomUs to the external network, and it all worked smoothly.
As soon as I converted the fw DomU to a PV domain (using either
2.6.31.13 or 2.6.36-rc3 kernels with PV drivers), something changed in
a weird way...
I can still ping the firewall DomU both from the "internal" DomU network
and from the external LAN, but packets from a DomU can't reach the
external network anymore!
Running "tcpdump -nvvi" on both of the firewall's NICs while pinging an
external host from one of the other DomUs reveals that packets arrive
on the firewall, are correctly NATted and appear on the externally
connected interface, but then simply disappear!
On Dom0, "tcpdump -nvvi br0" (br0 = external bridge) never shows any
traffic!
I've already tried turning tx checksum offloading off (ethtool
-K <nic> tx off) on all the involved interfaces, without any success.
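
As a rough sketch of applying that ethtool setting to several interfaces at
once, the helper below prints one "ethtool -K <nic> tx off" command per
interface name. The function name is hypothetical and the interface names in
the example call are assumptions; substitute the NICs, vifs and bridges
actually involved.

```shell
#!/bin/sh
# Print one "ethtool -K <nic> tx off" command for each interface name
# passed as an argument. Nothing is executed here.
tx_offload_off_commands() {
    for nic in "$@"; do
        echo "ethtool -K $nic tx off"
    done
}

# Example interface names only (assumed, not from the original post).
tx_offload_off_commands eth0 eth1
```

Piping the output to sh as root would apply the settings. Note that the guest
NICs have to be handled inside the DomU and the vif/bridge interfaces in Dom0.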
Please help...
Many thanks in advance.
Sauro Saltini.
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
Hello,
this looks similar to the infamous incorrect checksum error:
http://wiki.xensource.com/xenwiki/XenFaq#head-4ce9767df34fe1c9cf4f85f7e07cb10110eae9b7.
Is there anything in the Dom0 logs?
How recent is the 2.6.31.13 pvops Dom0 kernel?
You can check for the wrong checksum problem with tcpdump from Dom0 (tcpdump -nvvi
interface). In some cases it can be solved with ethtool (ethtool -K eth0 tx
off); in other cases one has to use another Dom0 kernel (the latest pvops 2.6.32.x)
or patch the one in use.
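
One possible way to make the tcpdump check less manual is to count the lines
that report a bad checksum. The sketch below assumes that verbose tcpdump
output marks such packets with the words "bad cksum" or "incorrect"; the exact
wording can differ between tcpdump versions, so treat the pattern as an
assumption. The function name is hypothetical.

```shell
#!/bin/sh
# Read verbose tcpdump text on stdin and print how many lines report a
# bad checksum. The pattern is an assumption about tcpdump's wording.
count_bad_checksums() {
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so "|| true" keeps the function's status at 0 in that case.
    grep -c -E 'bad( [a-z]+)? cksum|incorrect' || true
}
```

For example, "tcpdump -nvv -c 200 -i br0 | count_bad_checksums" would inspect
200 packets on br0. A large count on a virtual interface points to the
checksum offload problem described above.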
Regards
Matej
-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of S.M.R. Mahdavian
Sent: Tuesday, July 13, 2010 9:04 AM
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Networking bug in xen 4.0.0 ?
Hello all.
I have observed a strange behaviour in Xen 4.0.0 networking. I compiled Xen on
an Ubuntu 9.10 host and created two virtual machines on it ("Router" and "Server")
with the following network configuration:
Xen host 192.168.1.5
+----------------------------------------------------------+
|                                                          |
|  Server                 Router                           |      Client
|  192.168.2.20           192.168.2.1  192.168.1.1         |      192.168.1.10
|  +----------+  +-----+  +----------------------+    host |      +----------+
|  |     eth0 |--| sw0 |--| eth1            eth0 |--- eth0 |------| eth0     |
|  +----------+  +-----+  +----------------------+         |      +----------+
|             Linux bridge                                 |
|                                                          |
+----------------------------------------------------------+
Both "Router" and "Server" are Ubuntu 9.04 with 2.6.24-27-xen kernel. The two
domU's have an old-style xen kernel, whereas dom0 has a new 2.6.31.13 pvops
kernel. As can be seen from the figure above, "Router" has two interfaces.
One interface is connected to the "host's" eth0 and the other one is
internally connected to "Server" via a Linux bridge named sw0 that I created. IP
forwarding is enabled on "Router" so that packets coming from the external
"Client" can be forwarded to "Server". "Client" is running Windows XP.
The strange behavior is that packets are forwarded correctly from "Client" to
"Server" when packet sizes are small, but if they are larger than a certain
size, then packets coming out of Router's eth1 do not reach the Server or the
linux bridge. In fact, tcpdump suggests that these packets do not make it to
vif1.1 (assuming that Router id number is 1). This happens only if packets
are *forwarded* by "Router". If packets are *generated* within Router, then
no problem is observed. Therefore pinging "Server" from "Client" using large
size packets fails, whereas pinging "Server" from "Router" is OK regardless
of the size of the packet.
I suspect that this is a bug in either xen 4.0.0 or the pvops kernel that
comes with it. Has anyone experienced something similar? Any recommendations?
Regards,
-- Mahdavian
P.S. The following shows some of the results from the experiment:
root@xen-host:~# xm list
Name                 ID   Mem VCPUs      State   Time(s)
Domain-0              0  3023     2     r-----     36.9
router                1   256     1     -b----      5.8
server                2   256     1     -b----      5.7
root@xen-host:~# brctl show
bridge name     bridge id               STP enabled     interfaces
eth0            8000.002354cd2c7f       no              peth0
                                                        vif1.0
sw0             8000.feffffffffff       no              vif1.1
                                                        vif2.0
# Ping from "Client" using packet size of 157 bytes (that's the limit!):
C:\Documents and Settings\Admin>ping -l 157 192.168.2.20
Pinging 192.168.2.20 with 157 bytes of data:
Reply from 192.168.2.20: bytes=157 time<1ms TTL=63
Reply from 192.168.2.20: bytes=157 time<1ms TTL=63
Reply from 192.168.2.20: bytes=157 time<1ms TTL=63
Reply from 192.168.2.20: bytes=157 time<1ms TTL=63
# Ping from "Client" using packet size of 158 bytes:
C:\Documents and Settings\Admin>ping -l 158 192.168.2.20
Pinging 192.168.2.20 with 158 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.
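
The 157/158 byte limit above was found by hand. A bisection like the sketch
below could locate such a limit automatically. It assumes that every size up
to some limit passes and every larger size fails. All names are hypothetical;
the upper bound 1472 assumes a standard 1500-byte MTU (1500 - 20 IP - 8 ICMP),
and probe_ping assumes a Linux client with iputils ping rather than the
Windows XP client used in this test.

```shell
#!/bin/sh
# Print the largest size in [lo, hi] for which "probe SIZE" succeeds.
# Usage: largest_passing_size PROBE LO HI
# Uses the global variables probe, lo, hi and mid; the probe must not
# modify them.
largest_passing_size() {
    probe=$1; lo=$2; hi=$3
    # If even the smallest size fails there is nothing to report.
    if ! "$probe" "$lo"; then
        echo "none"
        return 1
    fi
    # Invariant: size lo passes, every size above hi is treated as failing.
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))
        if "$probe" "$mid"; then
            lo=$mid
        else
            hi=$(( mid - 1 ))
        fi
    done
    echo "$lo"
}

# Real probe (not called here): one echo request with the given payload.
probe_ping() {
    ping -c 1 -W 2 -s "$1" 192.168.2.20 >/dev/null 2>&1
}

# Simulated probe reproducing the limit reported in this post.
simulated_probe() {
    [ "$1" -le 157 ]
}

largest_passing_size simulated_probe 1 1472   # prints 157
```

On a real client, "largest_passing_size probe_ping 1 1472" would run the same
search against the "Server" address.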
# tcpdump suggests that packets do leave the router interface:
root@router:~# tcpdump -n -v -i eth1
tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
11:18:02.787099 IP (tos 0x0, ttl 127, id 560, offset 0, flags [none], proto
ICMP (1), length 186) 192.168.1.10 > 192.168.2.20: ICMP echo request, id 512,
seq 6400, length 166
11:18:07.785175 arp who-has 192.168.2.20 tell 192.168.2.1
11:18:07.785280 arp reply 192.168.2.20 is-at 00:16:3e:1e:67:9f
11:18:08.043239 IP (tos 0x0, ttl 127, id 564, offset 0, flags [none], proto
ICMP (1), length 186) 192.168.1.10 > 192.168.2.20: ICMP echo request, id 512,
seq 6656, length 166
11:18:13.516628 IP (tos 0x0, ttl 127, id 566, offset 0, flags [none], proto
ICMP (1), length 186) 192.168.1.10 > 192.168.2.20: ICMP echo request, id 512,
seq 6912, length 166
11:18:19.003203 IP (tos 0x0, ttl 127, id 568, offset 0, flags [none], proto
ICMP (1), length 186) 192.168.1.10 > 192.168.2.20: ICMP echo request, id 512,
seq 7168, length 166
# tcpdump suggests that the link between the frontend and the backend
# interface is broken:
root@xen-host:~# tcpdump -n -i vif1.1
tcpdump: WARNING: vif1.1: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vif1.1, link-type EN10MB (Ethernet), capture size 96 bytes
11:18:07.552453 ARP, Request who-has 192.168.2.20 tell 192.168.2.1, length 28
11:18:07.552524 ARP, Reply 192.168.2.20 is-at 00:16:3e:1e:67:9f, length 28
# No problem is observed when packets are *generated* rather than *forwarded*
# by "Router".
root@router:~# ping -c 4 -s 1400 192.168.2.20
PING 192.168.2.20 (192.168.2.20) 1400(1428) bytes of data.
1408 bytes from 192.168.2.20: icmp_seq=1 ttl=64 time=0.885 ms
1408 bytes from 192.168.2.20: icmp_seq=2 ttl=64 time=0.068 ms
1408 bytes from 192.168.2.20: icmp_seq=3 ttl=64 time=0.066 ms
1408 bytes from 192.168.2.20: icmp_seq=4 ttl=64 time=0.068 ms