RE: [Xen-users] Bad TCP accept performance

In my experience, small-packet performance (header congestion) is a critical issue on all networking gear. One example is an Ethernet switch that has to apply and strip 802.1Q VLAN and CoS tags, or re-mark DSCP. Small-business-class switches don’t support monitoring of the switching CPU, which makes it nearly impossible to gauge whether your gear is suffering from this. I am a dealer for a remote performance management suite of testing tools geared towards monitoring the performance of hybrid physical and virtual networks. It can detect (with a high degree of certainty) whether your network has gear that is susceptible to small-packet header congestion.

"Small packet congestion detected"

Summary

Congestion caused by densely arriving packet headers has been detected.

Recommended action

· Identify devices such as switches, gateways, etc. associated with the Layer 3 hop where the loss first appears.

· Assess the impact of the problem, i.e. determine whether you expect to have dense small packet bursts or streams across that segment.

· If possible, perform intrusive flooding tests across the segment to isolate the device or software responsible.

· Upgrade hardware or software of limiting device and/or turn off the software feature that is responsible.

Detailed explanation

This diagnostic involves a specific form of small packet loss that is attributed to some devices having difficulties with the handling of densely arriving packet headers. Unlike regular congestion which is sensitive to the amount of data, not the number of headers, this "header congestion" condition will affect applications specific to small packets, such as real-time voice and video, but only when there are many densely aggregated streams. A single voice stream is unlikely to generate this condition. The NIC, or some other device in the path, is unable to process headers at sufficiently high rates, and packet loss/corruption is the consequence.

Small packet congestion is distinct from regular congestion, which is attributed more to large packets filling queues/buffers at store-forward devices (e.g. routers) or receiving NICs.

Possible secondary messages

· "Limiting network processor or other small packet sensitive constriction detected"

· "May impact real-time traffic such as voice"
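To put rough numbers on the "number of headers, not amount of data" point in the explanation above, here is a minimal sketch of how many frames per second a device has to handle at a given link speed. It assumes standard Ethernet framing (20 bytes of preamble, start delimiter and inter-frame gap per frame) and a 1 Gbit/s link. The function name and the chosen frame sizes are illustrative and are not taken from the diagnostic tool.

```python
# Per-frame overhead on the wire for standard Ethernet:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def frames_per_second(link_bps: float, frame_bytes: int) -> float:
    """Upper bound on frames per second for one frame size.

    frame_bytes is the Ethernet frame including header and FCS
    (64 is the minimum, 1518 the usual untagged maximum).
    """
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


if __name__ == "__main__":
    # Assumed link speed: 1 Gbit/s.
    for size in (64, 512, 1518):
        rate = frames_per_second(1e9, size)
        print(f"{size:>5} B frames: {rate:>12,.0f} frames/s")
```

On a saturated 1 Gbit/s link this works out to roughly 1.49 million minimum-size frames per second versus about 81 thousand full-size frames, so a device that copes with full-size traffic may still have to process around 18 times as many headers when the traffic is all small packets.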

Effectively, Xen creates ‘virtual switches’ to connect the VMs. Performance is quite likely to suffer compared with bare metal, because each network connection has to traverse many layers of virtual bridging to reach the VM and again on the way back.

I don’t know whether this can be fixed by giving Dom0 more CPU access or by giving higher priority to the network processes (I’m not sure what they are named).

From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Carl Byström
Sent: Monday, May 23, 2011 12:31 PM
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Bad TCP accept performance

I've been running some simple tests trying to find out why the TCP accept() rate has been so low on my Xen guest.

The rate at which I can accept new TCP connections is about five times higher on a bare-metal machine than on my guest.

I have been using netperf with the TCP_CRR test to simulate this behavior.
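For readers unfamiliar with that test: TCP_CRR times a connect, a small request, a small response and a close for each transaction, so every transaction is made up almost entirely of small packets. A minimal Python sketch of the same pattern over loopback might look like the following. It is not netperf and not the script used for the measurements above; the function names, the 1-byte payloads and the connection count are assumptions for illustration, and interpreter overhead means the absolute numbers are only useful for comparing one machine against another.

```python
import socket
import threading
import time


def serve(listener: socket.socket, count: int) -> None:
    """Accept `count` connections and answer each 1-byte request with 1 byte."""
    for _ in range(count):
        conn, _addr = listener.accept()
        with conn:
            conn.recv(1)
            conn.sendall(b"y")


def measure_crr(count: int = 200) -> float:
    """Return connect/request/response transactions per second over loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(128)
    addr = listener.getsockname()

    server = threading.Thread(target=serve, args=(listener, count))
    server.start()

    start = time.perf_counter()
    for _ in range(count):
        # One transaction: connect, send request, read response, close.
        with socket.create_connection(addr) as client:
            client.sendall(b"x")
            client.recv(1)
    elapsed = time.perf_counter() - start

    server.join()
    listener.close()
    return count / elapsed


if __name__ == "__main__":
    print(f"{measure_crr():,.0f} transactions/s")
```

Running the same script on the bare-metal host and inside the guest gives a quick, rough comparison of accept rates without installing anything.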

I originally posted this question at Server Fault (http://serverfault.com/questions/272483/why-is-tcp-accept-performance-so-bad-under-xen) along with many more details on how I performed these tests.

After a suggestion from a user there, I decided to try this list. Judging from the number of views the question received at Server Fault and its being top-3 voted at Hacker News, I presume this issue is something a lot of users care about.

One user at HN also reported that this apparently is a known issue and is due to small packet performance, affecting both Xen and KVM.

After collecting feedback from SF and HN users, my question is: what can you do to improve small packet performance in Xen?

Is this a fundamentally difficult problem to solve with Xen or is there a "quick fix"?

Thanks!

Carl Byström

http://cgbystrom.com

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
