
Re: [Xen-devel] Xen cluster n/w performance (again!)



> 
> Is anyone else still seeing network performance anomalies?
> 
> I still cannot get over 500Mbps into xenU with any hardware that I have.
> This holds using kernels I build, or using kernel binaries from
> the 2.0.1 tarball.
> 
> I tried tuning an e1000 driver, which greatly reduced the interrupt count,
> but had no effect on bandwidth.
> 
> I can get 600 to 750 Mbps into domain-0 from a stock linux host. 
> That rate then drops to around 500 after starting the etherbridge.
> Running top on Domain-0 claims the domain is over 60% idle.

With domain 0 otherwise idle, what happens if you run 'slurp'
(attached)?

The only time I've ever seen the bridge burn CPU is when some of
its delay parameters are set to zero, in which case the bridge can
periodically loop.

> This is with e1000 and bcm5703 NICs on IBM x335, Dell 1650 and
> other platforms, and a variety of CPU clock speeds.
> Running iperf under stock linux 2.4.25 gets 940Mbps between any of them.

What's the spec of the most modern machines you've tried Xen on? 

Ian



/******************************************************************************
 * slurp.c
 * 
 * Slurps spare CPU cycles and prints a percentage estimate every second.
 */

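#define _GNU_SOURCE   /* for sched_setaffinity() in <sched.h> */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <err.h>
#include <sched.h>    /* sched_setaffinity() */
#include <unistd.h>   /* getpid() */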


/* rpcc: get full 64-bit Pentium TSC value */
static __inline__ unsigned long long int rpcc(void) 
{
    unsigned int __h, __l;
    __asm__ __volatile__ ("rdtsc" :"=a" (__l), "=d" (__h));
    return (((unsigned long long)__h) << 32) + __l;
}


/*
 * find_cpu_speed:
 *   Interrogates /proc/cpuinfo for the processor clock speed.
 * 
 *   Returns: speed of processor in MHz, rounded down to nearest whole MHz.
 */
#define MAX_LINE_LEN 50
int find_cpu_speed(void)
{
    FILE *f;
    char s[MAX_LINE_LEN], *a, *b;

    if ( (f = fopen("/proc/cpuinfo", "r")) == NULL ) goto out;

    while ( fgets(s, MAX_LINE_LEN, f) )
    {
        if ( strstr(s, "cpu MHz") )
        {
            /* Find the start of the speed value, and stop at the dec point. */
            if ( !(a=strpbrk(s,"0123456789")) || !(b=strpbrk(a,".")) ) break;
            *b = '\0';
            fclose(f);
            return(atoi(a));
        }
    }

 out:
    fprintf(stderr, "find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz\n");
    exit(1);
}


int main( int argc, char **argv )
{
    int mhz, cpu=-1;
    volatile int i; /* volatile so an optimising compiler keeps the delay loop */

    /*
     * no_preempt_estimate is our estimate, in clock cycles, of how long it
     * takes to execute one iteration of the main loop when we aren't
     * preempted. 50000 cycles is an overestimate, which we want because:
     *  (a) On the first pass through the loop, diff will be almost 0,
     *      which will knock the estimate down to <40000 immediately.
     *  (b) It's safer to approach real value from above than from below --
     *      note that this algorithm is unstable if n_p_e gets too small!
     */
    unsigned int no_preempt_estimate = 50000;

    /*
     * prev  = timestamp on previous iteration;
     * this  = timestamp on this iteration;
     * diff  = difference between the above two stamps;
     * start = timestamp when we last printed CPU % estimate;
     */
    unsigned long long int prev, this, diff, start;

    /*
     * preempt_time = approx. cycles we've been preempted for since last stats
     *                display.
     */
    unsigned long long int preempt_time = 0;

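    /* Check for too many arguments first; otherwise that test is unreachable. */
    if ( argc > 2 )
    {
        fprintf(stderr, "usage: %s [cpu]\n", argv[0]);
        exit(1);
    }
    if ( argc > 1 )
        cpu = atoi(argv[1]);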


    /* Required in order to print intermediate results at fixed period. */
    mhz = find_cpu_speed();
    printf("CPU speed = %d MHz, using cpu %d\n", mhz, cpu);

    if (cpu>=0)
    {
        int rc;
        unsigned long bs = 1UL << cpu; /* one bit per CPU */

        /* The length argument is the size of the mask in bytes, not bits. */
        rc = sched_setaffinity( getpid(), sizeof(bs), (void *)&bs );

        if (rc)
            err(1, "sched_setaffinity failed");
    }


    start = prev = rpcc();

    for ( ; ; )
    {
        /*
         * By looping for a while here we hope to reduce the effect of getting
         * preempted in the critical "timestamp swapping" section of the loop.
         * In addition, it should ensure that 'no_preempt_estimate' stays
         * reasonably large which helps keep this algorithm stable.
         */
        for ( i = 0; i < 10000; i++ );

        /*
         * The critical bit! Getting preempted here will shaft us a bit,
         * but the loop above should make this a rare occurrence.
         */
        this = rpcc();
        diff = this - prev;
        prev = this;

        /* if ( diff > (1.5 * no_preempt_estimate) ) */
        if ( diff > no_preempt_estimate + (no_preempt_estimate>>1) )
        {
            /* We were probably preempted for a while. */
            preempt_time += diff - no_preempt_estimate;            
        }
        else
        {
            /*
             * Looks like we weren't preempted -- update our time estimate:
             * New estimate = 0.75*old_est + 0.25*curr_diff
             */
            no_preempt_estimate =
                (no_preempt_estimate>>1) + (no_preempt_estimate>>2) +
                (diff>>2);
        }
            
        /* Dump CPU time every second. */
        if ( (this - start) / mhz > 1000000 ) 
        { 
            /* Subtract in integer arithmetic first, then convert to double. */
            printf("Slurped %.2f%% CPU, cpu %d\n", 
                   100.0*((double)(this-start-preempt_time)/
                          (double)(this-start)),
                   cpu);
            start = this;
            preempt_time = 0;
        }
    }

    return(0);
}
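
The estimator at the heart of the main loop can be pulled out and exercised
on its own. The following is a minimal sketch, not part of slurp.c: the names
slurp_state, slurp_step and slurped_percent are hypothetical, and the
estimate is widened to 64 bits for simplicity (the original keeps it in an
unsigned int). It applies the same 1.5x preemption threshold and the same
shift-based 0.75/0.25 moving average as the loop above.

```c
#include <stdint.h>

/* Hypothetical container for the two running values used by the loop. */
struct slurp_state {
    uint64_t no_preempt_estimate; /* est. cycles per unpreempted iteration */
    uint64_t preempt_time;        /* cycles attributed to preemption so far */
};

/* Feed one measured iteration time (in cycles) into the estimator. */
static void slurp_step(struct slurp_state *st, uint64_t diff)
{
    uint64_t est = st->no_preempt_estimate;

    if (diff > est + (est >> 1)) {
        /* More than 1.5x the estimate: assume we were preempted. */
        st->preempt_time += diff - est;
    } else {
        /* New estimate = 0.75*old_est + 0.25*diff, using shifts only. */
        st->no_preempt_estimate = (est >> 1) + (est >> 2) + (diff >> 2);
    }
}

/* Share of the elapsed cycles that was not lost to preemption, in percent. */
static double slurped_percent(uint64_t elapsed, uint64_t preempt_time)
{
    if (elapsed == 0 || preempt_time > elapsed)
        return 0.0;
    return 100.0 * (double)(elapsed - preempt_time) / (double)elapsed;
}
```

Starting from an estimate of 50000, a first call with diff = 0 gives
25000 + 12500 + 0 = 37500, which is the "knocked down to <40000" behaviour
described in the comment in main().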

