
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9



Steven Smith <sos22@xxxxxxxxxxxxxxxx> wrote on 08/12/2006 03:32:23 AM:

> > > > Here is what I have found so far in trying to chase down the cause of
> > > > the slowdown.
> > > > The qemu-dm process is running 99.9% of the CPU on dom0.
> > > That seems very wrong.  When I try this, the device model is almost
> > > completely idle.  Could you see what strace says, please, or if there
> > > are any strange messages in the /var/log/qemu-dm. file?
> > Looks like I jumped the gun in relating the 99.9% CPU usage for qemu-dm
> > and the network.  I start up the HVM domain and without running any tests
> > qemu-dm is chewing up 99.9% of the CPU in dom0.  So it appears that the
> > 100% CPU qemu usage is a problem by itself.  Looks like the same problem
> > Harry Butterworth is seeing.
> qemu-dm misbehaving could certainly lead to the netif going very
> slowly.

Agreed.  I applied the patch you sent to Harry.  It appears to fix the 99.9%
CPU usage problem.
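
In case it's useful to anyone else chasing this, the device model's CPU time
can be watched with something like the following (a quick sketch only; it
assumes a single qemu-dm process, a pidof lookup, and the usual USER_HZ of
100, so ticks per second come out roughly as a percentage):

#!/usr/bin/env python
# Sketch: sample utime+stime of qemu-dm from /proc/<pid>/stat once a
# second and print the approximate CPU usage in percent.
import os, time

# Assumption: exactly one qemu-dm is running; take the first pid found.
pid = os.popen("pidof qemu-dm").read().split()[0]

def cpu_ticks(pid):
    with open("/proc/%s/stat" % pid) as f:
        fields = f.read().split()
    return int(fields[13]) + int(fields[14])    # utime + stime, in clock ticks

prev = cpu_ticks(pid)
while True:
    time.sleep(1)
    cur = cpu_ticks(pid)
    print("qemu-dm CPU: ~%d%%" % (cur - prev))  # 100 ticks/s, so ticks ~= percent
    prev = cur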

> > > 2) How often is the event channel interrupt firing according to
> > >    /proc/interrupts?  I see about 50k-150k/second.
> > I'm seeing ~500/s when netpipe-tcp reports decent throughput at smaller
> > buffer sizes and then ~50/s when the throughput drops at larger buffer
> > sizes.
> How large do they have to be to cause problems?

I'm noticing a drop-off in throughput at a buffer size of 3069 bytes.  Here is
a snippet of the netpipe-tcp output.

 43:    1021 bytes    104 times -->     20.27 Mbps in     384.28 usec
 44:    1024 bytes    129 times -->     20.14 Mbps in     387.86 usec
 45:    1027 bytes    129 times -->     20.17 Mbps in     388.46 usec
 46:    1533 bytes    129 times -->     22.94 Mbps in     509.95 usec
 47:    1536 bytes    130 times -->     23.00 Mbps in     509.48 usec
 48:    1539 bytes    130 times -->     23.12 Mbps in     507.92 usec
 49:    2045 bytes     66 times -->     30.02 Mbps in     519.66 usec
 50:    2048 bytes     96 times -->     30.50 Mbps in     512.35 usec
 51:    2051 bytes     97 times -->     30.61 Mbps in     511.24 usec
 52:    3069 bytes     98 times -->      0.61 Mbps in   38672.52 usec
 53:    3072 bytes      3 times -->      0.48 Mbps in   48633.50 usec
 54:    3075 bytes      3 times -->      0.48 Mbps in   48542.50 usec
 55:    4093 bytes      3 times -->      0.64 Mbps in   48516.35 usec
 56:    4096 bytes      3 times -->      0.65 Mbps in   48449.48 usec
 57:    4099 bytes      3 times -->      0.64 Mbps in   48575.84 usec

The throughput remains low for the remainder of the buffer sizes, which go up
to 49155 bytes before the benchmark exits because the requests are taking more
than a second.
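
(For reference, the interrupts/second figures quoted above can be read by
sampling /proc/interrupts once a second, along these lines.  This is only a
sketch; the substring used to pick out the event channel line is an
assumption, so adjust it to whatever that line is actually called in the
domU.)

#!/usr/bin/env python
# Sketch: sample /proc/interrupts once a second and print the firing
# rate of the line(s) matching MATCH.
import time

MATCH = "xen"   # assumed substring identifying the event channel interrupt

def count():
    total = 0
    with open("/proc/interrupts") as f:
        for line in f:
            if MATCH in line:
                # fields[0] is "NNN:"; the per-CPU counts follow, then the
                # chip and device names, which the isdigit() test skips.
                for field in line.split()[1:]:
                    if field.isdigit():
                        total += int(field)
    return total

prev = count()
while True:
    time.sleep(1)
    cur = count()
    print("%d interrupts/second" % (cur - prev))
    prev = cur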

> > > The other thing is that these drivers seem to be very sensitive to
> > > kernel debugging options in the domU.  If you've got anything enabled
> > > in the kernel hacking menu it might be worth trying again with that
> > > switched off.
> > Kernel debugging is on.  I also have Oprofile enabled.  I'll build a kernel
> > without those and see if it helps.
> Worth a shot.  It shouldn't cause the problems with qemu, though.

I built a kernel without kernel debugging and without instrumentation.  The
results were very similar.

 43:    1021 bytes    104 times -->     20.27 Mbps in     384.28 usec
 44:    1024 bytes    129 times -->     20.30 Mbps in     384.91 usec
 45:    1027 bytes    130 times -->     20.19 Mbps in     388.02 usec
 46:    1533 bytes    129 times -->     22.97 Mbps in     509.25 usec
 47:    1536 bytes    130 times -->     23.02 Mbps in     509.12 usec
 48:    1539 bytes    131 times -->     23.04 Mbps in     509.65 usec
 49:    2045 bytes     65 times -->     30.41 Mbps in     513.07 usec
 50:    2048 bytes     97 times -->     30.49 Mbps in     512.49 usec
 51:    2051 bytes     97 times -->     30.45 Mbps in     513.85 usec
 52:    3069 bytes     97 times -->      0.75 Mbps in   31141.34 usec
 53:    3072 bytes      3 times -->      0.48 Mbps in   48596.50 usec
 54:    3075 bytes      3 times -->      0.48 Mbps in   48876.17 usec
 55:    4093 bytes      3 times -->      0.64 Mbps in   48489.33 usec
 56:    4096 bytes      3 times -->      0.64 Mbps in   48606.63 usec
 57:    4099 bytes      3 times -->      0.64 Mbps in   48568.33 usec

Again, the throughput remains low for the remainder of the buffer sizes, which
go up to 49155 bytes.
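
To avoid eyeballing these tables for the knee, a throwaway parser along these
lines works on the output format shown above (the column layout is assumed
from the snippets, and the half-of-best-so-far threshold is arbitrary):

#!/usr/bin/env python
# Sketch: read netpipe-tcp output on stdin and report the first buffer
# size whose throughput falls below half of the best seen so far.
import re, sys

pattern = re.compile(r"\s*\d+:\s+(\d+) bytes\s+\d+ times -->\s+([\d.]+) Mbps")
best = 0.0
for line in sys.stdin:
    m = pattern.match(line)
    if not m:
        continue
    size, mbps = int(m.group(1)), float(m.group(2))
    if mbps > best:
        best = mbps
    elif best and mbps < best / 2:
        print("drop-off at %d bytes: %.2f Mbps (best so far %.2f Mbps)"
              % (size, mbps, best))
        break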

The above tests were run against netpipe-tcp running on another machine.  When
I run against netpipe-tcp running in dom0, I get better throughput but also
some strange behavior.  Again, a snippet of the output.

 43:    1021 bytes    606 times -->    140.14 Mbps in      55.58 usec
 44:    1024 bytes    898 times -->    141.16 Mbps in      55.35 usec
 45:    1027 bytes    905 times -->    138.93 Mbps in      56.40 usec
 46:    1533 bytes    890 times -->    133.74 Mbps in      87.45 usec
 47:    1536 bytes    762 times -->    132.82 Mbps in      88.23 usec
 48:    1539 bytes    756 times -->    132.01 Mbps in      88.95 usec
 49:    2045 bytes    376 times -->    172.36 Mbps in      90.52 usec
 50:    2048 bytes    552 times -->    177.41 Mbps in      88.07 usec
 51:    2051 bytes    568 times -->    176.12 Mbps in      88.85 usec
 52:    3069 bytes    564 times -->      0.44 Mbps in   53173.74 usec
 53:    3072 bytes      3 times -->      0.44 Mbps in   53249.32 usec
 54:    3075 bytes      3 times -->      0.50 Mbps in   46639.64 usec
 55:    4093 bytes      3 times -->    321.94 Mbps in      97.00 usec
 56:    4096 bytes    515 times -->    287.05 Mbps in     108.87 usec
 57:    4099 bytes    459 times -->      2.69 Mbps in   11615.94 usec
 58:    6141 bytes      4 times -->      0.63 Mbps in   74535.64 usec
 59:    6144 bytes      3 times -->      0.35 Mbps in  133242.01 usec
 60:    6147 bytes      3 times -->      0.35 Mbps in  133311.47 usec
 61:    8189 bytes      3 times -->      0.62 Mbps in  100391.51 usec
 62:    8192 bytes      3 times -->      1.05 Mbps in   59535.66 usec
 63:    8195 bytes      3 times -->      0.63 Mbps in   99598.69 usec
 64:   12285 bytes      3 times -->      0.47 Mbps in  199974.34 usec
 65:   12288 bytes      3 times -->      4.70 Mbps in   19933.34 usec
 66:   12291 bytes      3 times -->      4.70 Mbps in   19933.30 usec
 67:   16381 bytes      3 times -->      0.71 Mbps in  176984.35 usec
 68:   16384 bytes      3 times -->      0.93 Mbps in  134929.50 usec
 69:   16387 bytes      3 times -->      0.93 Mbps in  134930.33 usec

The throughput drops at a buffer size of 3069 bytes as in the prior runs,
recovers at 4093 and 4096 bytes, and then drops off again for the remainder of
the test.

I don't know offhand why the throughput drops off.  I'll look into it.  Any
tips would be helpful.
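
In case anyone wants to reproduce the measurement at a single buffer size
without pulling in NetPIPE, a crude ping-pong stand-in is sketched below.  The
port number and repetition count are arbitrary and it is not a replacement for
NPtcp; run "server" in the domU and "client <host> <bytes>" on the other
machine.

#!/usr/bin/env python
# Sketch: fixed-size TCP ping-pong.  The client sends a buffer of the
# requested size, the server echoes it back, and the client reports the
# average round-trip time and throughput.
import socket, sys, time

PORT = 7777   # arbitrary
REPS = 100    # arbitrary

def recv_exact(sock, n):
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise RuntimeError("connection closed")
        data += chunk
    return data

if sys.argv[1] == "server":
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", PORT))
    srv.listen(1)
    conn, _ = srv.accept()
    size = int(recv_exact(conn, 16))          # 16-byte size header from client
    try:
        while True:
            conn.sendall(recv_exact(conn, size))   # echo each buffer back
    except RuntimeError:
        pass                                  # client went away; we're done
else:
    host, size = sys.argv[2], int(sys.argv[3])
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, PORT))
    sock.sendall(("%16d" % size).encode())
    buf = b"x" * size
    start = time.time()
    for _ in range(REPS):
        sock.sendall(buf)
        recv_exact(sock, size)
    elapsed = time.time() - start
    print("%d bytes: %.2f usec/round trip, %.2f Mbps"
          % (size, elapsed / REPS * 1e6,
             size * 2 * 8 * REPS / elapsed / 1e6))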

For comparison, an FV domU running netpipe-tcp to another machine ramps up to
about 20 Mbps at a buffer size of around 128 KB and then tapers off to 17 Mbps.
A PV domU ramps up to around 750 Mbps at a buffer size of about 2 MB and
maintains that throughput up to the 8 MB buffer where the test stopped.  On
dom0, netpipe-tcp running to another machine ramps up to around 850 Mbps at
buffer sizes from 3 MB to 8 MB, where the test stopped.

Steve D.




 

