
Re: Possible bug? DOM-U network stopped working after fatal error reported in DOM0



> > > > But it seems like this patch is not stable enough yet and has an
> > > > issue of its own -- memory is not being properly released?
> > >
> > > I know. I've been working on improving it this morning and I'm
> > > attaching an updated version below.
> > >
> > Good news.
> > With this new patch, the NAS domU can serve the iSCSI disk without an
> > OOM panic, at least for a little while.
> > I'm going to keep it up and running for a while to see if it's stable
> > over time.
>
> Thanks again for all the testing. Do you see any difference
> performance wise?
I'm still on a *debug* kernel build to capture any potential panic --
none so far -- so no performance testing yet.
Since I'm a home user with a relatively lightweight workload, I haven't
observed any difference in daily usage so far.

I did some quick iperf3 testing just now (a sketch of the commands I
used follows the two scenarios below).
1. Between the NAS domU <=> the Linux dom0 running on an old i7-3770 based box.
The peak is roughly 12 Gbits/s when domU is the server.
But I do see a regression down to ~8.5 Gbits/s when I repeat the test in
short bursts.
The throughput recovers if I leave the system idle for a while.

When dom0 is the iperf3 server, the transfer rate is much lower, all the
way down to 1.x Gbits/s.
Sometimes I see the following kernel log message repeated during the
test, which likely contributes to the slowdown:
             interrupt storm detected on "irq2328:"; throttling interrupt source
Another thing that looks alarming is that the retransmission count is high:
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   212 MBytes  1.78 Gbits/sec  110    231 KBytes
[  5]   1.00-2.00   sec   230 MBytes  1.92 Gbits/sec    1    439 KBytes
[  5]   2.00-3.00   sec   228 MBytes  1.92 Gbits/sec    3    335 KBytes
[  5]   3.00-4.00   sec   204 MBytes  1.71 Gbits/sec    1    486 KBytes
[  5]   4.00-5.00   sec   201 MBytes  1.69 Gbits/sec  812    258 KBytes
[  5]   5.00-6.00   sec   179 MBytes  1.51 Gbits/sec    1    372 KBytes
[  5]   6.00-7.00   sec  50.5 MBytes   423 Mbits/sec    2    154 KBytes
[  5]   7.00-8.00   sec   194 MBytes  1.63 Gbits/sec  339    172 KBytes
[  5]   8.00-9.00   sec   156 MBytes  1.30 Gbits/sec  854    215 KBytes
[  5]   9.00-10.00  sec   143 MBytes  1.20 Gbits/sec  997   93.8 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.76 GBytes  1.51 Gbits/sec  3120             sender
[  5]   0.00-10.45  sec  1.76 GBytes  1.44 Gbits/sec                  receiver

2. Between a remote box <=> the NAS domU, through a 1 Gbps ethernet cable.
The link is roughly saturated when domU is the server, without an
obvious performance drop.
When domU runs as the client, the achieved bandwidth is ~30 Mbps lower
than the peak.
Retransmissions also show up in this scenario at times, more severely
when domU is the client.
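
Both scenarios above were plain iperf3 runs of roughly the following
form (the server address is just a placeholder; I swapped the server
and client roles to change direction):

    # on the machine acting as the server (domU, dom0, or the remote box)
    iperf3 -s

    # on the machine acting as the client; default 10-second TCP test
    iperf3 -c 192.168.1.10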

I cannot immediately test with the stock kernel, nor with your patch in
a release build.
But judging from the observed imbalance between the inbound and
outbound paths, I guess there is a non-trivial penalty?

> > BTW, an unrelated question:
> > What's the current status of HVM domU on top of a storage driver domain?
> > About 7 years ago, a user on the list was able to get this setup up
> > and running with your help (a patch).[1]
> > When I attempted to reproduce a similar setup two years later, I
> > discovered that the patch had not been submitted.
> > And even with that patch, the setup could not be reproduced successfully.
> > We spent some time debugging the problem together[2], but didn't
> > get to the root cause at that time.
> > In case it's still broken and you still have the interest and time, I
> > can start a separate thread on this topic and provide the required
> > testing environment.
>
> Yes, better as a new thread please.
>
> FWIW, I haven't looked at this in a long time, but I recall some
> fixes being needed in order to be able to use driver domains with HVM
> guests, which require attaching the disk to dom0 in order for the
> device model (QEMU) to access it.
>
> I would give it a try without using stubdomains and see what you get.
> You will need to run `xl devd` inside of the driver domain, so you
> will need to install xen-tools on the domU. There's an init script to
> launch `xl devd` at boot; it's called 'xendriverdomain'.
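For reference, my understanding of that setup is roughly the following
sketch (the driver-domain name 'storage' and the disk target path are
just placeholders, so please correct me if I got it wrong):

    # inside the driver domain, with xen-tools installed, enable the
    # 'xendriverdomain' init script so that `xl devd` runs at boot
    # (the exact invocation depends on the domU's init system), e.g.:
    service xendriverdomain start

    # in the HVM guest config on dom0, name the driver domain as the
    # disk backend instead of dom0:
    disk = [ 'backend=storage,vdev=xvda,format=raw,target=/dev/zvol/tank/guest-disk' ]
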
Looks like I'm unlucky once again. Let's follow up in a separate thread.

> Thanks, Roger.



 

