
Re: [Xen-devel] NFS related netback hang

On Fri, Apr 12, 2013 at 5:27 PM, Wei Liu <wei.liu2@xxxxxxxxxx> wrote:
> On Fri, Apr 12, 2013 at 04:10:34AM +0100, G.R. wrote:
>> Yes, it's not specific to NFS pages; I was just unlucky enough to hit it.
>> I agree with your suspicion: the chance depends on the memory pressure in dom0.
>> So here is a proper setup to reproduce the issue:
>> 1. dom0 with SWAP disabled and with limited memory allocated.
>> 2. domU serves storage and exports NFS
>> 3. dom0 mounts the domU storage and writes to it.
>> 4. You need to achieve high speed to expose this issue.
> My setup is almost the same. The write speed is around
> 35-45MB/s if I do:
> dd if=/dev/zero of=/mnt/t bs=1 count=200
> However if I do count=2000, the speed slows down to 24MB/s. I suspect
> that's the memory pressure in Dom0 - my Dom0 only has 1024MB RAM. But I
> still didn't see any error.
That's weird. The stack trace proves that the issue exists, and the
issue also holds in theory.
But why is it common in my build and not reproducible in yours?
There must be some factor that we have missed here.
Is there any kernel config option that affects memory management
behavior in dom0? What's your dom0 kernel version?
Is there anything in the NFS config that could matter?
Do you enable memory ballooning for dom0? I do, but does it matter?

I still believe the key factor is stressing the memory.
Maybe you can try to limit the memory size further and use a larger file.
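
A rough sketch of such a write test, as a Python stand-in for dd that
also reports throughput per window, so a slowdown under memory pressure
shows up while the file grows. The name write_zeros and its parameters
are made up for illustration; /mnt/t is the path from the dd example
quoted above.

```python
import os
import time


def write_zeros(path, total_bytes, chunk_bytes=1 << 20, report_every=64):
    """Write total_bytes of zeros to path with ordinary buffered writes.

    Returns a list of (bytes_written_so_far, MB_per_s_in_last_window).
    Hypothetical helper: no O_SYNC / O_DIRECT, like a normal file copy.
    """
    chunk = bytes(chunk_bytes)  # one reusable buffer of zeros
    samples = []
    written = 0
    chunks_done = 0
    window_bytes = 0
    window_start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
            window_bytes += n
            chunks_done += 1
            # Record one sample per window, plus one at the very end.
            if chunks_done % report_every == 0 or written == total_bytes:
                elapsed = max(time.monotonic() - window_start, 1e-9)
                samples.append((written, window_bytes / elapsed / 1e6))
                window_bytes = 0
                window_start = time.monotonic()
    return samples


# Example (assumed mount point, 4 GiB file):
#   for done, rate in write_zeros("/mnt/t", 4 << 30):
#       print(f"{done >> 20} MiB written, {rate:.1f} MB/s")
```

The file size is a parameter so the same run can be repeated with less
dom0 memory and a larger file.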

I am no longer sure how the transfer speed affects this.
I can achieve 10GB/s in an iperf test without issue.
And FTP transfer also works without problem at 50MB/s.
But maybe the higher network speed is a negative factor here -- NFS may
be able to commit changes faster.
Perhaps we should feed data faster than NFS can handle, so that memory
is used up quickly?
But the back pressure from downstream should slow the rate at which
upstream eats memory.
How does the throttling work? Is there any way to control it?
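
One place to look, as an assumption rather than a confirmed cause: on
Linux, buffered writers are throttled by the dirty page writeback limits
under /proc/sys/vm. A small sketch that dumps them follows. The name
read_vm_dirty_settings is made up for illustration, and files missing on
a given kernel are skipped.

```python
import os

# Writeback-related sysctls exposed under /proc/sys/vm on Linux.
DIRTY_KNOBS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_background_bytes",
    "dirty_bytes",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_dirty_settings(base="/proc/sys/vm"):
    """Return {knob_name: int_value} for the knobs that can be read."""
    settings = {}
    for name in DIRTY_KNOBS:
        try:
            with open(os.path.join(base, name)) as f:
                settings[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Knob not present or not readable here: skip it.
            continue
    return settings


# Example:
#   for name, value in read_vm_dirty_settings().items():
#       print(f"vm.{name} = {value}")
```

Lower dirty_ratio or dirty_bytes values make a writing process block
sooner. Whether that changes the behavior seen here is untested.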

I'll check why my dom0 reported OOM; maybe that's one factor too.
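
To watch dom0 memory while the transfer runs, something like the sketch
below could be used. It assumes the standard Linux /proc/meminfo layout;
parse_meminfo and read_meminfo are illustrative names, not existing
tools.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts. Lines without a "name: number" shape are ignored.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())


# Example: sample once per second during the copy.
#   import time
#   while True:
#       m = read_meminfo()
#       print(m.get("MemFree"), m.get("Dirty"), m.get("Writeback"))
#       time.sleep(1)
```

MemFree dropping while Dirty and Writeback keep growing would point at
pages piling up faster than they are written out.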


>> In my case, domU owns a dedicated SATA controller so there is no
>> blkback overhead. Not sure if this is an important factor in achieving
>> high speed.
>> And the transfer is a normal file copy instead of O_SYNC / O_DIRECT
>> access, so the data can be cached on the client side for a short period.
>> Finally, the transfer speed and memory size are crucial.
>> With 4GB of memory allocated to dom0, I can copy a file (> 2GB) from a
>> USB2 port without problem at about 32MB/s.
>> But using a USB3 port, the same file generally gets stuck at 1.2GB. And
>> the 'dd if=/dev/zero' gets stuck even quicker.
>> With around 1~2GB of memory for dom0, the freeze happens much earlier, but
>> I did not check the exact time.
> And I also use iperf, which can achieve 7GB/s transfer between Dom0 and
> DomU, presumably that's fast enough?
>> I'm on a custom build of xen 4.2.1 testing release (built around Jan
>> this year?), with some patches related to graphics pass-through. But I
>> guess the patches are not relevant.
>> The dom0 kernel is version 3.6.11, 64 bit version.
>> One thing I forgot to mention is a possible sign of a memory leak.
>> I'm not very sure about it, but my dom0 reported OOM several days ago.
>> And typically I don't use dom0 for any purpose other than serving
>> backends. The allocated memory should be around 2GB and that should be
>> plenty for a dom0.
>> Are there any known leak bugs out there?
> Not that I know of, page allocation / deallocation in netback is quite
> simple.
> Wei.
>> Thanks,
>> Timothy
