Re: [Xen-devel] NFS related netback hang



On Thu, Apr 11, 2013 at 02:55:48PM +0100, G.R. wrote:
> Hi,
> I've been suffering from a strange NFS-related network issue for a while.
> 
> The issue shows up when copying from dom0 to domU through an NFS mount.
> After a short while, the transfer suddenly freezes and the domU
> network stops responding entirely. Force mounting the NFS mount
> generally resolves the freeze, but sometimes you are out of luck
> and the trick does not work.
> 
> Luckily, I captured the following log in a recent instance. It
> appears to be a deadlock when netback tries to get some free
> pages from NFS. I'm not sure if this is the whole story. Any
> suggestions on how to solve the issue?
> 

BTW, xen_netbk_alloc_page allocates its page from the generic page pool.
It is not specific to NFS.

> Thanks,
> Timothy
> 
> Apr 11 21:22:27 gaia kernel: [429242.015643] INFO: task netback/0:2255
> blocked for more than 120 seconds.
> Apr 11 21:22:27 gaia kernel: [429242.015665] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Apr 11 21:22:27 gaia kernel: [429242.015690] netback/0       D
> ffff880210213900     0  2255      2 0x00000000
> Apr 11 21:22:27 gaia kernel: [429242.015693]  ffff8801fee04ea0
> 0000000000000246 0000000000000000 ffffffff818133f0
> Apr 11 21:22:27 gaia kernel: [429242.015697]  0000000000013900
> ffff8801fed87fd8 ffff8801fed87fd8 ffff8801fee04ea0
> Apr 11 21:22:27 gaia kernel: [429242.015700]  ffff8801fed87488
> ffff880210213900 ffff8801fee04ea0 ffff8801fed87488
> Apr 11 21:22:27 gaia kernel: [429242.015703] Call Trace:
> Apr 11 21:22:27 gaia kernel: [429242.015711]  [<ffffffff810c1bb5>] ?
> __lock_page+0x66/0x66
> Apr 11 21:22:27 gaia kernel: [429242.015715]  [<ffffffff814d06cb>] ?
> io_schedule+0x55/0x6b
> Apr 11 21:22:27 gaia kernel: [429242.015718]  [<ffffffff810c1bbc>] ?
> sleep_on_page+0x7/0xc
> Apr 11 21:22:27 gaia kernel: [429242.015720]  [<ffffffff814cf6c0>] ?
> __wait_on_bit_lock+0x3c/0x85
> Apr 11 21:22:27 gaia kernel: [429242.015723]  [<ffffffff810c3f7a>] ?
> find_get_pages+0xea/0x100
> Apr 11 21:22:27 gaia kernel: [429242.015726]  [<ffffffff810c1bb0>] ?
> __lock_page+0x61/0x66
> Apr 11 21:22:27 gaia kernel: [429242.015729]  [<ffffffff81058364>] ?
> autoremove_wake_function+0x2a/0x2a
> Apr 11 21:22:27 gaia kernel: [429242.015732]  [<ffffffff810cd110>] ?
> truncate_inode_pages_range+0x28b/0x2f8
> Apr 11 21:22:27 gaia kernel: [429242.015737]  [<ffffffff811c91d2>] ?
> nfs_evict_inode+0x12/0x23
> Apr 11 21:22:27 gaia kernel: [429242.015740]  [<ffffffff8111cdae>] ?
> evict+0xa3/0x153
> Apr 11 21:22:27 gaia kernel: [429242.015743]  [<ffffffff8111ce85>] ?
> dispose_list+0x27/0x31
> Apr 11 21:22:27 gaia kernel: [429242.015746]  [<ffffffff8111db6b>] ?
> evict_inodes+0xe7/0xf4
> Apr 11 21:22:27 gaia kernel: [429242.015749]  [<ffffffff8110b3af>] ?
> generic_shutdown_super+0x3e/0xc5
> Apr 11 21:22:27 gaia kernel: [429242.015752]  [<ffffffff8110b49e>] ?
> kill_anon_super+0x9/0x11
> Apr 11 21:22:27 gaia kernel: [429242.015755]  [<ffffffff811ca7b0>] ?
> nfs_kill_super+0xd/0x16
> Apr 11 21:22:27 gaia kernel: [429242.015758]  [<ffffffff8110b717>] ?
> deactivate_locked_super+0x2c/0x5c
> Apr 11 21:22:27 gaia kernel: [429242.015761]  [<ffffffff811c901d>] ?
> __put_nfs_open_context+0xbf/0xe1
> Apr 11 21:22:27 gaia kernel: [429242.015764]  [<ffffffff811d07db>] ?
> nfs_commitdata_release+0x10/0x19
> Apr 11 21:22:27 gaia kernel: [429242.015766]  [<ffffffff811d0f8c>] ?
> nfs_initiate_commit+0xd9/0xe4
> Apr 11 21:22:27 gaia kernel: [429242.015769]  [<ffffffff811d1bae>] ?
> nfs_commit_inode+0x81/0x111
> Apr 11 21:22:27 gaia kernel: [429242.015772]  [<ffffffff811c86f4>] ?
> nfs_release_page+0x40/0x4f
> Apr 11 21:22:27 gaia kernel: [429242.015775]  [<ffffffff810d0940>] ?
> shrink_page_list+0x4f5/0x6d8
> Apr 11 21:22:27 gaia kernel: [429242.015780]  [<ffffffff810d0f03>] ?
> shrink_inactive_list+0x1dd/0x33f
> Apr 11 21:22:27 gaia kernel: [429242.015783]  [<ffffffff810d15fa>] ?
> shrink_lruvec+0x2e0/0x44d
> Apr 11 21:22:27 gaia kernel: [429242.015787]  [<ffffffff810d17ba>] ?
> shrink_zone+0x53/0x8a
> Apr 11 21:22:27 gaia kernel: [429242.015790]  [<ffffffff810d1bcd>] ?
> do_try_to_free_pages+0x1c6/0x3f4
> Apr 11 21:22:27 gaia kernel: [429242.015794]  [<ffffffff810d20a3>] ?
> try_to_free_pages+0xc4/0x11e
> Apr 11 21:22:27 gaia kernel: [429242.015797]  [<ffffffff810c9018>] ?
> __alloc_pages_nodemask+0x440/0x72f
> Apr 11 21:22:27 gaia kernel: [429242.015801]  [<ffffffff810f592d>] ?
> alloc_pages_current+0xb2/0xcd

Judging from the stack trace above, the page allocation fell into direct
reclaim (try_to_free_pages) and the system is trying to squeeze some
memory out of NFS. Is your system perhaps simply running out of memory?
If so, NFS then failed to commit its changes to disk for some reason and
hung.
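
In case it helps with reading traces like this one, here is a rough Python
sketch that rejoins the mail-wrapped log lines and pulls out the function
names, so you can check quickly whether a hung task went through direct
reclaim into NFS. It is not part of any Xen or kernel tooling; the helper
names frames_from_log and summarize are made up for illustration, and the
sample is just a few frames copied from the log above.

```python
import re

# One stack frame, e.g. "[<ffffffff810d20a3>] ? try_to_free_pages+0xc4/0x11e".
# The "?" marker is optional; timestamps like "[429242.015794]" do not match.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# A few frames copied from the quoted log (mail quoting and wrapping kept).
SAMPLE = """\
> Apr 11 21:22:27 gaia kernel: [429242.015726]  [<ffffffff810c1bb0>] ?
> __lock_page+0x61/0x66
> Apr 11 21:22:27 gaia kernel: [429242.015737]  [<ffffffff811c91d2>] ?
> nfs_evict_inode+0x12/0x23
> Apr 11 21:22:27 gaia kernel: [429242.015772]  [<ffffffff811c86f4>] ?
> nfs_release_page+0x40/0x4f
> Apr 11 21:22:27 gaia kernel: [429242.015794]  [<ffffffff810d20a3>] ?
> try_to_free_pages+0xc4/0x11e
"""


def frames_from_log(text):
    """Return the function names of a call trace, innermost frame first."""
    # Drop the "> " mail quoting, then rejoin lines the mail client wrapped.
    lines = [re.sub(r"^>\s?", "", line) for line in text.splitlines()]
    joined = " ".join(line.strip() for line in lines)
    return FRAME_RE.findall(joined)


def summarize(frames):
    """Flag the three things that matter for this kind of hang."""
    return {
        # The allocation had to reclaim memory itself.
        "direct_reclaim": "try_to_free_pages" in frames,
        # Reclaim called into NFS code.
        "nfs_frames": [f for f in frames if "nfs" in f],
        # The task ended up sleeping on a page lock.
        "waiting_on_page_lock": "__lock_page" in frames,
    }


if __name__ == "__main__":
    frames = frames_from_log(SAMPLE)
    for name, value in summarize(frames).items():
        print(name, "=", value)
```

If all three flags come back set, as they would for the trace above, the
task is waiting on a page lock from inside NFS code that was entered via
memory reclaim, which fits the memory-pressure theory.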


Wei.
