
Re: [Xen-devel] [PATCH] Kernel OOPS in xen_netbk_rx_action / xenvif_gop_skb



Hello Wei Liu,

On 27.06.2014 20:24, Philipp Hahn wrote:
> On 27.06.2014 19:48, Philipp Hahn wrote:
>> I guess we found the problem ourselves: For thus removed skb's the
>> reference counter on the associated vif was not decremented, as it is
>> normally done in two locations at the end of the function
>> xen_netbk_rx_action():
> ...
>> The test is currently running again for the weekend and on Monday we
>> will hopefully know more.
> 
> FYI: The test VM survived the first reboot without locking up:
> ...
> Jun 27 19:49:23 xenmbint05b01 kernel: [ 2055.898349] UniDEBUG vif->mapped is false

The host survived the weekend with the problematic VM rebooting every 5
minutes. The log shows the condition where the shared ring would be
accessed while unmapped, which is where the kernel crashed previously.

So the attached patch fixes the bug (or at least prevents the OOPS).
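
For readers without the attachment, here is a minimal user-space sketch of
the idea. It is not the patch itself: apart from the "mapped" flag, all
names (struct pkt, struct rx_queue, vif_get, vif_put, rx_enqueue,
rx_action) are hypothetical stand-ins, and the real code works on skb's
and the shared ring. It only shows the two points discussed above: packets
whose vif is no longer mapped are skipped, and the vif reference is still
dropped for them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the netback structures. */
struct vif {
    bool mapped;   /* true while the shared ring is mapped */
    int  refcnt;   /* one reference per queued packet */
};

struct pkt {
    struct vif *vif;
    struct pkt *next;
};

struct rx_queue {
    struct pkt *head;
    struct pkt *tail;
};

static void vif_get(struct vif *vif)
{
    vif->refcnt++;
}

static void vif_put(struct vif *vif)
{
    assert(vif->refcnt > 0);
    vif->refcnt--;
}

/* Queue a packet; the queue holds a reference on the packet's vif. */
static void rx_enqueue(struct rx_queue *q, struct pkt *p)
{
    vif_get(p->vif);
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}

/*
 * Drain the queue. Returns the number of packets that would have been
 * delivered and stores the number of skipped packets in *dropped.
 */
static int rx_action(struct rx_queue *q, int *dropped)
{
    int delivered = 0;
    struct pkt *p;

    *dropped = 0;
    while ((p = q->head) != NULL) {
        q->head = p->next;
        if (!q->head)
            q->tail = NULL;

        if (!p->vif->mapped) {
            /* Ring is gone: skip the packet, but still drop the
             * reference, otherwise the vif can never be released. */
            vif_put(p->vif);
            (*dropped)++;
            continue;
        }

        /* The real code would copy the packet into the shared ring here. */
        delivered++;
        vif_put(p->vif);
    }
    return delivered;
}
```

In this sketch, forgetting the vif_put() in the skip branch leaves refcnt
above zero for the unmapped vif, which corresponds to the lock-up we saw
before the reference counting was corrected.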

@Wei Liu: You said that the patch is only a quick hack to check whether
my analysis is correct, and that a proper fix would be needed. For us the
attached patch works. The problem does not happen that often and is hard
to reproduce anyway, so spending more time on that issue is probably not
worth it. And that flag doesn't look that ugly.

@stable: at least 3.10 has the bug, and other long-term-stable kernels
have it too. The code in current is different since multi-queue was
added, so the patch would not go into current.

Sincerely
Philipp Hahn

Attachment: 0001-xen-netback-skip-pending-packets-in-unmapped-ring.patch
Description: Text Data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

