
[Xen-devel] netback BUG_ON when using copy_skb=1

Hi Wei Liu,

I am running network performance tests on Xen 4.1.2 with kernel 3.0, and I 
occasionally hit a crash at BUG_ON(netbk->mmap_pages[idx] != page) in 
netbk_gop_frag().

After analyzing the drivers/xen/netback module, I think the cause is as follows 
when packets are sent from VM1 to VM2:
1) The two netback threads (the first handling VM1's transmit, the second 
handling VM2's receive) run concurrently.
2) The first netback thread does a delayed copy from a foreign granted page to 
local memory when outstanding packets have been pending too long (more than 
HZ/2 jiffies, i.e. half a second).
   netbk->mmap_pages[idx] is then replaced with the newly allocated page.
3) If the packets are forwarded to VM2 by the virtual switch, netbk_gop_frag() 
is called in the second netback thread.
   That function checks whether each page in the skb's frags[] is foreign in 
order to decide how to do the grant copy.
4) If the page is replaced after the foreign-page check in netbk_gop_frag(), 
the BUG_ON fires because the page from the skb's frags[] differs from 
mmap_pages[idx].
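
The interleaving in steps 1) to 4) can be replayed in a small user-space model. 
This is only a sketch of the check-then-use window, not the netback code: 
struct page, struct netbk, delayed_copy() and simulate_race() are hypothetical 
stand-ins, and the two threads are replaced by a flag that chooses whether the 
delayed copy lands inside the window.

```c
#include <assert.h>

#define MAX_PENDING 4

/* Hypothetical stand-ins; not the real kernel or netback definitions. */
struct page {
    int foreign;   /* set when the page is a granted page from another domain */
    int idx;       /* slot in mmap_pages[] that is expected to hold this page */
};

struct netbk {
    struct page *mmap_pages[MAX_PENDING];
};

/* Step 2: the transmit-side thread copies the data to a local page and
 * swaps the slot over to it. */
static void delayed_copy(struct netbk *nb, int idx, struct page *local)
{
    nb->mmap_pages[idx] = local;
}

/* Steps 3 and 4: the receive-side thread checks that the frag page is
 * foreign, then looks at its slot. Returns 1 if the BUG_ON condition
 * (mmap_pages[idx] != page) would be true, 0 otherwise.
 * copy_in_window selects whether the delayed copy lands between the
 * foreign check and the slot comparison. */
int simulate_race(int copy_in_window)
{
    struct page foreign = { .foreign = 1, .idx = 0 };
    struct page local = { .foreign = 0, .idx = 0 };
    struct netbk nb = { .mmap_pages = { &foreign } };
    struct page *frag_page = &foreign;  /* what skb frags[] still points at */

    if (!frag_page->foreign)
        return 0;                       /* local page: no slot lookup */

    int idx = frag_page->idx;
    assert(idx >= 0 && idx < MAX_PENDING);

    if (copy_in_window)
        delayed_copy(&nb, idx, &local); /* the other thread runs here */

    return nb.mmap_pages[idx] != frag_page;
}
```

In this model simulate_race(0) returns 0 and simulate_race(1) returns 1, so the 
condition only becomes true when the slot is swapped after the foreign check 
has already passed.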

I tried using a spin_lock to protect access to the page, but I did not find a 
workable solution.
How can this problem be fixed? Could you share your opinion?

In addition, I tried turning off copy_skb. In that case the vif netdevice may 
not be released after the VM is shut down, because outstanding packets hold a 
reference on the device for too long. I have not identified why; one 
possibility is that the NIC does not release the packets after DMA.
Has anyone else run into this problem? Thanks.
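
To illustrate why a single unreleased packet is enough to keep the device 
alive, here is a minimal reference-counting model. It assumes one reference 
for the interface itself plus one per in-flight packet; struct vif, vif_get(), 
vif_put() and vif_released_after_shutdown() are hypothetical names, not the 
netback API.

```c
#include <assert.h>

/* Hypothetical model of a vif that can only go away at refcount zero. */
struct vif {
    int refcnt;
};

static void vif_get(struct vif *v)
{
    v->refcnt++;
}

static void vif_put(struct vif *v)
{
    assert(v->refcnt > 0);
    v->refcnt--;
}

/* Returns 1 if the device can be released after shutdown, 0 if packets
 * that were never freed still pin it.
 * in_flight: packets handed to the NIC, each holding one reference.
 * completed: how many of those packets were freed again. */
int vif_released_after_shutdown(int in_flight, int completed)
{
    struct vif v = { .refcnt = 1 };  /* reference held by the interface itself */

    assert(completed >= 0 && completed <= in_flight);

    for (int i = 0; i < in_flight; i++)
        vif_get(&v);                 /* each outstanding packet takes a reference */
    for (int i = 0; i < completed; i++)
        vif_put(&v);                 /* a freed packet drops its reference */

    vif_put(&v);                     /* shutdown drops the interface's own reference */
    return v.refcnt == 0;
}
```

With three packets in flight, the model reports the device as releasable only 
when all three have been freed; if one is still outstanding the count stays 
above zero.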

Best regards,

Xen-devel mailing list


