Re: [Xen-devel] netback BUG_ON when using copy_skb=1
>>> On 17.10.13 at 12:26, jerry <jerry.lilijun@xxxxxxxxxx> wrote:
> Hi Jan,
Please don't top-post.
> In my test, the grant table copy error may cause the VM to crash.
> The stack is as follows:
> kernel BUG at /linux/driver/redhat6.2/xen-vnif/xen-netfront.c:372!
> ...
> The BUG code in xen-netfront.c xennet_tx_buf_gc() is:
>     if (unlikely(gnttab_query_foreign_access(
>             np->grant_tx_ref[id]) != 0)) {
>         printk(KERN_ALERT "xennet_tx_buf_gc: warning "
>                "-- grant still in use by backend "
>                "domain.\n");
>         BUG();
>     }
>
> My guess is that the reason is as follows:
> 1) XEN: The function _set_status(), called in the hypercall path via
> __gnttab_copy() and __acquire_grant_for_copy(), fails and the grant access is
> not ended, so the GTF_reading bit is never cleared.
> 2) Netfront: this module hits the BUG when it finds that the GTF_reading bit
> is still set.
If that were the case, this would be a hypervisor bug: a grant copy
operation is supposed to hold the grant active only for as long as
the copy operation takes. In particular, you'll notice that
__acquire_grant_for_copy() in its error path clears GTF_reading
(and GTF_writing, as appropriate) again. You'd likely need to
instrument the code to demonstrate (via a couple of extra log
messages) what you think is not working properly here.
Jan
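
To make the expected behaviour concrete, here is a minimal user-space
sketch of the invariant described above. It is not hypervisor code: the
model_* names, the struct and the pin counter are hypothetical
simplifications, and only the two GTF_* values are meant to mirror
Xen's public grant_table.h (worth double-checking against your tree).
The fprintf() calls stand in for the kind of extra log messages one
might add when instrumenting the real paths.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed to match Xen's public grant_table.h; everything else is a model. */
#define GTF_reading (1U << 3)
#define GTF_writing (1U << 4)

/* Hypothetical stand-in for one grant entry's shared status word. */
struct model_grant {
    uint16_t flags;
    unsigned int pins;  /* number of copies currently holding the grant */
};

/* Mark the grant as in use for the duration of one copy. */
static void model_acquire(struct model_grant *g, int readonly)
{
    g->flags |= GTF_reading;
    if (!readonly)
        g->flags |= GTF_writing;
    g->pins++;
}

/* Drop one pin; clear the in-use bits once nobody holds the grant. */
static void model_release(struct model_grant *g)
{
    assert(g->pins > 0);
    if (--g->pins == 0)
        g->flags &= (uint16_t)~(GTF_reading | GTF_writing);
}

/*
 * Model of one copy. 'fail' forces the error path. Both paths end in
 * model_release(), so the in-use bits never outlive the copy.
 */
static int model_copy(struct model_grant *g, int readonly, int fail)
{
    int rc;

    model_acquire(g, readonly);
    fprintf(stderr, "copy: acquired, flags=%#x\n", (unsigned int)g->flags);
    rc = fail ? -1 : 0;
    model_release(g);
    fprintf(stderr, "copy: released, rc=%d flags=%#x\n",
            rc, (unsigned int)g->flags);
    return rc;
}

/* Roughly what the frontend check tests: is an in-use bit still set? */
static int model_still_in_use(const struct model_grant *g)
{
    return (g->flags & (GTF_reading | GTF_writing)) != 0;
}
```

In this model, model_still_in_use() can only return non-zero while a
copy is in flight. If equivalent log lines in the real code showed a
grant acquired without a matching release, that would support the
hypothesis above.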
> On 2013/10/17 16:00, Jan Beulich wrote:
>>>>> On 17.10.13 at 09:41, jerry <jerry.lilijun@xxxxxxxxxx> wrote:
>>> But there may still be concurrency problems in my test.
>>> If the page replacement in copy_pending_req() happens after
>>> netif_get_page_ext() in netbk_gop_frag(), copy_gop->flags is wrongly marked
>>> with GNTCOPY_source_gref.
>>> At that point the memory of that page in the skb has been replaced with
>>> Dom0 local memory, so the later HYPERVISOR_multicall() with GNTTABOP_copy
>>> in netbk_rx_actions() will fail.
>>> The message shown is:
>>>
>>> (XEN) grant_table.c:305:d0 Bad flags (0) or dom (0). (expected dom 0)
>>>
>>> Would you like to share some opinions?
>>
>> At a first glance that seems possible, but the question is - does it
>> cause any problems other than the quoted message to be issued
>> (and the problematic packet getting re-transmitted)? I'm asking
>> mainly because fixing this would appear to imply adding locking to
>> these paths - with the risk of adversely affecting performance.
>>
>> Jan
>>
>>
>>
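
The ordering problem discussed in the quoted exchange can be sketched as
a check-then-use race. This is a single-threaded model with the
interleaving made explicit, not netback code: the model_* names and
structs are hypothetical, and only the GNTCOPY_source_gref value is
meant to mirror Xen's public grant_table.h (an assumption worth
verifying).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match Xen's public grant_table.h; the rest is a model. */
#define GNTCOPY_source_gref (1U << 0)

/* Hypothetical page descriptor: foreign (granted) or Dom0-local. */
struct model_page {
    bool foreign;
};

/* Hypothetical copy descriptor as built for the hypercall. */
struct model_copy_op {
    unsigned int flags;
};

/* Step A: choose how to address the source from the page as seen now. */
static void model_build_op(const struct model_page *pg,
                           struct model_copy_op *op)
{
    op->flags = pg->foreign ? GNTCOPY_source_gref : 0;
}

/* Step B: swap the foreign page for local memory. */
static void model_replace_page(struct model_page *pg)
{
    pg->foreign = false;
}

/* Step C: the op is usable only if its flag still matches the page. */
static bool model_op_consistent(const struct model_page *pg,
                                const struct model_copy_op *op)
{
    bool by_gref = (op->flags & GNTCOPY_source_gref) != 0;

    return by_gref == pg->foreign;
}

/* Run A and C with B either before A or squeezed in between them. */
static bool model_run(bool replace_between)
{
    struct model_page pg = { .foreign = true };
    struct model_copy_op op;

    if (!replace_between)
        model_replace_page(&pg);          /* B completes before A */
    model_build_op(&pg, &op);             /* A */
    if (replace_between)
        model_replace_page(&pg);          /* B lands between A and C */
    return model_op_consistent(&pg, &op); /* C */
}
```

Here model_run(true) reproduces the stale-flag case and model_run(false)
does not. A lock held from step A through submission would rule out the
first ordering, which is where the performance concern comes from.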
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel