
Re: [Xen-devel] [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8

To: Jan Beulich <JBeulich@xxxxxxxx>
From: Dion Kant <g.w.kant@xxxxxxxxxx>
Date: Fri, 05 Jul 2013 21:46:25 +0200
Cc: Wei Liu <wei.liu2@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxx
Delivery-date: Fri, 05 Jul 2013 19:47:00 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 07/05/2013 12:56 PM, Jan Beulich wrote:
>>>> On 05.07.13 at 12:40, Dion Kant <g.w.kant@xxxxxxxxxx> wrote:
>> On 07/04/2013 05:01 PM, Wei Liu wrote:
>>> --- a/drivers/xen/netfront/netfront.c
>>> +++ b/drivers/xen/netfront/netfront.c
>>> @@ -1306,6 +1306,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
>>>         struct sk_buff *nskb;
>>>
>>>         while ((nskb = __skb_dequeue(list))) {
>>> +               BUG_ON(nr_frags >= MAX_SKB_FRAGS);
>>>                 struct netif_rx_response *rx =
>>>                         RING_GET_RESPONSE(&np->rx, ++cons);
>>>
>>
>> Integrated the patch. I obtained a crash dump and the log in it did not
>> show this BUG_ON. Here is the relevant section from the log
>>
>> var/lib/xen/dump/domUA # crash /root/vmlinux-p1 2013-0705-1347.43-domUA.1.core
>>
>> [    7.670132] Adding 4192252k swap on /dev/xvda1.  Priority:-1 extents:1 across:4192252k SS
>> [   10.204340] NET: Registered protocol family 17
>> [  481.534979] netfront: Too many frags
>> [  487.543946] netfront: Too many frags
>> [  491.049458] netfront: Too many frags
>> [  491.491153] ------------[ cut here ]------------
>> [  491.491628] kernel BUG at drivers/xen/netfront/netfront.c:1295!
> 
> So if not the BUG_ON() from the patch above, what else does that
> line have in your sources?

Nothing else, but thanks for pointing this out to me.

I have now obtained results with your patch applied. The relevant part
of my sources reads:

1285 static RING_IDX xennet_fill_frags(struct netfront_info *np,
1286                                   struct sk_buff *skb,
1287                                   struct sk_buff_head *list)
1288 {
1289         struct skb_shared_info *shinfo = skb_shinfo(skb);
1290         int nr_frags = shinfo->nr_frags;
1291         RING_IDX cons = np->rx.rsp_cons;
1292         struct sk_buff *nskb;
1293
1294         while ((nskb = __skb_dequeue(list))) {
1295                 struct netif_rx_response *rx =
1296                         RING_GET_RESPONSE(&np->rx, ++cons);
1297
1298
1299                 if (nr_frags == MAX_SKB_FRAGS) {
1300                         unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
1301
1302                         BUG_ON(pull_to <= skb_headlen(skb));
1303                         __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
1304                         nr_frags = shinfo->nr_frags;
1305                 }
1306                 BUG_ON(nr_frags >= MAX_SKB_FRAGS);
1307
1308                 __skb_fill_page_desc(skb, nr_frags,
1309                                      skb_frag_page(skb_shinfo(nskb)->frags),
1310                                      rx->offset, rx->status);

Can I conclude that nr_frags == MAX_SKB_FRAGS and pull_to <=
skb_headlen(skb), so that the BUG_ON at line 1302 fires and the kernel
panics before the BUG_ON at line 1306 is reached?
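
To make the question concrete, here is a minimal userspace sketch of the
check sequence in lines 1299-1306. It is only a model, not the driver
code: the function name, the enum and the frags_after_pull parameter
(standing in for shinfo->nr_frags after __pskb_pull_tail) are
hypothetical, and the value 17 for MAX_SKB_FRAGS is an assumption that
depends on page size and kernel configuration.

```c
#include <assert.h>

/* Assumption: stand-in for the kernel's MAX_SKB_FRAGS (config dependent). */
#define MODEL_MAX_SKB_FRAGS 17u

/* Hypothetical labels for what the patched loop body would do. */
enum model_outcome {
    OUTCOME_FILL_OK,        /* __skb_fill_page_desc() would be reached */
    OUTCOME_BUG_LINE_1302,  /* BUG_ON(pull_to <= skb_headlen(skb)) fires */
    OUTCOME_BUG_LINE_1306   /* BUG_ON(nr_frags >= MAX_SKB_FRAGS) fires */
};

/*
 * Model of the checks at lines 1299-1306.
 * frags_after_pull is the fragment count we pretend the skb has after
 * __pskb_pull_tail(); it is only consulted when the pull happens.
 */
static enum model_outcome
model_fill_frags_checks(unsigned int nr_frags, unsigned int pull_to,
                        unsigned int headlen, unsigned int frags_after_pull)
{
    if (nr_frags == MODEL_MAX_SKB_FRAGS) {
        /* Line 1302: nothing left to pull into the linear area. */
        if (pull_to <= headlen)
            return OUTCOME_BUG_LINE_1302;

        /* Pulling data into the head never adds fragments. */
        assert(frags_after_pull <= nr_frags);
        nr_frags = frags_after_pull;    /* line 1304 */
    }

    /* Line 1306: only reached if line 1302 did not fire. */
    if (nr_frags >= MODEL_MAX_SKB_FRAGS)
        return OUTCOME_BUG_LINE_1306;

    return OUTCOME_FILL_OK;
}
```

In this model, a report pointing at line 1302 can only come from the
first branch, i.e. nr_frags equal to the limit and pull_to not larger
than the head length; line 1306 is never evaluated in that case.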

> [  717.568040] netfront: Too many frags
> [  723.768738] ------------[ cut here ]------------
> [  723.769226] kernel BUG at drivers/xen/netfront/netfront.c:1302!
> [  723.769724] invalid opcode: 0000 [#1] SMP 
> [  723.770203] Modules linked in: af_packet autofs4 xennet xenblk cdrom
> [  723.770697] CPU 0 
> [  723.770710] Pid: 1309, comm: sshd Not tainted 3.7.10-1.16-dbg-jbp1-xen #9  
> [  723.771667] RIP: e030:[<ffffffffa0023b17>]  [<ffffffffa0023b17>] netif_poll+0xe77/0xf70 [xennet]
> [  723.772057] RSP: e02b:ffff8800fb403c60  EFLAGS: 00010213
> [  723.772057] RAX: 00000000ffff0986 RBX: 0000000000049b7d RCX: ffff8800f78a5428
> [  723.772057] RDX: 000000000000007d RSI: 0000000000000042 RDI: ffff8800f98026c0
> [  723.772057] RBP: ffff8800fb403e20 R08: 0000000000000001 R09: 0000000000000000
> [  723.772057] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800f79709c0
> [  723.772057] R13: 0000000000000011 R14: ffff8800f866bc00 R15: ffff8800f85a5800
> [  723.772057] FS:  00007fc8f97e87c0(0000) GS:ffff8800fb400000(0000) knlGS:0000000000000000
> [  723.772057] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  723.772057] CR2: 00007fc8f3aafff0 CR3: 00000000f8a82000 CR4: 0000000000002660
> [  723.772057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  723.772057] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  723.772057] Process sshd (pid: 1309, threadinfo ffff8800f88f4000, task ffff8800f8279f80)
> [  723.772057] Stack:
> [  723.772057]  ffff8800fb403d60 ffff8800f7970a40 ffff8800f7970000 0000004000000002
> [  723.772057]  0000000000049b7e ffff8800fb410570 ffff8800f7970a78 0000000000000012
> [  723.772057]  ffff8800f7971fb0 0000001200000000 ffff8800f8b8d2c0 ffff8800fb403d50
> [  723.772057] Call Trace:
> [  723.772057]  [<ffffffff8041ee35>] net_rx_action+0xd5/0x250
> [  723.772057]  [<ffffffff800376d8>] __do_softirq+0xe8/0x230
> [  723.772057]  [<ffffffff8051151c>] call_softirq+0x1c/0x30
> [  723.772057]  [<ffffffff80008a75>] do_softirq+0x75/0xd0
> [  723.772057]  [<ffffffff800379f5>] irq_exit+0xb5/0xc0
> [  723.772057]  [<ffffffff8036c225>] evtchn_do_upcall+0x295/0x2d0
> [  723.772057]  [<ffffffff8051114e>] do_hypervisor_callback+0x1e/0x30
> [  723.772057]  [<00007fc8f8c5529b>] 0x7fc8f8c5529a
> [  723.772057] Code: 85 7c fe ff ff ea ff ff ff e9 69 f4 ff ff ba 12 00 00 00 48 01 d0 48 39 c1 0f 82 b2 fc ff ff e9 e0 fe ff ff ba 08 00 00 00 eb e8 <0f> 0b 0f b7 33 48 c7 c7 70 6c 02 a0 31 c0 e8 40 7a 4d e0 c7 85
> [  723.772057] RIP  [<ffffffffa0023b17>] netif_poll+0xe77/0xf70 [xennet]
> [  723.772057]  RSP <ffff8800fb403c60>
> [  723.790939] ---[ end trace 846551a77e015655 ]---
> [  723.791847] Kernel panic - not syncing: Fatal exception in interrupt
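
As a cross-check that this trap really is a BUG_ON() and not some other
fault: on x86 the kernel's BUG() is a ud2 instruction (bytes 0f 0b),
which matches the "invalid opcode" line, and the byte in angle brackets
on the "Code:" line marks where RIP points. The helper below is a small
userspace sketch for checking that by hand; its name is hypothetical and
it assumes the "Code:" text has been joined into a single string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if the byte marked <xx> in an oops "Code:" string, together
 * with the byte that follows it, forms ud2 (0f 0b); returns 0 otherwise,
 * including when no well-formed marker is present.
 */
static int faulting_insn_is_ud2(const char *code)
{
    assert(code != NULL);

    /* The marked byte looks like "<0f>". */
    const char *mark = strchr(code, '<');
    if (mark == NULL)
        return 0;

    char *end;
    unsigned long first = strtoul(mark + 1, &end, 16);
    if (end == mark + 1 || *end != '>')
        return 0;   /* no hex digits, or marker not closed */

    /* The next byte follows the closing '>' after whitespace. */
    const char *next = end + 1;
    unsigned long second = strtoul(next, &end, 16);
    if (end == next)
        return 0;   /* nothing parsable after the marker */

    return first == 0x0f && second == 0x0b;
}
```

Passing the bytes around the marker from the log above, for example
"eb e8 <0f> 0b 0f b7 33", should make the helper return 1.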


Dion.


