
Re: [Xen-devel] Dom0 crash with apache bench (ab)



On 31/07/15 11:24, Stefano Stabellini wrote:
> This is a Linux Dom0 crash on x86 (Dell PowerEdge R320, Xeon E5-2450),
> CC'ing relevant people. As you can see from the links below the crash
> is:
> 
> [ 253.619326] Call Trace:
> [ 253.619330] <IRQ>
> [ 253.619332] [<ffffffff815d7c25>] ? skb_copy_ubufs+0xa5/0x230
> [ 253.619347] [<ffffffff815e8525>] __netif_receive_skb_core+0x6f5/0x940
> [ 253.619353] [<ffffffff815e8788>] __netif_receive_skb+0x18/0x60
> [ 253.619360] [<ffffffff815e87f8>] netif_receive_skb_internal+0x28/0x90
> [ 253.619366] [<ffffffff815e91f5>] napi_gro_frags+0x125/0x1a0
> [ 253.619378] [<ffffffffa01b1173>] mlx4_en_process_rx_cq+0x753/0xb50 [mlx4_en]
> [ 253.619387] [<ffffffffa01b1657>] mlx4_en_poll_rx_cq+0x97/0x160 [mlx4_en]

What makes you think this is Xen specific?  I suggest raising this with
the mlx4 maintainers.

David

> [ 253.619393] [<ffffffff815e8bcd>] net_rx_action+0x13d/0x2f0
> [ 253.619400] [<ffffffff8109fdea>] __do_softirq+0xda/0x1f0
> [ 253.619406] [<ffffffff810a013d>] irq_exit+0x9d/0xb0
> [ 253.619412] [<ffffffff813e3825>] xen_evtchn_do_upcall+0x35/0x50
> [ 253.619420] [<ffffffff816c7bce>] xen_do_hypervisor_callback+0x1e/0x40
> [ 253.619423] <EOI>
> [ 253.619426] [<ffffffff811a7870>] ? shrink_dcache_for_umount+0x90/0x90
> [ 253.619437] [<ffffffff811a7ad9>] ? d_alloc_pseudo+0x9/0x10
> [ 253.619443] [<ffffffff815cbbed>] ? sock_alloc_file+0x4d/0x120
> [ 253.619448] [<ffffffff815cdf78>] ? SYSC_accept4+0xb8/0x200
> [ 253.619454] [<ffffffff811d0377>] ? SyS_epoll_wait+0x87/0xe0
> [ 253.619459] [<ffffffff815cf5c9>] ? SyS_accept4+0x9/0x10
> [ 253.619465] [<ffffffff816c630d>] ? system_call_fastpath+0x16/0x1b
> [ 253.619469] Code: 4e 48 83 c4 08 5b 5d c3 66 0f 1f 44 00 00 e8 6b fc ff ff eb e1 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c
> [ 253.619513] RIP [<ffffffff81318b0d>] __memcpy+0xd/0x110
> [ 253.619520] RSP <ffff88006b823c60>
> [ 253.619524] ---[ end trace ba5d35a466b03856 ]---
> 
> On Tue, 28 Jul 2015, Christoffer Dall wrote:
>> On Tue, Jul 28, 2015 at 4:55 PM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
>>       On Tue, 2015-07-28 at 10:50 -0400, Konrad Rzeszutek Wilk wrote:
>>       > On Tue, Jul 28, 2015 at 03:09:31PM +0200, Christoffer Dall wrote:
>>       > > Hi,
>>       > >
>>       > > I've been doing some performance comparisons lately, and wanted to
>>       > > compare the performance overhead of using Xen with apache bench,
>>       > > but unfortunately the Dom0 kernel crashes when hitting it with ab
>>       > > from a remote machine. Most other workloads seem to be stable;
>>       > > however, I do see similar crashes if hitting Dom0 mysql with a
>>       > > mysql benchmark with a high level of parallelism.
>>       > >
>>       > > I use a 10G Mellanox MX354A Dual port FDR CX3 adapter for
>>       > > networking on a Dell PowerEdge R320 system with a Xeon E5-2450 and
>>       > > 16 GB of RAM.
>>       > >
>>       > > Interestingly, we had a similar-looking issue on arm64 recently,
>>       > > but that was fixed with an APM-specific fix to the hypervisor, so I
>>       > > am guessing this is unrelated, see:
>>       > > http://lists.xenproject.org/archives/html/xen-devel/2015-03/msg02731.html
>>       > > and the fix:
>>       > > http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=50dcb3de603927db2fd87ba09e29c817415aaa44
>>       > >
>>       > > I have tried with several Linux versions, v3.13, v3.18, v4.0-rc4,
>>       > > and v4.1, same issue.  I have tried with Xen 4.5-0 release, and the
>>       > > Ubuntu packaged Xen 4.4 release, same issue.
>>       > >
>>       > > Examples of crash:
>>       > > http://pastebin.ubuntu.com/11953498/
>>       > > http://pastebin.ubuntu.com/11953443/
>>       >
>>       > 4.0-rc4?
>>       >
>>       > Have you tried 4.1?
>>
>> According to the previous paragraph, yes he has.
>>
>> yes, I have.  Just to clarify, I used 4.0-rc4 because that's a branch which
>> contained arm64 PCI support and has been used for other measurements, so
>> this was simply my 'working tree'.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

