
Re: Linux DomU freezes and dies under heavy memory shuffling



+ x86 maintainers

It looks like the tlbflush is getting stuck?


On Sat, 6 Feb 2021, Roman Shaposhnik wrote:
> Hi!
> 
> All of a sudden (but only after a few days of running normally), on a stock
> Ubuntu 18.04 (Bionic with 4.15.0 kernel) DomU, I'm seeing Microsoft's .NET
> runtime go into a heavy GC cycle and then freeze and die as shown
> below. This is under stock Xen 4.14.0 on a pretty unremarkable
> x86_64 box made by Supermicro.
> 
> I would really appreciate any thoughts on the subject, or at least directions
> in which I should go to investigate this. At this point, this part of Xen is
> a bit of a mystery to me, but I'm very much willing to learn ;-)
> 
> From my completely uneducated guess it feels like some kind of an issue
> between DomU shuffling memory much more than normal and Xen somehow
> getting unhappy about that:
> 
> [376900.874560] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [dotnet:3518]
> [376900.874764] Kernel panic - not syncing: softlockup: hung tasks
> [376900.874793] CPU: 0 PID: 3518 Comm: dotnet Tainted: G L 4.15.0-112-generic #113-Ubuntu
> [376900.874824] Hardware name: Xen HVM domU, BIOS 4.14.0 12/15/2020
> [376900.874847] Call Trace:
> [376900.874860] <IRQ>
> [376900.874874] dump_stack+0x6d/0x8e
> [376900.874892] panic+0xe4/0x254
> [376900.874911] watchdog_timer_fn+0x21e/0x230
> [376900.874928] ? watchdog+0x30/0x30
> [376900.874947] __hrtimer_run_queues+0xdf/0x230
> [376900.874970] hrtimer_interrupt+0xa0/0x1d0
> [376900.874989] xen_timer_interrupt+0x20/0x30
> [376900.875008] __handle_irq_event_percpu+0x44/0x1a0
> [376900.875031] handle_irq_event_percpu+0x32/0x80
> [376900.875053] handle_percpu_irq+0x3d/0x60
> [376900.875071] generic_handle_irq+0x28/0x40
> [376900.875090] __evtchn_fifo_handle_events+0x172/0x190
> [376900.875112] evtchn_fifo_handle_events+0x10/0x20
> [376900.875133] __xen_evtchn_do_upcall+0x49/0x80
> [376900.875156] xen_evtchn_do_upcall+0x2b/0x50
> [376900.875177] xen_hvm_callback_vector+0x90/0xa0
> [376900.875197] </IRQ>
> [376900.875211] RIP: 0010:smp_call_function_single+0xdc/0x100
> [376900.875230] RSP: 0018:ffffaaa3c1807c20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff0c
> [376900.875261] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [376900.875288] RDX: 0000000000000001 RSI: 0000000000000003 RDI: 0000000000000003
> [376900.875314] RBP: ffffaaa3c1807c70 R08: fffffffffffffffc R09: 0000000000000002
> [376900.875341] R10: 0000000000000040 R11: 0000000000000000 R12: ffff8e0ab2c1de70
> [376900.875368] R13: 0000000000000000 R14: ffffffff95a7ecd0 R15: ffffaaa3c1807d08
> [376900.875396] ? flush_tlb_func_common.constprop.10+0x230/0x230
> [376900.875424] ? flush_tlb_func_common.constprop.10+0x230/0x230
> [376900.875449] ? unmap_page_range+0xbbc/0xd00
> [376900.875470] smp_call_function_many+0x1cc/0x250
> [376900.875491] ? smp_call_function_many+0x1cc/0x250
> [376900.875513] native_flush_tlb_others+0x3c/0xf0
> [376900.875534] flush_tlb_mm_range+0xae/0x110
> [376900.875552] tlb_flush_mmu_tlbonly+0x5f/0xc0
> [376900.875574] arch_tlb_finish_mmu+0x3f/0x80
> [376900.875592] tlb_finish_mmu+0x23/0x30
> [376900.875610] unmap_region+0xf7/0x130
> [376900.875629] do_munmap+0x276/0x450
> [376900.875647] vm_munmap+0x69/0xb0
> [376900.875664] SyS_munmap+0x22/0x30
> [376900.875682] do_syscall_64+0x73/0x130
> [376900.875701] entry_SYSCALL_64_after_hwframe+0x41/0xa6
> [376900.875721] RIP: 0033:0x7f05ad52dd59
> [376900.875737] RSP: 002b:00007f05a8037150 EFLAGS: 00000246 ORIG_RAX: 000000000000000b
> [376900.875765] RAX: ffffffffffffffda RBX: 000056517e2a08c0 RCX: 00007f05ad52dd59
> [376900.875791] RDX: 0000000000000000 RSI: 0000000000006a00 RDI: 00007f05aad8f000
> [376900.875818] RBP: 0000000000006a00 R08: 0000000000020b18 R09: 0000000000000000
> [376900.875844] R10: 0000000000020ad0 R11: 0000000000000246 R12: 0000000000000001
> [376900.875870] R13: 0000000000000000 R14: 000056517eb02300 R15: 00007f05aad8f000
> 
> Thanks,
> Roman.
> 



 

