
Re: [Xen-devel] [PATCH 00/60] xen: add core scheduling support



On 22.07.19 16:22, Sergey Dyasli wrote:
On 19/07/2019 14:57, Juergen Gross wrote:

I now have a git branch with the two problems corrected, rebased to
current staging:

github.com/jgross1/xen.git sched-v1b

Many thanks for the branch! As for the crashes, the vcpu_sleep_sync() one
seems to be fixed now, but I can still reproduce the shutdown one.
Interestingly, it now happens only if the host has running VMs (which
are automatically powered off via PV tools):

(XEN) [  332.981355] Preparing system for ACPI S5 state.
(XEN) [  332.981419] Disabling non-boot CPUs ...
(XEN) [  337.703896] Watchdog timer detects that CPU1 is stuck!
(XEN) [  337.709532] ----[ Xen-4.13.0-8.0.6-d  x86_64  debug=y   Not tainted ]----
(XEN) [  337.716808] CPU:    1
(XEN) [  337.719582] RIP:    e008:[<ffff82d08024041c>] sched_context_switched+0xaf/0x101
(XEN) [  337.727384] RFLAGS: 0000000000000202   CONTEXT: hypervisor
(XEN) [  337.733364] rax: 0000000000000002   rbx: ffff83081cc615b0   rcx: 0000000000000001
(XEN) [  337.741338] rdx: ffff83081cc61634   rsi: ffff83081cc72000   rdi: ffff83081cc72000
(XEN) [  337.749312] rbp: ffff83081cc8fdc0   rsp: ffff83081cc8fda0   r8:  0000000000000000
(XEN) [  337.757284] r9:  0000000000000000   r10: 0000004d88fc535e   r11: 0000004df8675ce7
(XEN) [  337.765256] r12: ffff83081cc72000   r13: ffff83081cc72000   r14: ffff83081ccb0e80
(XEN) [  337.773232] r15: ffff83081cc615b0   cr0: 000000008005003b   cr4: 00000000001526e0
(XEN) [  337.781206] cr3: 00000000dd2a1000   cr2: ffff88809ed1fb80
(XEN) [  337.787100] fsb: 0000000000000000   gsb: ffff8880a38c0000   gss: 0000000000000000
(XEN) [  337.795072] ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) [  337.802525] Xen code around <ffff82d08024041c> (sched_context_switched+0xaf/0x101):
(XEN) [  337.810672]  00 00 eb 18 f3 90 8b 02 <85> c0 75 f8 eb 0e 49 8b 7e 30 48 85 ff 74 05 e8
(XEN) [  337.819080] Xen stack trace from rsp=ffff83081cc8fda0:
(XEN) [  337.824713]    ffff83081cc72000 ffff83081cc72000 0000000000000000 ffff83081cc615b0
(XEN) [  337.832772]    ffff83081cc8fe00 ffff82d0802404e0 0000000000000082 ffff83081ccb0e98
(XEN) [  337.840832]    0000000000000001 ffff83081ccb0e98 0000000000000001 ffff82d080602628
(XEN) [  337.848895]    ffff83081cc8fe60 ffff82d080240aca 0000004d873bd669 0000000000000001
(XEN) [  337.856952]    ffff83081cc72000 0000004d873bdc1c ffff8308000000ff ffff82d0805bba00
(XEN) [  337.865012]    ffff82d0805bb980 ffffffffffffffff ffff83081cc8ffff 0000000000000001
(XEN) [  337.873072]    ffff83081cc8fe90 ffff82d080242315 0000000000000080 ffff82d0805bb980
(XEN) [  337.881132]    0000000000000001 ffff82d0806026f0 ffff83081cc8fea0 ffff82d08024236a
(XEN) [  337.889196]    ffff83081cc8fef0 ffff82d08027a151 ffff82d080242315 000000010665f000
(XEN) [  337.897256]    ffff83081cc72000 ffff83081cc72000 ffff83080665f000 ffff83081cc63000
(XEN) [  337.905313]    0000000000000001 ffff830806684000 ffff83081cc8fd78 ffff88809ee08000
(XEN) [  337.913373]    ffff88809ee08000 0000000000000000 0000000000000000 0000000000000003
(XEN) [  337.921434]    ffff88809ee08000 0000000000000246 aaaaaaaaaaaaaaaa 0000000000000000
(XEN) [  337.929497]    0000000096968abe 0000000000000000 ffffffff810013aa ffffffff8203c190
(XEN) [  337.937554]    deadbeefdeadf00d deadbeefdeadf00d 0000010000000000 ffffffff810013aa
(XEN) [  337.945615]    000000000000e033 0000000000000246 ffffc900400afeb0 000000000000e02b
(XEN) [  337.953674]    000000000000beef 000000000000beef 000000000000beef 000000000000beef
(XEN) [  337.961736]    0000e01000000001 ffff83081cc72000 000000379c66db80 00000000001526e0
(XEN) [  337.969797]    0000000000000000 0000000000000000 0000060000000000 0000000000000000
(XEN) [  337.977856] Xen call trace:
(XEN) [  337.981152]    [<ffff82d08024041c>] sched_context_switched+0xaf/0x101
(XEN) [  337.988083]    [<ffff82d0802404e0>] schedule.c#sched_context_switch+0x72/0x151
(XEN) [  337.995796]    [<ffff82d080240aca>] schedule.c#sched_slave+0x2a3/0x2b2
(XEN) [  338.002817]    [<ffff82d080242315>] softirq.c#__do_softirq+0x85/0x90
(XEN) [  338.009664]    [<ffff82d08024236a>] do_softirq+0x13/0x15
(XEN) [  338.015471]    [<ffff82d08027a151>] domain.c#idle_loop+0xb2/0xc9
(XEN) [  338.021970]
(XEN) [  338.023965] CPU7 @ e008:ffff82d080242f94 (stop_machine.c#stopmachine_action+0x30/0xa0)
(XEN) [  338.032372] CPU5 @ e008:ffff82d080242f94 (stop_machine.c#stopmachine_action+0x30/0xa0)
(XEN) [  338.040776] CPU4 @ e008:ffff82d080242f94 (stop_machine.c#stopmachine_action+0x30/0xa0)
(XEN) [  338.049182] CPU2 @ e008:ffff82d080242f9a (stop_machine.c#stopmachine_action+0x36/0xa0)
(XEN) [  338.057591] CPU6 @ e008:ffff82d080242f9a (stop_machine.c#stopmachine_action+0x36/0xa0)
(XEN) [  338.065999] CPU3 @ e008:ffff82d080242f9a (stop_machine.c#stopmachine_action+0x36/0xa0)
(XEN) [  338.074406] CPU0 @ e008:ffff82d0802532d1 (ns16550.c#ns_read_reg+0x21/0x42)
(XEN) [  338.081773]
(XEN) [  338.083764] ****************************************
(XEN) [  338.089226] Panic on CPU 1:
(XEN) [  338.092521] FATAL TRAP: vector = 2 (nmi)
(XEN) [  338.096940] [error_code=0000]
(XEN) [  338.100491] ****************************************
(XEN) [  338.105951]
(XEN) [  338.107946] Reboot in five seconds...
(XEN) [  338.112105] Executing kexec image on cpu1
(XEN) [  338.117383] Shot down all CPUs

And since Igor managed to fix kdump, I can now post backtraces from
all CPUs as well: https://paste.debian.net/1092609/

Thanks for the test (and report).

The fix is a one-liner. :-)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index f0bc5b3161..da9efb147f 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -2207,6 +2207,7 @@ static struct sched_unit *sched_wait_rendezvous_in(struct sched_unit *prev,
         if ( unlikely(!scheduler_active) )
         {
             ASSERT(is_idle_unit(prev));
+            atomic_set(&prev->next_task->rendezvous_out_cnt, 0);
             prev->rendezvous_in_cnt = 0;
         }
     }
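
One way to read the fix, sketched in C under stated assumptions: the
instruction bytes just before the reported RIP (f3 90 8b 02 85 c0 75 f8)
decode to a pause / load / test / jne loop, so CPU1 appears to be spinning
in sched_context_switched() until a counter reaches zero. The added
atomic_set() forces rendezvous_out_cnt to zero once the scheduler is no
longer active, which would let such a waiter leave the loop. The names
below (demo_unit, demo_wait_rendezvous_out, demo_abort_rendezvous) are
hypothetical, and the sketch uses C11 atomics rather than Xen's own
atomic_t helpers. The spin is bounded so that the example terminates
instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-unit state; this is not Xen's struct sched_unit. */
struct demo_unit {
    atomic_int rendezvous_out_cnt;  /* CPUs that still have to leave the rendezvous */
};

/*
 * Bounded variant of a "spin until the counter is zero" loop.
 * Returns true if the counter reached zero within max_spins polls,
 * false if the caller would have kept spinning (the watchdog case).
 */
static bool demo_wait_rendezvous_out(struct demo_unit *u, unsigned long max_spins)
{
    assert(u != NULL);
    for (unsigned long i = 0; i < max_spins; i++) {
        if (atomic_load(&u->rendezvous_out_cnt) == 0)
            return true;
    }
    return false;
}

/* Mirrors the idea of the one-line fix: drop the counter to zero so waiters are released. */
static void demo_abort_rendezvous(struct demo_unit *u)
{
    assert(u != NULL);
    atomic_store(&u->rendezvous_out_cnt, 0);
}
```

With the counter left at a non-zero value and nobody decrementing it,
demo_wait_rendezvous_out() gives up and returns false; after
demo_abort_rendezvous() the same call returns true on its first poll.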


Juergen
