
Re: [Xen-devel] [PATCH v2 0/4] xen/rcu: let rcu work better with core scheduling



On 23/02/2020 14:14, Jürgen Groß wrote:
> On 22.02.20 17:42, Igor Druzhinin wrote:
>> (XEN) [  120.891143] *** Dumping CPU0 host state: ***
>> (XEN) [  120.895909] ----[ Xen-4.13.0  x86_64  debug=y   Not tainted ]----
>> (XEN) [  120.902487] CPU:    0
>> (XEN) [  120.905269] RIP:    e008:[<ffff82d0802aa750>] smp_send_call_function_mask+0x40/0x43
>> (XEN) [  120.913415] RFLAGS: 0000000000000286   CONTEXT: hypervisor
>> (XEN) [  120.919389] rax: 0000000000000000   rbx: ffff82d0805ddb78   rcx: 0000000000000001
>> (XEN) [  120.927362] rdx: ffff82d0805cdb00   rsi: ffff82d0805c7cd8   rdi: 0000000000000007
>> (XEN) [  120.935341] rbp: ffff8300920bfbc0   rsp: ffff8300920bfbb8   r8:  000000000000003b
>> (XEN) [  120.943310] r9:  0444444444444432   r10: 3333333333333333   r11: 0000000000000001
>> (XEN) [  120.951282] r12: ffff82d0805ddb78   r13: 0000000000000001   r14: ffff8300920bfc18
>> (XEN) [  120.959251] r15: ffff82d0802af646   cr0: 000000008005003b   cr4: 00000000003506e0
>> (XEN) [  120.967223] cr3: 00000000920b0000   cr2: ffff88820dffe7f8
>> (XEN) [  120.973125] fsb: 0000000000000000   gsb: ffff88821e3c0000   gss: 0000000000000000
>> (XEN) [  120.981094] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>> (XEN) [  120.988548] Xen code around <ffff82d0802aa750> (smp_send_call_function_mask+0x40/0x43):
>> (XEN) [  120.997037]  85 f9 ff fb 48 83 c4 08 <5b> 5d c3 9c 58 f6 c4 02 74 02 0f 0b 55 48 89 e5
>> (XEN) [  121.005442] Xen stack trace from rsp=ffff8300920bfbb8:
>> (XEN) [  121.011080]    ffff8300920bfc18 ffff8300920bfc00 ffff82d080242c84 ffff82d080389845
>> (XEN) [  121.019145]    ffff8300920bfc18 ffff82d0802af178 0000000000000000 0000001c1d27aff8
>> (XEN) [  121.027200]    0000000000000000 ffff8300920bfc80 ffff82d0802af1fa ffff82d080289adf
>> (XEN) [  121.035255]    fffffffffffffd55 0000000000000000 0000000000000000 0000000000000000
>> (XEN) [  121.043320]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) [  121.051375]    000000000000003b 0000001c25e54bf1 0000000000000000 ffff8300920bfc80
>> (XEN) [  121.059443]    ffff82d0805c7300 ffff8300920bfcb0 ffff82d080245f4d ffff82d0802af4a2
>> (XEN) [  121.067498]    ffff82d0805c7300 ffff83042bb24f60 ffff82d08060f400 ffff8300920bfd00
>> (XEN) [  121.075553]    ffff82d080246781 ffff82d0805cdb00 ffff8300920bfd80 ffff82d0805c7040
>> (XEN) [  121.083621]    ffff82d0805cdb00 ffff82d0805cdb00 fffffffffffffff9 ffff8300920bffff
>> (XEN) [  121.091674]    0000000000000000 ffff8300920bfd30 ffff82d0802425a5 ffff82d0805c7040
>> (XEN) [  121.099739]    ffff82d0805cdb00 fffffffffffffff9 ffff8300920bffff ffff8300920bfd40
>> (XEN) [  121.107797]    ffff82d0802425e5 ffff8300920bfd80 ffff82d08022bc0f 0000000000000000
>> (XEN) [  121.115852]    ffff82d08022b600 ffff82d0804b3888 ffff82d0805cdb00 ffff82d0805cdb00
>> (XEN) [  121.123917]    fffffffffffffff9 ffff8300920bfdb0 ffff82d0802425a5 0000000000000003
>> (XEN) [  121.131975]    0000000000000001 00000000ffffffef ffff8300920bffff ffff8300920bfdc0
>> (XEN) [  121.140037]    ffff82d0802425e5 ffff8300920bfdd0 ffff82d08022b91b ffff8300920bfdf0
>> (XEN) [  121.148093]    ffff82d0802addb1 ffff83042b3b0000 0000000000000003 ffff8300920bfe30
>> (XEN) [  121.156150]    ffff82d0802ae086 ffff8300920bfe10 ffff83042b7e81e0 ffff83042b3b0000
>> (XEN) [  121.164216]    0000000000000000 0000000000000000 0000000000000000 ffff8300920bfe50
>> (XEN) [  121.172271] Xen call trace:
>> (XEN) [  121.175573]    [<ffff82d0802aa750>] R smp_send_call_function_mask+0x40/0x43
>> (XEN) [  121.183024]    [<ffff82d080242c84>] F on_selected_cpus+0xa4/0xde
>> (XEN) [  121.189520]    [<ffff82d0802af1fa>] F arch/x86/time.c#time_calibration+0x82/0x89
>> (XEN) [  121.197403]    [<ffff82d080245f4d>] F common/timer.c#execute_timer+0x49/0x64
>> (XEN) [  121.204951]    [<ffff82d080246781>] F common/timer.c#timer_softirq_action+0x116/0x24e
>> (XEN) [  121.213271]    [<ffff82d0802425a5>] F common/softirq.c#__do_softirq+0x85/0x90
>> (XEN) [  121.220890]    [<ffff82d0802425e5>] F process_pending_softirqs+0x35/0x37
>> (XEN) [  121.228086]    [<ffff82d08022bc0f>] F common/rcupdate.c#rcu_process_callbacks+0x1ef/0x20d
>> (XEN) [  121.236758]    [<ffff82d0802425a5>] F common/softirq.c#__do_softirq+0x85/0x90
>> (XEN) [  121.244378]    [<ffff82d0802425e5>] F process_pending_softirqs+0x35/0x37
>> (XEN) [  121.251568]    [<ffff82d08022b91b>] F rcu_barrier+0x58/0x6e
>> (XEN) [  121.257639]    [<ffff82d0802addb1>] F cpu_down_helper+0x11/0x32
>> (XEN) [  121.264051]    [<ffff82d0802ae086>] F arch/x86/sysctl.c#smt_up_down_helper+0x1d6/0x1fe
>> (XEN) [  121.272454]    [<ffff82d08020878d>] F common/domain.c#continue_hypercall_tasklet_handler+0x54/0xb8
>> (XEN) [  121.281900]    [<ffff82d0802454e6>] F common/tasklet.c#do_tasklet_work+0x81/0xb4
>> (XEN) [  121.289786]    [<ffff82d080245803>] F do_tasklet+0x58/0x85
>> (XEN) [  121.295771]    [<ffff82d08027a0b4>] F arch/x86/domain.c#idle_loop+0x87/0xcb
>>
>> So it's not in the get_cpu_maps() loop. It seems to me it's not entering
>> time sync for some reason.
> 
> Interesting. Looking further into that.
> 
> At least time_calibration() is missing a call to get_cpu_maps().

I debugged this issue and the following fixes it:

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index ccf2ec6..36d98a4 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -153,6 +153,7 @@ static int rsinterval = 1000;
  * multiple times.
  */
 static atomic_t cpu_count = ATOMIC_INIT(0);
+static atomic_t done_count = ATOMIC_INIT(0);
 
 static void rcu_barrier_callback(struct rcu_head *head)
 {
@@ -175,6 +176,8 @@ static void rcu_barrier_action(void)
         process_pending_softirqs();
         cpu_relax();
     }
+
+    atomic_dec(&done_count);
 }
 
 void rcu_barrier(void)
@@ -194,10 +197,11 @@ void rcu_barrier(void)
     if ( !initial )
     {
         atomic_set(&cpu_count, num_online_cpus());
+        atomic_set(&done_count, num_online_cpus());
         cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
     }
 
-    while ( atomic_read(&cpu_count) )
+    while ( atomic_read(&done_count) )
     {
         process_pending_softirqs();
         cpu_relax();
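
For readers following the diff, here is a minimal user-space sketch of the two-counter idea, using pthreads and C11 atomics in place of Xen CPUs and softirqs. It rests on assumptions not stated in this thread: one reading of the patch is that the initiator must not return (and possibly re-arm cpu_count) while other CPUs are still spinning inside rcu_barrier_action(), so it waits on a second counter that each CPU decrements only after leaving that loop. The names generation, completed, worker() and run_barrier_rounds() are hypothetical, and unlike Xen the initiator here is a separate thread that is not itself counted.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define MAX_CPUS 16

static atomic_int cpu_count;   /* callbacks still outstanding in this round */
static atomic_int done_count;  /* workers that have not yet left the action loop */
static atomic_int generation;  /* hypothetical stand-in for raising RCU_SOFTIRQ */
static atomic_int completed;   /* total worker rounds finished (for checking) */
static int total_rounds;

static void *worker(void *arg)
{
    (void)arg;
    for (int seen = 0; seen < total_rounds; seen++) {
        /* Wait until the initiator starts round seen + 1. */
        while (atomic_load(&generation) == seen)
            sched_yield();

        atomic_fetch_sub(&cpu_count, 1);      /* role of rcu_barrier_callback() */

        while (atomic_load(&cpu_count))       /* role of the rcu_barrier_action() loop */
            sched_yield();

        atomic_fetch_add(&completed, 1);
        atomic_fetch_sub(&done_count, 1);     /* the decrement the patch adds */
    }
    return NULL;
}

/* Runs `rounds` back-to-back barriers; returns the number of worker rounds completed. */
int run_barrier_rounds(int nthreads, int rounds)
{
    pthread_t tid[MAX_CPUS];

    assert(nthreads > 0 && nthreads <= MAX_CPUS);
    total_rounds = rounds;
    atomic_store(&generation, 0);
    atomic_store(&completed, 0);
    atomic_store(&cpu_count, 0);
    atomic_store(&done_count, 0);

    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker, NULL);

    for (int r = 1; r <= rounds; r++) {
        atomic_store(&cpu_count, nthreads);
        atomic_store(&done_count, nthreads);
        atomic_store(&generation, r);         /* kick all workers */

        /* Wait on done_count, not cpu_count: every worker has left its loop. */
        while (atomic_load(&done_count))
            sched_yield();

        /* Safe to re-arm: nobody is still spinning on the old cpu_count. */
        assert(atomic_load(&completed) == nthreads * r);
    }

    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);

    return atomic_load(&completed);
}
```

Calling run_barrier_rounds(4, 50) from a small harness (built with -pthread) should return 200. If the initiator waited on cpu_count instead, it could re-arm cpu_count while a slow worker was still inside the previous round's spin loop, which is the kind of hang this sketch is meant to illustrate.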

Is there anything else that blocks v3 currently?

Igor
