
Re: [Xen-devel] Interrupt injection with ISR set on Intel hardware



> From: Roger Pau Monné [mailto:roger.pau@xxxxxxxxxx]
> Sent: Monday, October 15, 2018 6:30 PM
> (XEN)   [22642] POWER    TYPE 4
> (XEN)   [22643] IDLE     PPR 0x00000020
> (XEN)                    IRR 0000000000000000000000000000000000000000000000000000000000000000
> (XEN)                    ISR 0000000002000000000000000000000000000000000000000000000000000000
> (XEN)   [22644] WAKE     PPR 0x00000020
> (XEN)                    IRR 0000000002000000000000000000000000000000000000000000000000000000
> (XEN)                    ISR 0000000002000000000000000000000000000000000000000000000000000000

It looks like a pending IRR bit (vector 0x21) doesn't always trigger a
spurious interrupt? Is there a fixed pattern, i.e. a fixed number of
C-state enter/exit rounds with IRR 0x21 pending before the assertion
fires? (In this example it happens on the third round.)
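
For anyone following along, the vector numbers can be read straight off
the "All LAPIC state" dump quoted further down: each 32-bit word covers
32 vectors, so bit n of the word labelled [3f:20] is vector 0x20 + n.
A minimal sketch of that decoding, assuming only the dump layout shown
in this mail (decode_bank is a hypothetical helper, not a Xen function):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode one 32-bit bank of a LAPIC ISR/TMR/IRR bitmap into vector numbers.
 * 'base' is the lowest vector covered by the bank (0x00, 0x20, ..., 0xe0).
 * Set vectors are written to 'out' in ascending order; returns the count.
 * Hypothetical helper for illustration only.
 */
int decode_bank(unsigned int base, uint32_t bits, uint8_t out[32])
{
    int n = 0;

    /* Banks start on a 32-vector boundary and there are only 256 vectors. */
    assert(base % 32 == 0 && base <= 0xe0);

    for (unsigned int bit = 0; bit < 32; bit++)
        if (bits & (UINT32_C(1) << bit))
            out[n++] = (uint8_t)(base + bit);

    return n;
}
```

With the values from the dump, decode_bank(0x20, 0x00000002, v) yields
vector 0x21 (the one in service) and decode_bank(0xe0, 0x04000000, v)
yields vector 0xfa (the one pending in IRR at the time of the panic).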

> (XEN)   [22645] POWER    TYPE 3
> (XEN)   [22646] IDLE     PPR 0x00000020
> (XEN)                    IRR 0000000002000000000000000000000000000000000000000000000000000000
> (XEN)                    ISR 0000000002000000000000000000000000000000000000000000000000000000
> (XEN)   [22647] WAKE     PPR 0x00000020
> (XEN)                    IRR 0000000002000000000000000000000000000000000000000000000000000000
> (XEN)                    ISR 0000000002000000000000000000000000000000000000000000000000000000
> (XEN)   [22648] POWER    TYPE 3
> (XEN)   [22649] IDLE     PPR 0x00000020
> (XEN)                    IRR 0000000002000000000000000000000000000000000000000000000000000000
> (XEN)                    ISR 0000000002000000000000000000000000000000000000000000000000000000
> (XEN)   [22650] WAKE     PPR 0x00000020
> (XEN)                    IRR 0000000002000000000000000000000000000000000000000000000000000000
> (XEN)                    ISR 0000000002000000000000000000000000000000000000000000000000000000
> (XEN) All LAPIC state:
> (XEN)   [vector]      ISR      TMR      IRR
> (XEN)   [1f:00]  00000000 00000000 00000000
> (XEN)   [3f:20]  00000002 00000000 00000000
> (XEN)   [5f:40]  00000000 00000000 00000000
> (XEN)   [7f:60]  00000000 00000000 00000000
> (XEN)   [9f:80]  00000000 00000000 00000000
> (XEN)   [bf:a0]  00000000 00000000 00000000
> (XEN)   [df:c0]  00000000 00000000 00000000
> (XEN)   [ff:e0]  00000000 00000000 04000000
> (XEN) Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1340
> (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Tainted:  C   ]----
> (XEN) CPU:    1
> (XEN) RIP:    e008:[<ffff82d08028737d>] do_IRQ+0x8df/0xacb
> (XEN) RFLAGS: 0000000000010002   CONTEXT: hypervisor
> (XEN) rax: ffff83086c67202c   rbx: 0000000000000180   rcx: 0000000000000000
> (XEN) rdx: ffff83086c68ffff   rsi: 000000000000000a   rdi: ffff83086c601e24
> (XEN) rbp: ffff83086c68fd98   rsp: ffff83086c68fd38   r8:  ffff83086c690000
> (XEN) r9:  0000000000000030   r10: 0000000004000000   r11: 0000000000000007
> (XEN) r12: 000000000000011f   r13: 00000000ffffffff   r14: ffff83086c601e00
> (XEN) r15: ffff82cfffffb100   cr0: 0000000080050033   cr4: 00000000003526e0
> (XEN) cr3: 0000000855ba7000   cr2: 0000556bfa53c040
> (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> (XEN) Xen code around <ffff82d08028737d> (do_IRQ+0x8df/0xacb):
> (XEN)  8d 7e 24 e8 51 66 fb ff <0f> 0b 0f 0b 0f 0b 0f 0b b8 00 00 00 00 eb 4e 83
> (XEN) Xen stack trace from rsp=ffff83086c68fd38:
> (XEN)    ffff82d000000000 ffff83086c601e24 0000000000000000 ffff83086c6724e0
> (XEN)    ffff82d08037b841 ffff82d08037b835 ffff82d08037b841 0000000000000000
> (XEN)    0000000000000000 0000000000000000 ffff83086c68ffff 0000000000000000
> (XEN)    00007cf793970237 ffff82d08037b8aa 00000003040712e5 0000000000000008
> (XEN)    ffff83086c671448 ffff83086c671390 ffff83086c68fec0 00000003040b3015
> (XEN)    ffff83086c672d08 ffff83086c6724e0 ffff83086c672d28 0000000000000180
> (XEN)    ffff83086c67202c 0000000000000000 ffff83086c68ffff 0000000000002ccf
> (XEN)    ffff83086c6713c0 0000002100000000 ffff82d0802e2403 000000000000e008
> (XEN)    0000000000000202 ffff83086c68fe50 0000000000000000 ffff830088dd4000
> (XEN)    00000020ffffffff 0000000000000000 ffff83086c68fee8 ffff82d08059bd00
> (XEN)    0000000000000000 0000000000000000 000002d90000017f ffff82d0805a3c80
> (XEN)    0000000000000001 ffff82d08059bd00 0000000000000001 0000000000000001
> (XEN)    ffff830856085000 ffff83086c68fef0 ffff82d08027755d ffff83086c6a5000
> (XEN)    ffff830088dd4000 ffff830088bfa000 ffff83086c6a5000 ffff83086c68fdb8
> (XEN)    0000000000000000 0000000000000000 ffff880269a3bd00 ffff880269a3bd00
> (XEN)    0000000000000005 0000000000000005 0000000000000000 0000000000000120
> (XEN)    0000000000000000 000000002059d803 ffffffff816fe980 ffff88027335a7c0
> (XEN)    ffffffff82049af8 ffff88027335a7c0 00000000dade4600 0000beef0000beef
> (XEN)    ffffffff816fec52 000000bf0000beef 0000000000000246 ffffc90000d13e98
> (XEN)    000000000000beef ffff83086c68beef 000000000000beef 000000000000beef
> (XEN) Xen call trace:
> (XEN)    [<ffff82d08028737d>] do_IRQ+0x8df/0xacb
> (XEN)    [<ffff82d08037b8aa>] common_interrupt+0x10a/0x120
> (XEN)    [<ffff82d0802e2403>] mwait-idle.c#mwait_idle+0x2a5/0x381
> (XEN)    [<ffff82d08027755d>] domain.c#idle_loop+0xb3/0xb5
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 1:
> (XEN) Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1340
> (XEN) ****************************************
> (XEN)
> (XEN) Manual reset required ('noreboot' specified)
> 
> Finally, I'm also providing the surrounding context of the instruction
> pointers in the trace above:
> 
> (XEN)    [<ffff82d08028737d>] do_IRQ+0x8df/0xacb
> xen/arch/x86/irq.c:1340:
> 
>   1325            if ( action->ack_type == ACKTYPE_EOI )
>   1326            {
>   1327                sp = pending_eoi_sp(peoi);
>   1328                if ( !((sp == 0) || (peoi[sp-1].vector < vector)) )
>   1329                {
>   1330                    printk("*** Pending EOI error ***\n");
>   1331                    printk("  cpu #%u, irq %d, vector 0x%x, sp %d\n",
>   1332                           smp_processor_id(), irq, vector, sp);
>   1333
>   1334                    dump_peoi_stack(sp);
>   1335                    dump_peoi_records();
>   1336                    dump_lapic();
>   1337
>   1338                    spin_unlock(&desc->lock);
>   1339
> ->1340                    assert_failed("(sp == 0) || (peoi[sp-1].vector < vector)");
>   1341                }
>   1342
>   1343                ASSERT(sp < (NR_DYNAMIC_VECTORS-1));
>   1344                peoi[sp].irq = irq;
>   1345                peoi[sp].vector = vector;
>   1346                peoi[sp].ready = 0;
>   1347                pending_eoi_sp(peoi) = sp+1;
>   1348                cpumask_set_cpu(smp_processor_id(), action->cpu_eoi_map);
> 
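
To make the failing check easier to reason about: the pending-EOI stack
only accepts a vector strictly higher than the one currently on top, so
it always stays sorted in ascending order. Below is a simplified,
stand-alone model of that invariant, based only on the irq.c lines quoted
above. The struct layout, PEOI_DEPTH and peoi_push are hypothetical names
for illustration and are not Xen's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PEOI_DEPTH 16 /* hypothetical depth, chosen for this sketch only */

struct peoi_entry {
    int irq;
    uint8_t vector;
    bool ready;
};

struct peoi_stack {
    struct peoi_entry e[PEOI_DEPTH];
    unsigned int sp; /* index of the next free slot */
};

/*
 * Push (irq, vector) if the ordering invariant holds, i.e. the stack is
 * empty or the vector on top is strictly lower than the new one.
 * Returns false instead of asserting when the invariant would be violated.
 */
bool peoi_push(struct peoi_stack *s, int irq, uint8_t vector)
{
    unsigned int sp = s->sp;

    if (!(sp == 0 || s->e[sp - 1].vector < vector))
        return false;

    assert(sp < PEOI_DEPTH);
    s->e[sp].irq = irq;
    s->e[sp].vector = vector;
    s->e[sp].ready = false;
    s->sp = sp + 1;
    return true;
}
```

In this model, pushing vector 0x21 a second time while 0x21 is still on
top is rejected, because 0x21 < 0x21 is false. That is the kind of
situation the assertion guards against.
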
> (XEN)    [<ffff82d08037b8aa>] common_interrupt+0x10a/0x120
> xen/arch/x86/x86_64/entry.S:58
> 
>     47                /* Inject exception if pending. */
>     48                lea   VCPU_trap_bounce(%rbx), %rdx
>     49                testb $TBF_EXCEPTION, TRAPBOUNCE_flags(%rdx)
>     50                jnz   .Lprocess_trapbounce
>     51
>     52                cmpb  $0, VCPU_mce_pending(%rbx)
>     53                jne   process_mce
>     54        .Ltest_guest_nmi:
>     55                cmpb  $0, VCPU_nmi_pending(%rbx)
>     56                jne   process_nmi
>     57        test_guest_events:
> ->  58                movq  VCPU_vcpu_info(%rbx), %rax
>     59                movzwl VCPUINFO_upcall_pending(%rax), %eax
>     60                decl  %eax
>     61                cmpl  $0xfe, %eax
>     62                ja    restore_all_guest
>     63        /*process_guest_events:*/
>     64                sti
>     65                leaq  VCPU_trap_bounce(%rbx), %rdx
>     66                movq  VCPU_event_addr(%rbx), %rax
>     67                movq  %rax, TRAPBOUNCE_eip(%rdx)
>     68                movb  $TBF_INTERRUPT, TRAPBOUNCE_flags(%rdx)
>     69                call  create_bounce_frame
>     70                jmp   test_all_events
> 
> (XEN)    [<ffff82d0802e2403>] mwait-idle.c#mwait_idle+0x2a5/0x381
> xen/arch/x86/cpu/mwait-idle.c:802
> 
>    788                if (cpu_is_haltable(cpu))
>    789                        mwait_idle_with_hints(eax, MWAIT_ECX_INTERRUPT_BREAK);
>    790
>    791                after = cpuidle_get_tick();
>    792
>    793                cstate_restore_tsc();
>    794                trace_exit_reason(irq_traced);
>    795                TRACE_6D(TRC_PM_IDLE_EXIT, cx->type, after,
>    796                        irq_traced[0], irq_traced[1], irq_traced[2], irq_traced[3]);
>    797
>    798                /* Now back in C0. */
>    799                update_idle_stats(power, cx, before, after);
>    800                local_irq_enable();
>    801
> -> 802                if (!(lapic_timer_reliable_states & (1 << cstate)))
>    803                        lapic_timer_on();
>    804
>    805                sched_tick_resume();
>    806                cpufreq_dbs_timer_resume();
> 
> (XEN)    [<ffff82d08027755d>] domain.c#idle_loop+0xb3/0xb5
> xen/arch/x86/domain.c:144
> 
>    129            for ( ; ; )
>    130            {
>    131                if ( cpu_is_offline(cpu) )
>    132                    play_dead();
>    133
>    134                /* Are we here for running vcpu context tasklets, or for idling? */
>    135                if ( unlikely(tasklet_work_to_do(cpu)) )
>    136                    do_tasklet();
>    137                /*
>    138                 * Test softirqs twice --- first to see if should even try scrubbing
>    139                 * and then, after it is done, whether softirqs became pending
>    140                 * while we were scrubbing.
>    141                 */
>    142                else if ( !softirq_pending(cpu) && !scrub_free_pages()  &&
>    143                            !softirq_pending(cpu) )
> -> 144                    pm_idle();
>    145                do_softirq();
>    146                /*
>    147                 * We MUST be last (or before pm_idle). Otherwise after we get the
>    148                 * softirq we would execute pm_idle (and sleep) and not patch.
>    149                 */
>    150                check_for_livepatch_work();
>    151            }


 

