[Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On Tue, Aug 16, 2011 at 04:26:54AM -0700, imammedo [via Xen] wrote:
>
> Jeremy Fitzhardinge wrote:
> >
> > Have you tried bisecting to see when this particular problem appeared?
> > It looks to me like something is accidentally re-enabling interrupts -
> > perhaps a stack overrun is corrupting the "flags" argument between a
> > spin_lock_irqsave()/restore pair.
> >
> > Is it only on 32-bit kernels?
> >
Any specific reason you did not include xen-devel in this email? I am
CC-ing it here.
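
For the list's benefit: the failure mode Jeremy describes would look
roughly like the sketch below. The names are made up for illustration;
this is not the actual blkfront code. spin_lock_irqsave() stashes
EFLAGS (IF bit included) in an on-stack variable, so anything that
corrupts that variable makes spin_unlock_irqrestore() write the
corrupted value back into EFLAGS and can turn interrupts back on.

    struct example_dev {
            spinlock_t lock;
    };

    static irqreturn_t example_interrupt(int irq, void *dev_id)
    {
            struct example_dev *dev = dev_id;
            unsigned long flags;

            spin_lock_irqsave(&dev->lock, flags);

            /* If a stack overrun clobbers 'flags' here, for
             * instance by setting the IF bit (0x200 on x86)... */

            spin_unlock_irqrestore(&dev->lock, flags);

            /* ...then interrupts are enabled from here on, even
             * though the handler was entered with them off. */
            return IRQ_HANDLED;
    }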
> ------------[ cut here ]------------
> [604001.659925] WARNING: at block/blk-core.c:239 blk_start_queue+0x70/0x80()
> [604001.659964] Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl
> sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4
> nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables xen_netfront
> pcspkr [last unloaded: scsi_wait_scan]
> [604001.660147] Pid: 336, comm: udevd Tainted: G W 3.0.0+ #50
> [604001.660181] Call Trace:
> [604001.660209] [<c045c512>] warn_slowpath_common+0x72/0xa0
> [604001.660243] [<c06643a0>] ? blk_start_queue+0x70/0x80
> [604001.660275] [<c06643a0>] ? blk_start_queue+0x70/0x80
> [604001.660310] [<c045c562>] warn_slowpath_null+0x22/0x30
> [604001.660343] [<c06643a0>] blk_start_queue+0x70/0x80
> [604001.660379] [<c075e231>] kick_pending_request_queues+0x21/0x30
> [604001.660417] [<c075e42f>] blkif_interrupt+0x19f/0x2b0
> ...
> ------------[ cut here ]------------
>
> I've debugged the blk-core warning a bit and can say:
> - Yes, it is a 32-bit PAE kernel, and so far it happens only there.
> - It affects PV Xen guests; bare-metal and KVM configs are not affected.
> - The upstream kernel is affected as well.
> - It reproduces on Xen 4.1.1 and 3.1.2 hosts.
>
> The IF flag is always restored in drivers/md/dm.c:
>
>   static void clone_endio(struct bio *bio, int error)
>   ...
>           dm_endio_fn endio = tio->ti->type->end_io;
>   ...
>
> when a page fault happens while accessing the tio->ti->type field.
>
> After a successful resync with the kernel's page table in
> do_page_fault->vmalloc_fault, I/O continues happily on, but with the IF
> flag restored even if the faulting context's EFLAGS register had the IF
> flag cleared. It happens with a random task every time.
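>
> The arch_irqs_disabled_flags() markers and the "BUG DETECTED" message
> in the trace below come from my debug instrumentation (the full trace
> patch is in the bugzilla linked at the end). Roughly, it is a check
> like this around the dereference (a sketch, not the exact patch):
>
>     /* Sample the IF state before the load that can fault on the
>      * vmalloc'ed dm table, and warn if IRQs come back on after. */
>     static void clone_endio_check(struct dm_target_io *tio)
>     {
>             bool irqs_off = irqs_disabled();
>
>             /* This load can take the do_page_fault() ->
>              * vmalloc_fault() path seen in the trace: */
>             dm_endio_fn endio = tio->ti->type->end_io;
>
>             if (irqs_off && !irqs_disabled())
>                     trace_printk("clone_endio BUG DETECTED irq (%p)\n",
>                                  endio);
>     }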
>
> Here is an ftrace call graph showing the problematic place:
> ========================================================
> # tracer: function_graph
> #
> # function_graph latency trace v1.1.5 on 3.0.0+
> # --------------------------------------------------------------------
> # latency: 0 us, #42330/242738181, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:1)
> # -----------------
> #    | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
> # -----------------
> #
> #      _-----=> irqs-off
> #     / _----=> need-resched
> #    | / _---=> hardirq/softirq
> #    || / _--=> preempt-depth
> #    ||| /
> # CPU||||  DURATION      FUNCTION CALLS
> # |  ||||   |   |          |   |   |   |
>  0)  d...              |  xen_evtchn_do_upcall() {
>  0)  d...              |    irq_enter() {
>  0)  d.h.   2.880 us   |    }
>  0)  d.h.              |    __xen_evtchn_do_upcall() {
>  0)  d.h.   0.099 us   |      irq_to_desc();
>  0)  d.h.              |      handle_edge_irq() {
>  0)  d.h.   0.107 us   |        _raw_spin_lock();
>  0)  d.h.              |        ack_dynirq() {
>  0)  d.h.   3.153 us   |        }
>  0)  d.h.              |        handle_irq_event() {
>  0)  d.h.              |          handle_irq_event_percpu() {
>  0)  d.h.              |            blkif_interrupt() {
>  0)  d.h.   0.110 us   |              _raw_spin_lock_irqsave();
>  0)  d.h.              |              __blk_end_request_all() {
>  0)  d.h.              |                blk_update_bidi_request() {
>  0)  d.h.              |                  blk_update_request() {
>  0)  d.h.              |                    req_bio_endio() {
>  0)  d.h.              |                      bio_endio() {
>  0)  d.h.              |                        endio() {
>  0)  d.h.              |                          bio_put() {
>  0)  d.h.   4.149 us   |                          }
>  0)  d.h.              |                          dec_count() {
>  0)  d.h.              |                            mempool_free() {
>  0)  d.h.   1.395 us   |                            }
>  0)  d.h.              |                            read_callback() {
>  0)  d.h.              |                              bio_endio() {
>  0)  d.h.              |                                clone_endio() {
>  0)  d.h.              |                                  /* ==> enter clone_endio: tio: c1e14c70 */
>  0)  d.h.   0.104 us   |                                  arch_irqs_disabled_flags();
>  0)  d.h.              |                                  /* ==> clone_endio: endio = tio->ti->type->end_io: tio->ti c918c040 */
>  0)  d.h.   0.100 us   |                                  arch_irqs_disabled_flags();
>  0)  d.h.   0.117 us   |                                  mirror_end_io();
>  0)  d.h.              |                                  free_tio() {
>  0)  d.h.   2.269 us   |                                  }
>  0)  d.h.              |                                  bio_put() {
>  0)  d.h.   3.933 us   |                                  }
>  0)  d.h.              |                                  dec_pending() {
>  0)  d.h.   0.100 us   |                                    atomic_dec_and_test();
>  0)  d.h.              |                                    end_io_acct() {
>  0)  d.h.   5.655 us   |                                    }
>  0)  d.h.              |                                    free_io() {
>  0)  d.h.   1.992 us   |                                    }
>  0)  d.h.   0.098 us   |                                    trace_block_bio_complete();
>  0)  d.h.              |                                    bio_endio() {
>  0)  d.h.              |                                      clone_endio() {
>  0)  d.h.              |                                        /* ==> enter clone_endio: tio: c1e14ee0 */
>  0)  d.h.   0.098 us   |                                        arch_irqs_disabled_flags();
>  0)  d.h.              |                                        do_page_fault() {
>  0)  d.h.   0.103 us   |                                          xen_read_cr2();
>  0)  d.h.              |                                          /* dpf: tsk: c785a6a0 mm: 0 comm: kworker/0:0 */
>  0)  d.h.              |                                          /* before vmalloc_fault (c9552044) regs: c786db1c ip: c082bb20 eflags: 10002 err: 0 irq: off */
>                                                                                                                                           ^^^ - fault error code
>  0)  d.h.              |                                          vmalloc_fault() {
>  0)  d.h.   0.104 us   |                                            xen_read_cr3();
>  0)  d.h.              |                                            xen_pgd_val();
>  0)  d.h.              |                                            xen_pgd_val();
>  0)  d.h.              |                                            xen_set_pmd();
>  0)  d.h.              |                                            xen_pmd_val();
>  0)  d.h. + 14.599 us  |                                          }
>  0)  d.h. + 18.019 us  |                                        }
>      v -- irq enabled
>  0)  ..h.              |                                        /* ==> clone_endio: endio = tio->ti->type->end_io: tio->ti c9552040 */
>  0)  ..h.   0.102 us   |                                        arch_irqs_disabled_flags();
>  0)  ..h.              |                                        /* <7>clone_endio BUG DETECTED irq */
> ========================================
>
> So the IF flag is restored right after exiting from do_page_fault().
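>
> For reference, the 32-bit vmalloc_fault() the trace goes through looks
> roughly like this (simplified from arch/x86/mm/fault.c, not verbatim).
> Under a PV guest, read_cr3() and the set_pmd() inside
> vmalloc_sync_one() go through the pvops layer, which is why
> xen_read_cr3(), xen_pgd_val() and xen_set_pmd() show up in the graph:
>
>     static noinline int vmalloc_fault(unsigned long address)
>     {
>             unsigned long pgd_paddr;
>             pmd_t *pmd_k;
>             pte_t *pte_k;
>
>             if (!(address >= VMALLOC_START && address < VMALLOC_END))
>                     return -1;
>
>             /* Sync the missing vmalloc-area pmd from init_mm's page
>              * table into the current one: */
>             pgd_paddr = read_cr3();
>             pmd_k = vmalloc_sync_one(__va(pgd_paddr), address);
>             if (!pmd_k)
>                     return -1;
>
>             pte_k = pte_offset_kernel(pmd_k, address);
>             if (!pte_present(*pte_k))
>                     return -1;
>
>             return 0;
>     }
>
> Nothing in this path touches regs->flags directly, so whatever turns
> IF back on presumably happens underneath one of those pvops calls.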
>
> Any thoughts why it might happen?
>
> PS:
> Full logs, an additional trace patch, the kernel config, and a way to
> reproduce the bug can be found at
> https://bugzilla.redhat.com/show_bug.cgi?id=707552