   
 
 
 
   
 

xen-devel

[Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.

Jeremy Fitzhardinge wrote:
> 
> Have you tried bisecting to see when this particular problem appeared? 
> It looks to me like something is accidentally re-enabling interrupts -
> perhaps a stack overrun is corrupting the "flags" argument between a
> spin_lock_irqsave()/restore pair. 
> 
> Is it only on 32-bit kernels?
> 
 ------------[ cut here ]------------
[604001.659925] WARNING: at block/blk-core.c:239 blk_start_queue+0x70/0x80()
[604001.659964] Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl
sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4
nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables xen_netfront
pcspkr [last unloaded: scsi_wait_scan]
[604001.660147] Pid: 336, comm: udevd Tainted: G        W   3.0.0+ #50
[604001.660181] Call Trace:
[604001.660209]  [<c045c512>] warn_slowpath_common+0x72/0xa0
[604001.660243]  [<c06643a0>] ? blk_start_queue+0x70/0x80
[604001.660275]  [<c06643a0>] ? blk_start_queue+0x70/0x80
[604001.660310]  [<c045c562>] warn_slowpath_null+0x22/0x30
[604001.660343]  [<c06643a0>] blk_start_queue+0x70/0x80
[604001.660379]  [<c075e231>] kick_pending_request_queues+0x21/0x30
[604001.660417]  [<c075e42f>] blkif_interrupt+0x19f/0x2b0
...
 ------------[ cut here ]------------

I've debugged the blk-core warning a bit and can say:
  - Yes, it is a 32-bit PAE kernel, and so far it happens only there.
  - It affects PV Xen guests; bare-metal and KVM configurations are not affected.
  - The upstream kernel is affected as well.
  - It reproduces on Xen 4.1.1 and 3.1.2 hosts.

The IF flag always gets set again at the same place, in drivers/md/dm.c:
static void clone_endio(struct bio *bio, int error)
...
dm_endio_fn endio = tio->ti->type->end_io;
...
when a page fault happens while accessing the tio->ti->type field.

After a successful resync with the kernel's page table in
do_page_fault->vmalloc_fault, I/O continues happily, but with the IF flag
set even though the faulting context's eflags register had IF clear.
It happens with a random task every time.
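
For readers unfamiliar with the lazy sync mentioned above, here is a toy
userspace sketch of the idea, not kernel code. It assumes a tiny flat table
where 0 means "not present"; struct toy_pgtable, TOY_ENTRIES and
toy_vmalloc_fault() are hypothetical names made up for illustration. It only
shows the "copy the missing entry from the reference table" step and says
nothing about how interrupt state is handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ENTRIES 4   /* hypothetical: a tiny top-level table */

/* Hypothetical stand-in for one address space's page-table level. */
struct toy_pgtable {
    uint32_t entry[TOY_ENTRIES];   /* 0 means "not present" */
};

/*
 * Toy model of the lazy sync: if the reference (kernel) table maps the
 * slot and the current table does not, copy the entry across.
 * Returns true if the fault was resolved this way, false if the address
 * is not mapped in the reference table either (a genuine fault).
 */
static bool toy_vmalloc_fault(struct toy_pgtable *cur,
                              const struct toy_pgtable *ref,
                              size_t slot)
{
    if (slot >= TOY_ENTRIES)
        return false;              /* outside the modelled range */
    if (ref->entry[slot] == 0)
        return false;              /* not mapped anywhere */
    if (cur->entry[slot] == 0)
        cur->entry[slot] = ref->entry[slot];   /* the "resync" step */
    return true;
}
```

The point of the model is that the handler only touches page-table state, so
the interrupted code would be expected to resume with its interrupt state
unchanged.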

Here is an ftrace call graph showing the problematic place:
========================================================
# tracer: function_graph
#
# function_graph latency trace v1.1.5 on 3.0.0+
# --------------------------------------------------------------------
# latency: 0 us, #42330/242738181, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:1)
#    -----------------
#    | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
#    -----------------
#
#      _-----=> irqs-off        
#     / _----=> need-resched    
#    | / _---=> hardirq/softirq 
#    || / _--=> preempt-depth   
#    ||| /                      
# CPU||||  DURATION                  FUNCTION CALLS
# |  ||||   |   |                     |   |   |   |
 0)  d...              |              xen_evtchn_do_upcall() {
 0)  d...              |                irq_enter() {
 0)  d.h.  2.880 us    |                }
 0)  d.h.              |                __xen_evtchn_do_upcall() {
 0)  d.h.  0.099 us    |                  irq_to_desc();
 0)  d.h.              |                  handle_edge_irq() {
 0)  d.h.  0.107 us    |                    _raw_spin_lock();
 0)  d.h.              |                    ack_dynirq() {
 0)  d.h.  3.153 us    |                    }
 0)  d.h.              |                    handle_irq_event() {
 0)  d.h.              |                      handle_irq_event_percpu() {
 0)  d.h.              |                        blkif_interrupt() {
 0)  d.h.  0.110 us    |                          _raw_spin_lock_irqsave();
 0)  d.h.              |                          __blk_end_request_all() {
 0)  d.h.              |                            blk_update_bidi_request() {
 0)  d.h.              |                              blk_update_request() {
 0)  d.h.              |                                req_bio_endio() {
 0)  d.h.              |                                  bio_endio() {
 0)  d.h.              |                                    endio() {
 0)  d.h.              |                                      bio_put() {
 0)  d.h.  4.149 us    |                                      }
 0)  d.h.              |                                      dec_count() {
 0)  d.h.              |                                        mempool_free() {
 0)  d.h.  1.395 us    |                                        }
 0)  d.h.              |                                        read_callback() {
 0)  d.h.              |                                          bio_endio() {
 0)  d.h.              |                                            clone_endio() {
 0)  d.h.              |                                              /* ==> enter clone_endio: tio: c1e14c70 */
 0)  d.h.  0.104 us    |                                              arch_irqs_disabled_flags();
 0)  d.h.              |                                              /* ==> clone_endio: endio = tio->ti->type->end_io: tio->ti c918c040 */
 0)  d.h.  0.100 us    |                                              arch_irqs_disabled_flags();
 0)  d.h.  0.117 us    |                                              mirror_end_io();
 0)  d.h.              |                                              free_tio() {
 0)  d.h.  2.269 us    |                                              }
 0)  d.h.              |                                              bio_put() {
 0)  d.h.  3.933 us    |                                              }
 0)  d.h.              |                                              dec_pending() {
 0)  d.h.  0.100 us    |                                                atomic_dec_and_test();
 0)  d.h.              |                                                end_io_acct() {
 0)  d.h.  5.655 us    |                                                }
 0)  d.h.              |                                                free_io() {
 0)  d.h.  1.992 us    |                                                }
 0)  d.h.  0.098 us    |                                                trace_block_bio_complete();
 0)  d.h.              |                                                bio_endio() {
 0)  d.h.              |                                                  clone_endio() {
 0)  d.h.              |                                                    /* ==> enter clone_endio: tio: c1e14ee0 */
 0)  d.h.  0.098 us    |                                                    arch_irqs_disabled_flags();
 0)  d.h.              |                                                    do_page_fault() {
 0)  d.h.  0.103 us    |                                                      xen_read_cr2();
 0)  d.h.              |                                                      /* dpf: tsk: c785a6a0  mm: 0 comm: kworker/0:0 */
 0)  d.h.              |                                                      /* before vmalloc_fault (c9552044) regs: c786db1c ip: c082bb20  eflags: 10002  err: 0 irq: off */
                           ^^^ - fault error code (the "err: 0" field above)
 0)  d.h.              |                                                      vmalloc_fault() {
 0)  d.h.  0.104 us    |                                                        xen_read_cr3();
 0)  d.h.              |                                                        xen_pgd_val();
 0)  d.h.              |                                                        xen_pgd_val();
 0)  d.h.              |                                                        xen_set_pmd();
 0)  d.h.              |                                                        xen_pmd_val();
 0)  d.h.+ 14.599 us   |                                                      }
 0)  d.h.+ 18.019 us   |                                                    }
      v -- irq enabled
 0)  ..h.              |                                                    /* ==> clone_endio: endio = tio->ti->type->end_io: tio->ti c9552040 */
 0)  ..h.  0.102 us    |                                                    arch_irqs_disabled_flags();
 0)  ..h.              |                                                    /* <7>clone_endio BUG DETECTED irq */
========================================

So the IF flag is set again right after exiting from do_page_fault().
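
As a cross-check of the "irq: off" reading in the trace, here is a minimal
sketch that decodes the logged eflags value. It assumes the value printed as
"eflags: 10002" is hexadecimal; IF is bit 9 (0x200) of EFLAGS on x86, and
eflags_irqs_enabled() is a hypothetical helper name, not something from the
trace patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interrupt-enable flag: bit 9 of the x86 EFLAGS register. */
#define X86_EFLAGS_IF (UINT32_C(1) << 9)   /* 0x200 */

/* Hypothetical helper: true if the saved eflags value has IF set. */
static bool eflags_irqs_enabled(uint32_t eflags)
{
    return (eflags & X86_EFLAGS_IF) != 0;
}
```

Under that assumption, eflags_irqs_enabled(0x10002) is false: 0x10002 has only
bit 16 (RF) and the always-set bit 1, so the faulting context had interrupts
disabled, which matches the "d" in the irqs-off column before the fault.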

Any thoughts on why this might happen?

PS:
Full logs, the additional trace patch, the kernel config and a way to
reproduce the bug can be found at
https://bugzilla.redhat.com/show_bug.cgi?id=707552


