xen-devel

Re: [Xen-devel] Kernel Panic in xen-blkfront.c:blkif_queue_request under 2.6.28

To: Jens Axboe <jens.axboe@xxxxxxxxxx>
Subject: Re: [Xen-devel] Kernel Panic in xen-blkfront.c:blkif_queue_request under 2.6.28
From: Greg Harris <greg.harris@xxxxxxxxxxxxx>
Date: Mon, 2 Feb 2009 09:11:32 -0500 (EST)
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Mon, 02 Feb 2009 06:12:10 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <24523950.8056631233583862811.JavaMail.root@ouachita>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
----- "Jens Axboe" <jens.axboe@xxxxxxxxxx> wrote:
> Hmm, xen-blkfront.c does:
> 
> BUG_ON(ring_req->nr_segments == BLKIF_MAX_SEGMENTS_PER_REQUEST);
> 
> with a limit setting of
> 
> blk_queue_max_phys_segments(rq, BLKIF_MAX_SEGMENTS_PER_REQUEST);
> blk_queue_max_hw_segments(rq, BLKIF_MAX_SEGMENTS_PER_REQUEST);
> 
> So the BUG_ON(), as it stands, can indeed very well trigger, since you
> asked for that limit.
> 
> Either that should be
> 
> BUG_ON(ring_req->nr_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST);

nr_segments is used as an index into an array of size
BLKIF_MAX_SEGMENTS_PER_REQUEST, so by the time it equals
BLKIF_MAX_SEGMENTS_PER_REQUEST (when the BUG_ON fires) the driver is
already poised to write outside the array as allocated.

From include/xen/interface/io/blkif.h:

struct blkif_request {
        ...
        struct blkif_request_segment {
                grant_ref_t gref;       /* reference to I/O buffer frame */
                /* @first_sect: first sector in frame to transfer (inclusive). */
                /* @last_sect: last sector in frame to transfer (inclusive).   */
                uint8_t     first_sect, last_sect;
        } seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
};
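
To spell the arithmetic out, here is a quick userspace sketch (not the
driver code; grant_ref_t approximated as uint32_t, and the maximum
hard-coded to 11, which is what I believe blkif.h defines):

#include <assert.h>
#include <stdint.h>

#define BLKIF_MAX_SEGMENTS_PER_REQUEST 11

struct blkif_request_segment {
        uint32_t gref;                  /* stand-in for grant_ref_t */
        uint8_t  first_sect, last_sect;
};

struct blkif_request {
        uint8_t nr_segments;
        struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
};

static void add_segment(struct blkif_request *req, uint32_t gref)
{
        /* Valid indices are 0 .. BLKIF_MAX_SEGMENTS_PER_REQUEST - 1, so
         * equality means the write below would overrun seg[]; this is
         * the same condition the driver's BUG_ON tests. */
        assert(req->nr_segments < BLKIF_MAX_SEGMENTS_PER_REQUEST);
        req->seg[req->nr_segments].gref = gref;
        req->nr_segments++;
}

int main(void)
{
        struct blkif_request req = { 0 };
        unsigned int i;

        for (i = 0; i <= BLKIF_MAX_SEGMENTS_PER_REQUEST; i++)
                add_segment(&req, i);   /* twelfth call trips the assert */
        return 0;
}

So the == comparison is the right guard for an index check; the real
question is why the index ever gets that high.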

> 
> or the limit should be BLKIF_MAX_SEGMENTS_PER_REQUEST - 1.

According to Documentation/block/biodoc.txt, the calls to
blk_queue_max_*_segments set the maximum number of segments the driver
can handle, which, by my reading of the data structure above, is
BLKIF_MAX_SEGMENTS_PER_REQUEST.  I will try compiling another kernel
with the max segments set to BLKIF_MAX_SEGMENTS_PER_REQUEST - 1 to see
if that has any effect.
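
Concretely, that experiment is just changing the two queue-limit calls
in the driver's queue setup (sketch; the calls as quoted above):

blk_queue_max_phys_segments(rq, BLKIF_MAX_SEGMENTS_PER_REQUEST - 1);
blk_queue_max_hw_segments(rq, BLKIF_MAX_SEGMENTS_PER_REQUEST - 1);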

It sounds to me like the kernel itself may not be obeying the requested segment 
limits here?
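
One way to test that would be instrumentation along these lines in
do_blkif_request(), before the ring request is built (a rough sketch
against 2.6.28, untested; the helper name is mine):

/* Count the bvecs the fill loop will walk and compare them against the
 * physical segment count that blk_queue_max_phys_segments() limits. */
static void blkif_debug_count_segments(struct request *req)
{
        struct req_iterator iter;
        struct bio_vec *bvec;
        unsigned int nbvecs = 0;

        rq_for_each_segment(bvec, req, iter)
                nbvecs++;

        if (nbvecs > BLKIF_MAX_SEGMENTS_PER_REQUEST)
                printk(KERN_WARNING
                       "blkfront: %u bvecs, %u phys segments, limit %d\n",
                       nbvecs, (unsigned int)req->nr_phys_segments,
                       BLKIF_MAX_SEGMENTS_PER_REQUEST);
}

If that fires while nr_phys_segments stays within the limit, the
mismatch would be between counting bvecs (what the fill loop walks) and
counting physical segments (what the queue limit constrains), rather
than the block layer violating its limit outright.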

Thanks,
-- Greg

> 
> > 
> > Thanks,
> >    J
> > 
> > >Attached are two panics:
> > >
> > >kernel BUG at drivers/block/xen-blkfront.c:243!
> > >invalid opcode: 0000 [#1] SMP
> > >last sysfs file: /sys/block/xvda/dev
> > >CPU 0
> > >Modules linked in:
> > >Pid: 0, comm: swapper Not tainted 2.6.28-metacarta-appliance-1 #2
> > >RIP: e030:[<ffffffff804077c0>]  [<ffffffff804077c0>]
> > >do_blkif_request+0x2f0/0x380
> > >RSP: e02b:ffffffff80865dd8  EFLAGS: 00010046
> > >RAX: 0000000000000000 RBX: ffff880366ee33c0 RCX: ffff880366ee33c0
> > >RDX: ffff880366f15d90 RSI: 000000000000000a RDI: 0000000000000303
> > >RBP: ffff88039d78b190 R08: 0000000000001818 R09: ffff88038fb7a9e0
> > >R10: 0000000000000004 R11: 000000000000001a R12: 0000000000000303
> > >R13: 0000000000000001 R14: ffff880366f15da0 R15: 0000000000000000
> > >FS:  0000000000000000(0000) GS:ffffffff807a1980(0000) 
> > >knlGS:0000000000000000
> > >CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > >CR2: 00000000f7f54444 CR3: 00000003977e5000 CR4: 0000000000002620
> > >DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > >DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > >Process swapper (pid: 0, threadinfo ffffffff807a6000, task 
> > >ffffffff806f0360)
> > >Stack:
> > > 000000000000004c ffff88038fb7a9e0 ffff88039a5f4000 ffff880366f123b8
> > > 0000000680298eec 000000000000000f ffff88039a5f4000 0000000066edc808
> > > ffff880366ee33c0 ffff88038fb7aa00 ffffffff00000001 ffff88038fb7a9e0
> > >Call Trace:
> > > <IRQ> <0> [<ffffffff8036fa45>] ? blk_invoke_request_fn+0xa5/0x110
> > > [<ffffffff80407868>] ? kick_pending_request_queues+0x18/0x30
> > > [<ffffffff80407a17>] ? blkif_interrupt+0x197/0x1e0
> > > [<ffffffff8026cc59>] ? handle_IRQ_event+0x39/0x80
> > > [<ffffffff8026f016>] ? handle_level_irq+0x96/0x120
> > > [<ffffffff802140d5>] ? do_IRQ+0x85/0x110
> > > [<ffffffff803c8315>] ? xen_evtchn_do_upcall+0xe5/0x130
> > > [<ffffffff802461f7>] ? __do_softirq+0xe7/0x180
> > > [<ffffffff8059f3ee>] ? xen_do_hypervisor_callback+0x1e/0x30
> > > <EOI> <0> [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
> > > [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
> > > [<ffffffff8020de8c>] ? xen_safe_halt+0xc/0x20
> > > [<ffffffff8020c1fa>] ? xen_idle+0x2a/0x50
> > > [<ffffffff80210041>] ? cpu_idle+0x41/0x70
> > >Code: fa d0 00 00 00 48 8d bc 07 88 00 00 00 e8 b9 dd f7 ff 8b 7c 24 54 e8 90 fb fb ff ff 44 24 24 e9 3b fd ff ff 0f 0b eb fe 66 66 90 <0f> 0b eb fe 48 8b 7c 24 30 48 8b 54 24 30 b9 0b 00 00 00 48 c7
> > >RIP  [<ffffffff804077c0>] do_blkif_request+0x2f0/0x380
> > > RSP <ffffffff80865dd8>
> > >Kernel panic - not syncing: Fatal exception in interrupt
> > >
> > >kernel BUG at drivers/block/xen-blkfront.c:243!
> > >invalid opcode: 0000 [#1] SMP
> > >last sysfs file: /sys/block/xvda/dev
> > >CPU 0
> > >Modules linked in:
> > >Pid: 0, comm: swapper Not tainted 2.6.28-metacarta-appliance-1 #2
> > >RIP: e030:[<ffffffff804077c0>]  [<ffffffff804077c0>]
> > >do_blkif_request+0x2f0/0x380
> > >RSP: e02b:ffffffff80865dd8  EFLAGS: 00010046
> > >RAX: 0000000000000000 RBX: ffff880366f2a9c0 RCX: ffff880366f2a9c0
> > >RDX: ffff880366f233b0 RSI: 000000000000000a RDI: 0000000000000168
> > >RBP: ffff88039d895cf0 R08: 0000000000000b40 R09: ffff88038fb029e0
> > >R10: 000000000000000f R11: 000000000000001a R12: 0000000000000168
> > >R13: 0000000000000001 R14: ffff880366f233c0 R15: 0000000000000000
> > >FS:  0000000000000000(0000) GS:ffffffff807a1980(0000) 
> > >knlGS:0000000000000000
> > >CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > >CR2: 00000000f7f9c444 CR3: 000000039e7ea000 CR4: 0000000000002620
> > >DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > >DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > >Process swapper (pid: 0, threadinfo ffffffff807a6000, task 
> > >ffffffff806f0360)
> > >Stack:
> > > 000000000000004c ffff88038fb029e0 ffff88039a5d8000 ffff880366f11938
> > > 0000000980298eec 000000000000001d ffff88039a5d8000 000004008036f3da
> > > ffff880366f2a9c0 ffff88038fb02a00 ffffffff00000001 ffff88038fb029e0
> > >Call Trace:
> > > <IRQ> <0> [<ffffffff8036fa45>] ? blk_invoke_request_fn+0xa5/0x110
> > > [<ffffffff80407868>] ? kick_pending_request_queues+0x18/0x30
> > > [<ffffffff80407a17>] ? blkif_interrupt+0x197/0x1e0
> > > [<ffffffff8026cc59>] ? handle_IRQ_event+0x39/0x80
> > > [<ffffffff8026f016>] ? handle_level_irq+0x96/0x120
> > > [<ffffffff802140d5>] ? do_IRQ+0x85/0x110
> > > [<ffffffff803c8315>] ? xen_evtchn_do_upcall+0xe5/0x130
> > > [<ffffffff802461f7>] ? __do_softirq+0xe7/0x180
> > > [<ffffffff8059f3ee>] ? xen_do_hypervisor_callback+0x1e/0x30
> > > <EOI> <0> [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
> > > [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
> > > [<ffffffff8020de8c>] ? xen_safe_halt+0xc/0x20
> > > [<ffffffff8020c1fa>] ? xen_idle+0x2a/0x50
> > > [<ffffffff80210041>] ? cpu_idle+0x41/0x70
> > >Code: fa d0 00 00 00 48 8d bc 07 88 00 00 00 e8 b9 dd f7 ff 8b 7c 24 54 e8 90 fb fb ff ff 44 24 24 e9 3b fd ff ff 0f 0b eb fe 66 66 90 <0f> 0b eb fe 48 8b 7c 24 30 48 8b 54 24 30 b9 0b 00 00 00 48 c7
> > >RIP  [<ffffffff804077c0>] do_blkif_request+0x2f0/0x380
> > > RSP <ffffffff80865dd8>
> > >Kernel panic - not syncing: Fatal exception in interrupt
> > >
> > >We've encountered a similar panic using Xen 3.2.1 (debian-backports,
> > >2.6.18-6-xen-amd64 kernel) and Xen 3.2.0 (Ubuntu Hardy, 2.6.24-23-xen
> > >kernel) running in para-virtual mode.  The source around the line
> > >referenced in the panic is:
> > >
> > >rq_for_each_segment(bvec, req, iter) {
> > >        BUG_ON(ring_req->nr_segments == BLKIF_MAX_SEGMENTS_PER_REQUEST);
> > >        ...
> > >        handle the segment
> > >        ...
> > >        ring_req->nr_segments++;
> > >}
> > >
> > >I'm able to reliably reproduce this panic through a certain workload
> > >(usually through creating file-systems) if anyone would like me to do
> > >further debugging.
> > >
> > >Thanks,
> > >---
> > >
> > >Greg Harris
> > >System Administrator
> > >MetaCarta, Inc.
> > >
> > >(O) +1 (617) 301-5530
> > >(M) +1 (781) 258-4474
> > >
> > >
> > 
> 
> -- 
> Jens Axboe

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel