
Re: xen-blkfront: BUG_ON(info->nr_rings)



On Thu, Mar 11, 2021 at 5:37 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Thu, Mar 11, 2021 at 09:01:51AM +0000, Paul Durrant wrote:
> > On 10/03/2021 14:58, Jason Andryuk wrote:
> > > Hi,
> > >
> > > I was running a loop of `xl block-attach ; xl block-detach` and I
> > > triggered a BUG in xen-blkfront, drivers/block/xen-blkfront.c:1917.
> > > This is BUG_ON(info->nr_rings) in negotiate_mq called by blkback_changed.
> > >
> > > I'm using Linux 5.4.103 and blktap3 on Xen 4.12 (OpenXT), though I
> > > don't think that matters.  The backtrace and some preceding logs (from
> > > the reproducer) are below.
> > >
> > > I just repro-ed with this:
> > > path=<backend path/state>
> > > xenstore-write $path 5 ; xenstore-write $path 4
> > >
> > > info->nr_rings is still set because of the unexpected transition
> > > XenbusStateClosing -> XenbusStateConnected:
> > > dom7: [ 2866.574853] vbd vbd-51728: blkfront:blkback_changed to state 5.
> > > dom7: [ 2866.578385] vbd vbd-51728: blkfront:blkback_changed to state 4.
> > >
> > > I'm not totally sure how to handle this.  The XenbusStateConnected
> > > event should be creating a new blkfront device, but instead it's seen
> > > by the old one which hasn't been cleaned up yet.
>
> IIRC xenbus state changes (like you perform above) never trigger the
> creation or destruction of devices on the bus. See
> xenbus_otherend_changed.
>
> xl block-detach however should indeed remove the device. We should add
> an option to `xl block-detach -w` to wait for the device to actually
> be removed before returning (or exit with a timeout).

I didn't realize `xl block-detach` didn't wait.  There is some timeout
logic with detaching devices, but I have to investigate this more.

> > >
> >
> > Sounds like blkfront needs to be fixed. Once it is in state 5 the only state
> > it should go to should be 6. From there it can cycle back to 4.

Ok, thanks for the feedback.  So blocking 5->4 is straightforward.
6->4 triggered the same BUG, so I'm still investigating.
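
For illustration only, the rule Paul describes (once the backend is in
state 5, the only state it should move to is 6) could be written as a small
transition check. This is a standalone sketch, not code from
xen-blkfront: the helper names backend_transition_allowed() and
example_checks() are hypothetical, and the real driver would also need
to remember the previous backend state, which is assumed here. The enum
values are the standard XenbusState numbers. The sketch does not address
the 6->4 case, which still hits the same BUG.

```c
#include <assert.h>
#include <stdbool.h>

/* Numeric values match the XenbusState definitions in Xen's io/xenbus.h. */
enum xenbus_state {
    XenbusStateUnknown       = 0,
    XenbusStateInitialising  = 1,
    XenbusStateInitWait      = 2,
    XenbusStateInitialised   = 3,
    XenbusStateConnected     = 4,
    XenbusStateClosing       = 5,
    XenbusStateClosed        = 6,
    XenbusStateReconfiguring = 7,
    XenbusStateReconfigured  = 8,
};

/*
 * Hypothetical helper (not part of xen-blkfront): decide whether a
 * backend state change from prev to next should be acted upon.
 * Only the rule from this thread is encoded: after Closing (5), the
 * backend may only proceed to Closed (6). A repeated Closing event is
 * tolerated as a no-op. All other transitions are left unrestricted.
 */
static bool backend_transition_allowed(enum xenbus_state prev,
                                       enum xenbus_state next)
{
    if (prev == XenbusStateClosing)
        return next == XenbusStateClosing || next == XenbusStateClosed;

    return true;
}

/* Hypothetical usage: the transitions discussed in this thread. */
static void example_checks(void)
{
    /* The reproducer: 5 -> 4 would be rejected. */
    assert(!backend_transition_allowed(XenbusStateClosing,
                                       XenbusStateConnected));
    /* The expected path: 5 -> 6, then 6 -> 4. */
    assert(backend_transition_allowed(XenbusStateClosing,
                                      XenbusStateClosed));
    assert(backend_transition_allowed(XenbusStateClosed,
                                      XenbusStateConnected));
}
```

A caller along the lines of blkback_changed would consult such a check
before acting on XenbusStateConnected and ignore the event when it
returns false.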

> Indeed, there's likely some logic to be improved in blkfront so it
> doesn't get messed up so badly on state changes by blkback.
>
> I'm happy to review patches for both blkfront and libxl/xl in order to
> make this better :).

Okay.

Regards,
Jason



 

