Re: GPF on 0xdead000000000100 in nvme_map_data - Linux 5.9.9



On Sat, Dec 5, 2020 at 3:29 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Fri, Dec 04, 2020 at 01:20:54PM +0100, Marek Marczykowski-Górecki wrote:
> > On Fri, Dec 04, 2020 at 01:08:03PM +0100, Christoph Hellwig wrote:
> > > On Fri, Dec 04, 2020 at 12:08:47PM +0100, Marek Marczykowski-Górecki wrote:
> > > > culprit:
> > > >
> > > > commit 9e2369c06c8a181478039258a4598c1ddd2cadfa
> > > > Author: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> > > > Date:   Tue Sep 1 10:33:26 2020 +0200
> > > >
> > > >     xen: add helpers to allocate unpopulated memory
> > > >
> > > > I'm adding relevant people and xen-devel to the thread.
> > > > For completeness, here is the original crash message:
> > >
> > > That commit definitively adds a new ZONE_DEVICE user, so it does look
> > > related.  But you are not running on Xen, are you?
> >
> > I am. It is Xen dom0.
>
> I'm afraid I'm on leave and won't be able to look into this until the
> beginning of January. I would guess it's some kind of bad
> interaction between blkback and NVMe drivers both using ZONE_DEVICE?
>
> Maybe the best is to revert this change and I will look into it when
> I get back, unless someone is willing to debug this further.

Looking at commit 9e2369c06c8a and xen-blkback's put_free_pages(), both
use page->lru, which is part of the anonymous union that also holds
->pgmap.  That matches Marek's suspicion that the ZONE_DEVICE memory is
being used as if it were ZONE_NORMAL.

memmap_init_zone_device() says:
* ZONE_DEVICE pages union ->lru with a ->pgmap back pointer
* and zone_device_data.  It is a bug if a ZONE_DEVICE page is
* ever freed or placed on a driver-private list.
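
For what it's worth, the address in the subject, 0xdead000000000100, is
LIST_POISON1 on x86_64 (0x100 + POISON_POINTER_DELTA), which is what
list_del() leaves in ->next.  Below is a rough userspace sketch of the
aliasing, not kernel code: every fake_* name is made up for illustration,
the struct is a heavily simplified stand-in for struct page, and it
assumes a 64-bit build.  It only shows that putting such a page on a list
and taking it off again replaces ->pgmap with the poison value; it does
not establish which kernel path actually does this.

```c
#include <assert.h>
#include <stdint.h>

/* LIST_POISON1 as seen on x86_64: 0x100 + 0xdead000000000000. */
#define FAKE_LIST_POISON1 ((uintptr_t)0xdead000000000100ULL)

struct fake_list_head {
    struct fake_list_head *next, *prev;
};

struct fake_dev_pagemap {
    int dummy;
};

/* Hypothetical, simplified model of the union inside struct page. */
struct fake_page {
    unsigned long flags;
    union {
        struct fake_list_head lru;           /* list linkage */
        struct {                             /* ZONE_DEVICE view of the same words */
            struct fake_dev_pagemap *pgmap;
            void *zone_device_data;
        };
    };
};

static void fake_list_add(struct fake_list_head *entry,
                          struct fake_list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

static void fake_list_del(struct fake_list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    /* The kernel also poisons ->prev (LIST_POISON2); omitted here. */
    entry->next = (struct fake_list_head *)FAKE_LIST_POISON1;
}

/* Returns what ->pgmap holds after the page has been on a private list. */
uintptr_t fake_pgmap_after_list_roundtrip(void)
{
    static struct fake_dev_pagemap map;
    struct fake_list_head free_pages = { &free_pages, &free_pages };
    struct fake_page page;

    page.flags = 0;
    page.pgmap = &map;                       /* valid back pointer */
    page.zone_device_data = 0;

    /* ->lru.next and ->pgmap occupy the same storage. */
    assert((void *)&page.lru.next == (void *)&page.pgmap);

    fake_list_add(&page.lru, &free_pages);   /* overwrites ->pgmap */
    fake_list_del(&page.lru);                /* leaves the poison behind */

    return (uintptr_t)page.pgmap;
}
```

Anything that later follows ->pgmap on such a page would dereference the
poison value, which would be consistent with the GPF address reported.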

Regards,
Jason
