Re: [Xen-devel] Crashing kernel with dom0/libxc gnttab/gntshr
On Tue, 30 Jul 2013, Daniel De Graaf wrote:
> On 07/30/2013 12:58 PM, David Vrabel wrote:
> [...]
> >
> > [ 902.729307] BUG: Bad page map in process vchan-node1 pte:12bfff167 pmd:b9b5c067
> > [ 902.729312] page:ffffea0004afffc0 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
> >
> > I think this is the test for page_mapcount(page) < 0 in zap_pte_range().
> > This has looked up the page using the PTE it is trying to clear. Has
> > it found the correct page? Since the MFN is currently mapped into the
> > same domain, has the m2p_override stuff confused the look up and it is
> > checking the grantee page not the granter?
> >
> > David
>
> I think something like this is happening, since while reproducing this
> on my test system, some linked list corruption was found that I believe
> to be the cause of this problem. The gnttab_map_refs function on PV uses
> m2p_add_override on the page, which threads page->lru to an
> m2p_overrides list. However, something else is using page->lru during
> the use of gntdev, as shown by the following debug patch:

I have never managed to prove that something else is trying to use
page->lru while the m2p_override is using it.

Jeremy, at the time the code was written, you were pretty confident
that page->lru couldn't be used by anybody else.
Why was that?

> diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
> index 3c8803f..198e57e 100644
> --- a/drivers/xen/gntdev.c
> +++ b/drivers/xen/gntdev.c
> @@ -294,6 +294,11 @@ static int map_grant_pages(struct grant_map *map)
>  	if (err)
>  		return err;
>  
> +	printk("map page0 lru: %p prev=%p:%p next=%p:%p\n",
> +		&map->pages[0]->lru,
> +		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
> +		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
> +
>  	for (i = 0; i < map->count; i++) {
>  		if (map->map_ops[i].status)
>  			err = -EINVAL;
> @@ -320,6 +325,10 @@ static int __unmap_grant_pages(struct grant_map *map, int offset, int pages)
>  		}
>  	}
>  
> +	printk("unmap page0 lru: %p prev=%p:%p next=%p:%p\n",
> +		&map->pages[0]->lru,
> +		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
> +		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
>  	err = gnttab_unmap_refs(map->unmap_ops + offset,
>  			use_ptemod ? map->kmap_ops + offset : NULL, map->pages + offset,
>  			pages);
>
> Output:
> [ 88.610644] map page0 lru: ffffea0001dee160 prev=ffffffff82f2d510:ffffea0001dee160 next=ffffffff82f2d510:ffffea0001dee160
> [ 88.611515] BUG: Bad page map in process a.out pte:8000000077b85167 pmd:2541a067
> [ 88.611525] page:ffffea0001dee140 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
> [ 88.611532] page flags: 0x1000000000000814(referenced|dirty|private)
> [ 88.611541] addr:00007f1adaef3000 vm_flags:140400fb anon_vma: (null) mapping:ffff8800692974a0 index:0
> [ 88.611547] vma->vm_ops->fault: (null)
> [ 88.611555] vma->vm_file->f_op->mmap: gntalloc_mmap+0x0/0x1d0
> [...backtrace cropped...]
> [ 88.614301] unmap page0 lru: ffffea0001dee160 prev=ffff8800254c9d08:ffff88001ea0b120 next=ffff8800254c9d08:ffff88001ea0b938
>
> The initial map is a linked list with only that element, so the address
> 0xffffffff82f2d510 is the m2p_overrides entry. This means the page being
> found by zap_pte_range is not a valid struct page.
>
> The struct page* being used by the gntalloc device was 0xffffea0000952740,
> for reference; it's not a direct collision between the page used by the
> gntdev and gntalloc devices.
>
> Not sure what the best fix is for this at the moment.
>
> --
> Daniel De Graaf
> National Security Agency
>
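
For reference, the four extra pointers printed by the debug patch are enough
to tell whether page0's lru node is still properly linked: on a consistent
doubly linked list, lru.prev->next and lru.next->prev both point back at &lru.
That holds for the "map" line (both are ffffea0001dee160) and fails for the
"unmap" line. Below is a minimal userspace sketch of that check. The
struct list_head here is only a stand-in for the kernel type, and
lru_linked_ok() is a hypothetical helper name, not an existing kernel
function.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
	struct list_head *prev, *next;
};

/* An empty list is a head that points at itself in both directions. */
static void list_init(struct list_head *head)
{
	head->prev = head;
	head->next = head;
}

/* Insert node right after head, in the manner of the kernel's list_add(). */
static void list_add(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/*
 * Hypothetical helper: true if both neighbours still point back at node.
 * These are the same two comparisons one does by eye on the
 * "prev=%p:%p next=%p:%p" debug output.
 */
static bool lru_linked_ok(const struct list_head *node)
{
	return node->prev->next == node && node->next->prev == node;
}
```

With a single entry on the list, prev and next are both the list head and the
check passes, which matches the "map" line. If something else rewrites
prev/next without unlinking the node first, the neighbours no longer point
back and the check fails, which is the pattern in the "unmap" line.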
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel