xen-devel

buggy linear page table handling Re: [Xen-devel] xm pause causing lockup

To: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: buggy linear page table handling Re: [Xen-devel] xm pause causing lockup
From: Kip Macy <kip.macy@xxxxxxxxx>
Date: Sat, 16 Apr 2005 12:59:01 -0700
Delivery-date: Sat, 16 Apr 2005 19:58:56 +0000
Domainkey-signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=fbgRs0klAecbFgAMw52Mn3OLpjd6vPiNmS6aJdfFY41uy9WsUGaqPqZqxwsLUPWKZHiqHCK0ho2oG0/wJhxmc6MQgmcwtYT2gOh7c17iJtI1DlSj22jFLgk9GLFsloPKIOiKjFiNZxSWEoFg3/XgqFZHSUFJhNBxobRU/0QYsmg=
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <b1fa291705041514046f3b20e9@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <A95E2296287EAD4EB592B5DEEFCE0E9D1E3BC6@xxxxxxxxxxxxxxxxxxxxxxxxxxx> <b1fa291705041514046f3b20e9@xxxxxxxxxxxxxx>
Reply-to: Kip Macy <kip.macy@xxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
I went through a few quick iterations to test page table reference
counting. In short, if I L2 pin a zeroed page that I've re-mapped
read-only, the pin succeeds. If the page has a self-referential mapping
before it is re-mapped read-only, the pin never returns. It is probably
safe to conclude that the type count is not correctly updated when the
page is re-mapped if there is a self-referential entry. This used to
work, so it is also safe to say that this is a regression introduced
some time between 3/22 and 4/11. Test code from pmap_pinit is below.

                          -Kip 


        /* ***** TEMP \/ ********** */
        ma = xpmap_ptom(VM_PAGE_TO_PHYS(ptdpg[0]));
#if 0
        /* works */
        pmap_qremove((vm_offset_t)pmap->pm_pdir, NPGPTD);
#elif 0
        /* works */
        PT_SET_MA(pmap->pm_pdir, 0);
#elif 0
        /* works */
        PT_SET_MA(pmap->pm_pdir, ma | PG_V | PG_A);
#else           
        /* causes lockup on pin */
        pmap->pm_pdir[PTDPTDI + i] = ma | PG_V | PG_A | PG_M;
        PT_SET_MA(pmap->pm_pdir, ma | PG_V | PG_A);
#endif
        
        printk("pinning %p - pass 0\n", ma);
        xen_pgd_pin(xpmap_ptom(VM_PAGE_TO_PHYS(ptdpg[0])));
        printk("pinned %p - pass 0\n", ma);
        /* ***** TEMP ^ ********** */
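
For readers unfamiliar with the self-referential (linear) page table entry that the failing case installs, here is a minimal sketch of the address arithmetic it enables on non-PAE i386 (one page directory page, i.e. NPGPTD == 1). The helper names are hypothetical and are not taken from the FreeBSD or Xen sources, and the slot value 767 used in the note below is only an example; the real PTDPTDI in this build may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural constants for non-PAE i386 paging. */
#define SK_PAGE_SHIFT 12u    /* 4 KiB pages */
#define SK_PDR_SHIFT  22u    /* each page directory entry covers 4 MiB */
#define SK_NPDE       1024u  /* entries in one page directory */

/* Base of the 4 MiB window created by pointing directory slot `slot`
 * back at the page directory itself (hypothetical helper). */
static uint32_t linear_pt_base(uint32_t slot)
{
    assert(slot < SK_NPDE);
    return slot << SK_PDR_SHIFT;
}

/* Virtual address, inside that window, of the 4-byte PTE that maps `va`. */
static uint32_t linear_pte_va(uint32_t slot, uint32_t va)
{
    return linear_pt_base(slot) + (va >> SK_PAGE_SHIFT) * 4u;
}

/* Virtual address at which the page directory itself shows up in the
 * window: the hardware walks slot -> directory -> slot -> directory. */
static uint32_t linear_pd_va(uint32_t slot)
{
    return linear_pt_base(slot) + (slot << SK_PAGE_SHIFT);
}
```

With the example slot 767, the window starts at 0xBFC00000 and the page directory appears at 0xBFEFF000. The point relevant to this thread is that the directory page is then reachable both as an L2 and, through its own entry, as if it were an L1, so the hypervisor has to account for that extra reference when it validates the page.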

On 4/15/05, Kip Macy <kip.macy@xxxxxxxxx> wrote:
> > Does this happen if you boot with 'nosmp'? I don't really believe it's a
> > race, but might be worth checking.
> 
> Yes, it still happens. I would have found it quite astonishing if it
> were a race.
> (XEN) EIP:    0808:[<fc52d5a3>]
> (gdb) x/i 0xfc52d5a3
> 0xfc52d5a3 <get_page_type+265>: mov    0x14(%eax),%eax
> (gdb) info line *0xfc52d5a3
> Line 1236 of "mm.c" starts at address 0xfc52d5a0 <get_page_type+262>
> and ends at 0xfc52d5b0 <get_page_type+278>.
> (gdb)
> 
> Lines 1236-1240 of my local mm.c:
>             while ( (y = page->u.inuse.type_info) == x )
>                 cpu_relax();
>             counter++;
>             printk("page was not validated");
>             goto again;
> 
> > Also, it's worth adding a printk into this loop just to check that that
> > is where you're getting caught.
> 
> Obviously I wasn't thinking and stuck it in the wrong place.
> Nonetheless, even without the printk I think I've proven my point.
> 
> 
> >
> >             /* Someone else is updating validation of this page. Wait... */
> >             while ( (y = page->u.inuse.type_info) == x )
> >                 cpu_relax();
> >             goto again;
> 
> Yep.
> 
> >
> > We need to figure out how the type count managed to get to one without
> > the page being validated. I presume you're doing a debug=y build of Xen?
> 
> Correct. Nothing comes out on the console apart from debug output from
> FreeBSD.
> 
> > Do you get any warnings about illegal mmu_update attempts when you boot
> > FreeBSD?
> 
> No, I don't. This is the offending code snippet from pmap_pinit:
> 
>         /* install self-referential address mapping entry(s) */
>         for (i = 0; i < NPGPTD; i++) {
>                 ma = xpmap_ptom(VM_PAGE_TO_PHYS(ptdpg[i]));
>                 pmap->pm_pdir[PTDPTDI + i] = ma | PG_V | PG_A | PG_M;
> #ifdef PAE
>                 pmap->pm_pdpt[i] = ma | PG_V;
> #endif
>                 /* re-map page directory read-only */
>                 PT_SET_MA(pmap->pm_pdir,
>                     *vtopte((vm_offset_t)pmap->pm_pdir) & ~PG_RW);
>                 xen_pgd_pin(ma);
>         }
> 
> PT_SET_MA is just a wrapper for update_va_mapping. Have there been any
> recent changes to the page typing code that would cause it to get
> confused by a self-referential mapping?
> 
>                           -Kip
>
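
To make the hang discussed in the quoted exchange concrete, the following is a simplified, single-threaded model of the wait in get_page_type. It is only a sketch: the bit positions, struct, and function name are assumptions chosen for illustration and do not reproduce Xen's actual type_info layout, and the poll is bounded so the example terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for illustration only, not Xen's real bit assignment. */
#define SK_VALIDATED  (1u << 31)         /* page contents have been validated */
#define SK_COUNT_MASK ((1u << 16) - 1u)  /* number of type references held */

struct sk_page {
    volatile uint32_t type_info;
};

/* Models "someone else is updating validation of this page, wait".
 * Returns true if the waiter can proceed, false if it is still spinning
 * after max_polls reads of type_info. */
static bool sk_waiter_makes_progress(const struct sk_page *pg,
                                     unsigned max_polls)
{
    uint32_t x = pg->type_info;

    assert(max_polls > 0);
    /* No references yet, or already validated: nothing to wait for. */
    if ((x & SK_COUNT_MASK) == 0 || (x & SK_VALIDATED))
        return true;
    /* Count is nonzero but the page is not validated: the waiter assumes
     * another CPU is mid-validation and polls until type_info changes. */
    for (unsigned n = 0; n < max_polls; n++)
        if (pg->type_info != x)
            return true;
    return false;  /* nobody changed type_info, so the waiter never leaves */
}
```

A page whose type_info holds a count of 1 without the validated bit, and with no other CPU working on it, makes this function return false for any poll budget. That is consistent with the lockup persisting under 'nosmp': if the type count reached one without validation, there is no one left to change the word the loop is watching.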

