xen-devel

Re: [Xen-devel] [RFC][PATCH] walking the page lists needs the page_alloc

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] [RFC][PATCH] walking the page lists needs the page_alloc lock
From: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Date: Thu, 12 Aug 2010 17:37:12 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <4C642AC4020000780000F8D8@xxxxxxxxxxxxxxxxxx>
References: <20100723134913.GQ13291@xxxxxxxxxxxxxxxxxxxxxxx> <4C642AC4020000780000F8D8@xxxxxxxxxxxxxxxxxx>
At 16:09 +0100 on 12 Aug (1281629364), Jan Beulich wrote:
> >>> On 23.07.10 at 15:49, Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:
> > There are a few places in Xen where we walk a domain's page lists
> > without holding the page_alloc lock.  They race with updates to the page
> > lists, which are normally rare but can be quite common under PoD when
> > the domain is close to its memory limit and the PoD reclaimer is busy.
> > This patch protects those places by taking the page_alloc lock.
> > 
> > I think this is OK for the two debug-key printouts - they don't run from
> > irq context and look deadlock-free.  The tboot change seems safe too
> 
> While the comment says the patch would leave debug key printouts
> alone, ...

Sorry, I meant that the changes to the debug-key printouts are safe,
not that the printouts don't need changing.

The debug-key printouts (in particular the NUMA one) are where I
actually hit this bug on a running system.

Tim.

> > unless tboot shutdown functions are called from irq context or with the
> > page_alloc lock held.  The p2m one is the scariest but there are already
> > code paths in PoD that take the page_alloc lock with the p2m lock held
> > so it's no worse than existing code. 
> > 
> > Signed-off-by: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
> > 
> > diff -r e8dbc1262f52 xen/arch/x86/domain.c
> > --- a/xen/arch/x86/domain.c Wed Jul 21 09:02:10 2010 +0100
> > +++ b/xen/arch/x86/domain.c Fri Jul 23 14:33:22 2010 +0100
> > @@ -139,12 +139,14 @@ void dump_pageframe_info(struct domain *
> 
> ... the actual patch still touches a respective function. It would seem
> to me that this part ought to be reverted.
> 
> >      }
> >      else
> >      {
> > +        spin_lock(&d->page_alloc_lock);
> >          page_list_for_each ( page, &d->page_list )
> >          {
> >              printk("    DomPage %p: caf=%08lx, taf=%" PRtype_info "\n",
> >                     _p(page_to_mfn(page)),
> >                     page->count_info, page->u.inuse.type_info);
> >          }
> > +        spin_unlock(&d->page_alloc_lock);
> >      }
> >  
> >      if ( is_hvm_domain(d) )
> > @@ -152,12 +154,14 @@ void dump_pageframe_info(struct domain *
> >          p2m_pod_dump_data(d);
> >      }
> >  
> > +    spin_lock(&d->page_alloc_lock);
> >      page_list_for_each ( page, &d->xenpage_list )
> >      {
> >          printk("    XenPage %p: caf=%08lx, taf=%" PRtype_info "\n",
> >                 _p(page_to_mfn(page)),
> >                 page->count_info, page->u.inuse.type_info);
> >      }
> > +    spin_unlock(&d->page_alloc_lock);
> >  }
> >  
> >  struct domain *alloc_domain_struct(void)
> 
> Sorry for not noticing this earlier.
> 
> Jan
> 
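
For anyone reading this in the archive, the pattern the patch applies can
be sketched outside Xen as below.  This is only an illustration, not Xen
code: a pthread mutex stands in for the page_alloc spinlock, and struct
page, struct domain, domain_init(), domain_add_page() and
domain_count_pages() are hypothetical names made up for the example.  The
point is that the reader holds the same lock as the writers for the whole
walk, so no entry can be linked or unlinked underneath it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a page descriptor on a domain's list. */
struct page {
    unsigned long mfn;
    struct page *next;
};

/* Hypothetical stand-in for the parts of a domain used here. */
struct domain {
    pthread_mutex_t page_alloc_lock; /* plays the role of the spinlock */
    struct page *page_list;          /* head of a singly linked list */
};

static void domain_init(struct domain *d)
{
    pthread_mutex_init(&d->page_alloc_lock, NULL);
    d->page_list = NULL;
}

/* Writer: list updates are made with the lock held. */
static void domain_add_page(struct domain *d, struct page *pg)
{
    assert(pg != NULL);
    pthread_mutex_lock(&d->page_alloc_lock);
    pg->next = d->page_list;
    d->page_list = pg;
    pthread_mutex_unlock(&d->page_alloc_lock);
}

/*
 * Reader: take the same lock around the entire traversal, so that a
 * concurrent writer cannot change the list while it is being walked.
 */
static size_t domain_count_pages(struct domain *d)
{
    size_t n = 0;

    pthread_mutex_lock(&d->page_alloc_lock);
    for (struct page *pg = d->page_list; pg != NULL; pg = pg->next)
        n++;
    pthread_mutex_unlock(&d->page_alloc_lock);

    return n;
}
```

The same caveat as in the patch description applies to the sketch: the
reader must not be entered from a context that already holds the lock, or
it will deadlock.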

-- 
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
