[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Xen pvops kernel CONFIG_HIGHPTE race/crash



On Mon, Feb 08, 2010 at 02:57:56PM +0200, Pasi Kärkkäinen wrote:
> On Mon, Feb 08, 2010 at 10:57:16AM +0200, Pasi Kärkkäinen wrote:
> > On Mon, Feb 08, 2010 at 08:50:53AM +0000, Ian Campbell wrote:
> > > On Mon, 2010-02-08 at 08:06 +0000, Pasi Kärkkäinen wrote: 
> > > > On Mon, Feb 08, 2010 at 07:47:02AM +0000, Ian Campbell wrote:
> > > > > On Mon, 2010-02-08 at 07:41 +0000, Pasi Kärkkäinen wrote: 
> > > > > > On Sun, Feb 07, 2010 at 02:22:15PM -0800, Daniel Stodden wrote:
> > > > > > > On Sun, 2010-02-07 at 16:42 -0500, Ian Campbell wrote:
> > > > > > > > On Sun, 2010-02-07 at 19:35 +0000, Pasi Kärkkäinen wrote: 
> > > > > > > > > On Wed, Jan 27, 2010 at 05:26:13PM +0000, Ian Campbell wrote:
> > > > > > > > > > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> > > > > > > > > > index 65215ab..49f8e83 100644
> > > > > > > > > > --- a/arch/x86/mm/pgtable.c
> > > > > > > > > > +++ b/arch/x86/mm/pgtable.c
> > > > > > > > > > @@ -28,7 +28,10 @@ pgtable_t pte_alloc_one(struct mm_struct 
> > > > > > > > > > *mm, unsigned long address)
> > > > > > > > > >     struct page *pte;
> > > > > > > > > >  
> > > > > > > > > >  #ifdef CONFIG_HIGHPTE
> > > > > > > > > > -   pte = alloc_pages(PGALLOC_GFP | __GFP_HIGHMEM, 0);
> > > > > > > > > > +   if (is_xen_domain())
> > > > > > > > > > +           pte = alloc_pages(PGALLOC_GFP, 0);
> > > > > > > > > > +   else
> > > > > > > > > > +           pte = alloc_pages(PGALLOC_GFP | __GFP_HIGHMEM, 
> > > > > > > > > > 0);
> > > > > > > > > >  #else
> > > > > > > > > >     pte = alloc_pages(PGALLOC_GFP, 0);
> > > > > > > > > >  #endif
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > I just tried this patch, but it fails to compile:
> > > > > > > > > 
> > > > > > > > > arch/x86/mm/pgtable.c: In function 'pte_alloc_one':
> > > > > > > > > arch/x86/mm/pgtable.c:19: error: implicit declaration of 
> > > > > > > > > function 'is_xen_domain'
> > > > > > > > > make[2]: *** [arch/x86/mm/pgtable.o] Error 1
> > > > > > > > > make[1]: *** [arch/x86/mm] Error 2
> > > > > > > > > make: *** [arch/x86] Error 2
> > > > > > > > > 
> > > > > > > > > I tried grepping around for that function but didn't find it 
> > > > > > > > > from any header..
> > > > > > > > 
> > > > > > > > IIRC on some kernels it was called just xen_domain(), I'm not 
> > > > > > > > sure but I
> > > > > > > > think my patch was against plain 2.6.32. I think the function 
> > > > > > > > (whatever
> > > > > > > > it is called) also moved around in the headers recently.
> > > > > > > 
> > > > > > > is_running_on_xen()
> > > > > > > 
> > > > > > 
> > > > > > I'm using kernel.org 2.6.32.7, and I can't find is_running_on_xen() 
> > > > > > either.
> > > > > 
> > > > > 2.6.32.7 has xen_domain() defined in include/xen/xen.h. Probably
> > > > > xen_pv_domain() is the correct check though.
> > > > > 
> > > > 
> > > > # grep xen_domain include/xen/xen.h
> > > > grep: include/xen/xen.h: No such file or directory
> > > > 
> > > > # grep xen_domain include/xen/interface/xen.h
> > > > typedef uint8_t xen_domain_handle_t[16];
> > > > 
> > > > # ls include/xen
> > > > events.h  features.h     hvc-console.h  Kbuild  xenbus.h   xen-ops.h
> > > > evtchn.h  grant_table.h  interface      page.h  xencomm.h
> > > > 
> > > > # grep xen_domain include/xen/*
> > > > #
> > > > 
> > > > # grep xen_domain include/xen/interface/*
> > > > include/xen/interface/xen.h:typedef uint8_t xen_domain_handle_t[16];
> > > > #
> > > 
> > > Ah, my 2.6.32.7 was actually 2.6.32.7 + Jeremy's next branch which has
> > > the headfile move I mentioend earlier whereas plain 2.6.32.7 does not.
> > > 
> > > # git grep  xen_domain_type  v2.6.32.7 arch/x86/xen arch/x86/include/asm
> > > v2.6.32.7:arch/x86/include/asm/xen/hypervisor.h:enum xen_domain_type {
> > > v2.6.32.7:arch/x86/include/asm/xen/hypervisor.h:extern enum 
> > > xen_domain_type xen_domain_type;
> > > v2.6.32.7:arch/x86/include/asm/xen/hypervisor.h:#define xen_domain_type   
> > >       XEN_NATIVE
> > > v2.6.32.7:arch/x86/include/asm/xen/hypervisor.h:#define xen_domain()      
> > >       (xen_domain_type != XEN_NATIVE)
> > > v2.6.32.7:arch/x86/include/asm/xen/hypervisor.h:                          
> > >        xen_domain_type == XEN_PV_DOMAIN)
> > > v2.6.32.7:arch/x86/include/asm/xen/hypervisor.h:                          
> > >        xen_domain_type == XEN_HVM_DOMAIN)
> > > v2.6.32.7:arch/x86/xen/enlighten.c:enum xen_domain_type xen_domain_type = 
> > > XEN_NATIVE;
> > > v2.6.32.7:arch/x86/xen/enlighten.c:EXPORT_SYMBOL_GPL(xen_domain_type);
> > > v2.6.32.7:arch/x86/xen/enlighten.c:     xen_domain_type = XEN_PV_DOMAIN;
> > > 
> > > so it looks like you need xen_(|pv_)domain() and asm/xen/hypervisor.h
> > > 
> > 
> > # grep xen_pv_domain arch/x86/include/asm/xen/hypervisor.h
> > #define xen_pv_domain()         (xen_domain() &&                        \
> > #define xen_initial_domain()    (xen_pv_domain() && \
> > 
> > And now it compiles.. I'm using xen_pv_domain().
> > I'll do some testing when it's built. 
> > 
> > I also noticed that the HIGHPTE race is much easier to reproduce
> > on a 2vcpu guest.. where it takes less than 30 mins to happen.
> > 
> > 1vcpu guest survived kernel compilation loop for a couple hours OK.
> > 
> 
> Ok, based on quick testing the patch seems to fix the problem.
> 
> details: 2 vcpu guest, 2 GB RAM, 32bit PAE, running 2.6.32.7 with and without 
> the patch.
> 
> without the patch: crashes in around 30 minutes.
> with the patch: survived over one hour without problems.
> 
> I'll continue the compilation loop to make sure it works for real..
> 

The patched PV guest kernel survived around 5 hours of kernel compilation loop 
without problems. I stopped the loop after 5 hours.

Booting back to the non-patched kernel and starting the kernel compilation loop
made it crash in less than 15 minutes.

So this patch definitely helps.

-- Pasi

> 
> This is the patch I used:
> 
> --- linux-2.6.32.7/arch/x86/mm/pgtable.c.orig   2010-01-29 01:06:20.000000000 
> +0200
> +++ linux-2.6.32.7/arch/x86/mm/pgtable.c        2010-02-08 10:52:36.495142772 
> +0200
> @@ -3,6 +3,7 @@
>  #include <asm/pgtable.h>
>  #include <asm/tlb.h>
>  #include <asm/fixmap.h>
> +#include <asm/xen/hypervisor.h>
> 
>  #define PGALLOC_GFP GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO
> 
> @@ -16,7 +17,10 @@
>         struct page *pte;
> 
>  #ifdef CONFIG_HIGHPTE
> -       pte = alloc_pages(PGALLOC_GFP | __GFP_HIGHMEM, 0);
> +       if (xen_pv_domain())
> +               pte = alloc_pages(PGALLOC_GFP, 0);
> +       else
> +               pte = alloc_pages(PGALLOC_GFP | __GFP_HIGHMEM, 0);
>  #else
>         pte = alloc_pages(PGALLOC_GFP, 0);
>  #endif
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.