
[Xen-devel] RE: kernel BUG at arch/x86/xen/mmu.c:1872



Hi:
 
       I have just kicked off tests with cpuidle=0 cpufreq=none.
 
       What is your Xen version?  Do you use the backend driver of 2.6.32.36?
 
       Besides the TLB BUG, I have hit at least two other issues:
       1) Xen 4.0.1 + 2.6.32.36 kernel + backend driver from 2.6.31  ==> causes "Bad grant reference" messages in the serial output
       2) Xen 4.0.1 + 2.6.32.36 kernel with its own backend driver   ==> causes disk errors like those below.
 
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
end_request: I/O error, dev tdb, sector 28699593
end_request: I/O error, dev tdb, sector 28699673
end_request: I/O error, dev tdb, sector 28699753
end_request: I/O error, dev tdb, sector 28699833
end_request: I/O error, dev tdb, sector 28699913
end_request: I/O error, dev tdb, sector 28699993
end_request: I/O error, dev tdb, sector 28700073

     
    thanks.
 
 
> Date: Mon, 11 Apr 2011 23:25:19 +0800
> Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
> From: giamteckchoon@xxxxxxxxx
> To: tinnycloud@xxxxxxxxxxx
> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; dave@xxxxxxxxxx; ian.campbell@xxxxxxxxxx; konrad.wilk@xxxxxxxxxx; jeremy@xxxxxxxx; keir@xxxxxxx
>
> 2011/4/11 MaoXiaoyun <tinnycloud@xxxxxxxxxxx>:
> > Hi:
> >
> >      I believe this patch fixes most of the problem.
> >      With this patch applied, my own test cases pass over a 30-round
> > run, where every round takes 8 hours.  Without this patch, the tests
> > fail in every round within 15 minutes.
> >
> >       So the patch really does fix most of the problem.
> >
> >       During the runs, however, I hit another crash.  From the log it
> > looks related to this BUG, since the crash log shows it is TLB related
> > and this BUG is also TLB related.
>
> Are you able to run another test with cpuidle=0 cpufreq=none in the
> kernel boot options? I am just curious whether you can reproduce the
> TLB bug when you boot with cpuidle=0 cpufreq=none... ...
>
> >
> >       Well, I also have poor knowledge of the kernel.
> >       I hope someone from Xen Devel can offer some help.
> >
> >       Many thanks.
> >
> >> Date: Mon, 11 Apr 2011 20:16:53 +0800
> >> Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
> >> From: giamteckchoon@xxxxxxxxx
> >> To: tinnycloud@xxxxxxxxxxx
> >> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; dave@xxxxxxxxxx;
> >> ian.campbell@xxxxxxxxxx; konrad.wilk@xxxxxxxxxx; jeremy@xxxxxxxx;
> >> keir@xxxxxxx
> >>
> >> >
> >> > Hi,
> >> >
> >> > Sorry, this mmu related BUG has been troubling me for a very long
> >> > time... I really want to "kill" this BUG, but my knowledge of kernel
> >> > hacking and/or Xen is very limited.
> >> >
> >> > While waiting for Jeremy or Konrad or others ...
> >> >
> >> > Many thanks for spending time to track down this mmu related BUG.  I
> >> > have backported the commit from
> >> >
> >> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=64141da587241301ce8638cc945f8b67853156ec
> >> > to the 2.6.32.36 PVOPS kernel; the patch is attached.  I do not know
> >> > whether I backported it correctly or whether it affects anything
> >> > else.  I am currently testing the 2.6.32.36 PVOPS kernel with this
> >> > patch applied and CONFIG_DEBUG_PAGEALLOC unset.  I am running
> >> > testcrash.sh loop 1000, as I was unable to reproduce mmu BUG 1872
> >> > with testcrash.sh loop 100.  Please note that when
> >> > CONFIG_DEBUG_PAGEALLOC is unset, I can easily reproduce mmu BUG 1872
> >> > within fewer than 50 testcrash.sh loop cycles with PVOPS kernels
> >> > 2.6.32.24 to 2.6.32.36.  I am now testing with this backported patch
> >> > to see whether I can still reproduce this mmu BUG... ...
> >> >
> >> > Kindest regards,
> >> > Giam Teck Choon
> >> >
> >>
> >> I have tested with my backport patch and it is working fine as I am
> >> unable to reproduce the mmu.c 1872 or 1860 bug with
> >> CONFIG_DEBUG_PAGEALLOC not set. I tested with testcrash.sh loop 100
> >> and 1000. Now doing testcrash.sh loop 10000.
> >>
> >> Xiaoyun, is it possible for you to test my patch and see whether you
> >> can reproduce the mmu.c 1872/1860 bug?
> >>
> >> Can any of you review my patch?
> >>
> >> I will post a properly formatted patch according to
> >> Documentation/SubmittingPatches in my next reply, and hopefully it
> >> can be reviewed.
> >>
> >> Thanks.
> >>
> >> Kindest regards,
> >> Giam Teck Choon
> >