
Re: [Xen-devel] 100% reliable Oops on xen 4.0.1



>>> On 14.08.12 at 17:55, Peter Moody <pmoody@xxxxxxxxxx> wrote:
> On Tue, Aug 14, 2012 at 7:47 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>> On 14.08.12 at 16:42, Peter Moody <pmoody@xxxxxxxxxx> wrote:
>>> Hi Ian, here's the trace in question. I'm perfectly happy with this
>>> not being a xen issue, if for no other reason than that it means I
>>> have one less thing to look at. The python script in question was
>>> essentially doing the same thing as crasher.c, though in the middle
>>> of other, more productive activities.
>>> ...
>>> Call Trace:
>>>  [<ffffffff81654cd1>] ? down_read+0x11/0x30
>>>  [<ffffffff811c9294>] ? ext3_xattr_get+0xf4/0x2b0
>>>  [<ffffffff811baf88>] ext3_clear_blocks+0x128/0x190
>>>  [<ffffffff811bb104>] ext3_free_data+0x114/0x160
>>>  [<ffffffff811bbc0a>] ext3_truncate+0x87a/0x950
>>>  [<ffffffff812133f5>] ? journal_start+0xb5/0x100
>>>  [<ffffffff811bc840>] ext3_evict_inode+0x180/0x1a0
>>>  [<ffffffff8114065f>] evict+0x1f/0xb0
>>>  [<ffffffff81006d52>] ? check_events+0x12/0x20
>>>  [<ffffffff81140c14>] iput+0x1a4/0x290
>>>  [<ffffffff8113ed05>] dput+0x265/0x310
>>>  [<ffffffff81132435>] path_put+0x15/0x30
>>>  [<ffffffff810a5d31>] audit_syscall_exit+0x171/0x260
>>>  [<ffffffff8103ed9a>] sysexit_audit+0x21/0x5f
>>>  [<ffffffff810065ad>] ? xen_force_evtchn_callback+0xd/0x10
>>>  [<ffffffff81006d52>] ? check_events+0x12/0x20
>>
>> This is obviously just a leftover on the stack: one can see clearly
>> that we're in the middle of a syscall, which would never have
>> xen_force_evtchn_callback that deep into the stack (i.e. at the
>> point where we just came from user mode).
> 
> Interesting, thanks. Do you have any idea why something like this
> would only be reproducible (thus far anyway, still trying to get my
> hands on some other test systems) on xen? And not just xen, but on
> this particular xen configuration (huge memory, lots of cpus, etc)? Is
> this likely a race condition with the audit subsystem or some other
> part of the kernel that this configuration somehow tickles?

From the above, as well as from your indication that the traces
are highly variable between instances, I'd suppose this is memory
corruption of some sort, which can easily be hidden by all sorts
of factors.

Until you can find a pattern, I don't think much can be done by
anyone who doesn't have an affected system available for
debugging.

Jan
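
For reference, crasher.c itself was not posted in this thread. A minimal
sketch of the kind of create/write/truncate/unlink loop Peter describes
(the file name, sizes and iteration counts below are assumptions, not
taken from the original reproducer) might look like this in C:

/* Hypothetical sketch only -- not the original crasher.c.
 * Repeatedly creates, extends, truncates and unlinks a scratch file,
 * which exercises the ext3_truncate()/evict()/iput() path seen in the
 * trace above. Runs until interrupted.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    static char buf[1 << 20];            /* 1 MiB of filler data */
    memset(buf, 0xaa, sizeof(buf));

    for (;;) {
        int fd = open("scratch.tmp", O_CREAT | O_TRUNC | O_RDWR, 0600);
        if (fd < 0) {
            perror("open");
            exit(1);
        }
        for (int i = 0; i < 64; i++) {   /* grow the file to ~64 MiB */
            if (write(fd, buf, sizeof(buf)) < 0) {
                perror("write");
                exit(1);
            }
        }
        if (ftruncate(fd, 0) < 0)        /* frees the data blocks */
            perror("ftruncate");
        close(fd);
        unlink("scratch.tmp");           /* inode eviction on final iput() */
    }
}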


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

