
Re: [Xen-devel] xen.git branch reorg / crash with 2.6.30-rc3 pv_ops dom0



Ian Campbell wrote:
On Tue, 2009-04-28 at 13:30 -0400, Jeremy Fitzhardinge wrote:
Ian Campbell wrote:
Interesting. Curiously it's working for me on my machine, but has a tendency to crash a bit later with a protection fault on an RO page during memory allocation. I wonder if it's related...
Certainly smells similar.
Yeah. The fault in both cases has an error-code of 3, so a write protect fault. It suggests a pinned page is getting freed into the general heap for some reason. Except there are no complaints from Xen about writes into a pagetable, so that makes it look like the page is being made RO but not (left) pinned.
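
For anyone reading along in the archive, here is a minimal sketch of how an x86 page-fault error code such as 3 decodes, assuming the standard architectural bit layout. The constant names and the decode_pf_error() helper are made up for illustration and are not taken from the kernel or Xen sources.

```python
# x86 page-fault error code bits (architectural layout).
# Names below are illustrative, not copied from any kernel header.
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access, 1: write access
PF_USER = 1 << 2   # 0: supervisor-mode access, 1: user-mode access
PF_RSVD = 1 << 3   # reserved bit set in a paging-structure entry
PF_INSTR = 1 << 4  # fault happened on an instruction fetch


def decode_pf_error(code):
    """Return a list of human-readable descriptions for an error code."""
    parts = []
    parts.append("protection violation" if code & PF_PROT else "page not present")
    if code & PF_INSTR:
        parts.append("instruction fetch")
    elif code & PF_WRITE:
        parts.append("write")
    else:
        parts.append("read")
    parts.append("user mode" if code & PF_USER else "supervisor mode")
    if code & PF_RSVD:
        parts.append("reserved bit set")
    return parts


print(decode_pf_error(3))
```

An error code of 3 has the present and write bits set, i.e. a write to a page that is mapped but read-only, which is what one would expect from a write landing on a page that is still RO.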

The crash Pasi and I are seeing is pretty early on though, is there any
opportunity for a page table page to have been recycled before
~kernel_physical_mapping_init()? I'd have thought not.

No, sounds unlikely.

I was wondering if perhaps e820_table_start (used by alloc_low_page) had
somehow got initialised to a bogus value such that it was pointing at
the domain builder supplied page tables (hence RO but not marked as
pinned yet).

Well, Xen would know they're pinned even if the kernel's structures don't, so one would expect to see any completely bogus writes appear on the Xen console.

There was some unification work in this area around the
beginning of March although I'm pretty sure I've had it work much more
recently. (I guess it might not have been merged into a visible branch
until more recently, it's a bit hard to tell with git but I don't think
that's what happened).

Yeah, I think it has been working properly for some time since then, though I think Pasi has been reporting problems since 32-bit dom0 first booted.

My symptoms are a bit all over the place. For a while I was just seeing writes-to-RO pages, but since I added some debugging to try and work out where that was happening, I'm now seeing more major pagetable corruption (like instruction fetches failing because of reserved bits being set in the pagetable...). So something is stomping on the pagetable, and I think it's some page being freed.

I think these are somewhat similar to Pasi's symptoms, which he said disabling HIGHPTE fixed. I see problems independent of the HIGHPTE setting.

I have the best success in causing crashes when scp'ing an 8GB file onto my XFS /home filesystem. XFS does quite a lot of vmapping, so that may exacerbate the problem.

I also realized that my Xen doesn't have debug=y set, so I'm probably missing some information.

   J

