[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] dom0 linux 3.6.0-rc4, crash due to ballooning althoug dom0_mem=X, max:X set



Wednesday, September 5, 2012, 10:19:33 PM, you wrote:

> On Wed, Sep 05, 2012 at 04:38:48PM +0200, Sander Eikelenboom wrote:
>> 
>> Wednesday, September 5, 2012, 4:06:01 PM, you wrote:
>> 
>> > On Tue, Sep 04, 2012 at 04:27:20PM -0400, Robert Phillips wrote:
>> >> Ben,
>> >> 
>> >> You have asked me to provide the rationale behind the gnttab_old_mfn 
>> >> patch, which you emailed to Sander earlier today. 
>> >> Here are my findings.
>> >> 
>> >> I found that xen_blkbk_map() in drivers/block/xen-blkback/blkback.c has 
>> >> changed from our previous version.  It now calls gnttab_map_refs() in 
>> >> drivers/xen/grant-table.c.
>> >> 
>> >> That function first calls 
>> >> HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, ... ) and then calls 
>> >> m2p_add_override() in p2m.c
>> 
>> > And HYPERVISOR_grant_table_op .. would populate map_ops[i].bus_addr with 
>> > the machine address..
>> 
>> >> which is where I made my change.
>> >> 
>> >> The unpatched code was saving the pfn's old mfn in kmap_op->dev_bus_addr. 
>> >>  
>> >> 
>> >> kmap_op is of type struct gnttab_map_grant_ref.  That data type is used 
>> >> to record grant table mappings so later they can be unmapped correctly.
>> 
>> > Right, but the blkback makes a distinction by passing NULL as kmap_op, 
>> > which means it should
>> > use the old mechanism. Meaning that once the hypercall is done, the 
>> > map_ops[i].bus_addr is not
>> > used anymore..
>> 
>> >> 
>> >> The problem with saving the old mfn in kmap_op->dev_bus_addr is that it 
>> >> is later overwritten by __gnttab_map_grant_ref() in 
>> >> xen/common/grant_table.c
>> 
>> > Uh, so the problem of saving the old mfn in dev_bus_addr has been there 
>> > for a long long time then?
>> > Even before this patch set?
>> >> 
>> >> Since the storage holding the old mfn got overwritten, the unmapping was 
>> >> being done incorrectly.  The balloon code detected that and bugged at 
>> >> drivers/xen/balloon.c:359
>> >> 
>> 
>> > Hmm, I believe the storage for holding the old mfn was/is page->index.
>> 
>> 
>> >> My patch simply adds another member called old_mfn to struct 
>> >> gnttab_map_grant_ref rather than trying to overload dev_bus_addr.
>> >> 
>> >> I don't know if Sander's bug is the same or related.  The BUG_ON at 
>> >> drivers/xen/balloon.c:359 is quite general.  It simply asserts that we 
>> >> are not trying to re-map a valid mapping.
>> 
>> > Right. Somehow he ends up with valid mappings where there should be none. 
>> > And lots of them.
>> 
>> It's something between kernel v3.4.1 and v3.5.3, haven't had time to narrow 
>> it down yet.
>> Any suggestions for specific commits i could try to quickly bisect this one ?

> These are the ones that went in:

> ea61fc0 xen/p2m: Reserve 8MB of _brk space for P2M leafs when populating back.
> b9e0d95 xen: mark local pages as FOREIGN in the m2p_override
> 6878c32 xen/blkfront: Add WARN to deal with misbehaving backends.
> 5e62625 xen/setup: filter APERFMPERF cpuid feature out
> 8c9ce60 xen/blkback: Copy id field when doing BLKIF_DISCARD.
> 58b7b53 xen/balloon: Subtract from xen_released_pages the count that is 
> populated.
> 780dbcd xen/pci: Check for PCI bridge before using it.
> 5e152e6 xen/events: Add WARN_ON when quick lookup found invalid type.
> 5842f57 xen/hvc: Check HVM_PARAM_CONSOLE_[EVTCHN|PFN] for correctness.
> a32c88b xen/hvc: Fix error cases around HVM_PARAM_CONSOLE_PFN
> 2e5ad6b xen/hvc: Collapse error logic.
> 7664810 xen: do not disable netfront in dom0
> 68c2c39 xen: do not map the same GSI twice in PVHVM guests.
> 201a52b hvc_xen: NULL dereference on allocation failure
> d79d595 xen: Add selfballoning memory reservation tunable.
> d2fb4c5 xenbus: Add support for xenbus backend in stub domain
> 2f1bd67 xen/smp: unbind irqworkX when unplugging vCPUs.
> 87e4baa x86/xen/apic: Add missing #include <xen/xen.h>
> 323f90a xen-acpi-processor: Add missing #include <xen/xen.h>
> 8605067 xen-blkfront: module exit handling adjustments
> e77c78c xen-blkfront: properly name all devices
> f62805f xen: enter/exit lazy_mmu_mode around m2p_override calls
> 211063d xen/acpi/sleep: Enable ACPI sleep via the __acpi_os_prepare_sleep
> 1ff2b0c xen: implement IRQ_WORK_VECTOR handler
> f447d56 xen: implement apic ipi interface
> 83d51ab xen/setup: update VA mapping when releasing memory during setup
> 96dc08b xen/setup: Combine the two hypercall functions - since they are quite 
> similar.
> 2e2fb75 xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to 
> E820 RAM
> ca11823 xen/setup: Only print "Freeing XXX-YYY pfn range: Z pages freed" if Z 
> > 0
> 9438ef7 x86/apic: Fix UP boot crash
> ab6ec39 xen/apic: implement io apic read with hypercall
> 27abd14 Revert "xen/x86: Workaround 'x86/ioapic: Add register level checks to 
> detect bogus io-apic entries'"
> 31b3c9d xen/x86: Implement x86_apic_ops
> 4a8e2a3 x86/apic: Replace io_apic_ops with x86_io_apic_ops.
> 977f857 PCI: move mutex locking out of pci_dev_reset function
> 569ca5b xen/gnttab: add deferred freeing logic
> 9fe2a70 debugfs: Add support to print u32 array in debugfs
> 940713b xen/p2m: An early bootup variant of set_phys_to_machine
> d509685 xen/p2m: Collapse early_alloc_p2m_middle redundant checks.
> cef4cca xen/p2m: Allow alloc_p2m_middle to call reserve_brk depending on 
> argument
> 3f3aaea xen/p2m: Move code around to allow for better re-usage.


> Narrowing this down (so ignore APIC bootup, drivers, etc) these could be it:

> b9e0d95 xen: mark local pages as FOREIGN in the m2p_override
> 58b7b53 xen/balloon: Subtract from xen_released_pages the count that is 
> populated.
> d79d595 xen: Add selfballoning memory reservation tunable.
> f62805f xen: enter/exit lazy_mmu_mode around m2p_override calls
> 83d51ab xen/setup: update VA mapping when releasing memory during setup
> 96dc08b xen/setup: Combine the two hypercall functions - since they are quite 
> similar.
> 2e2fb75 xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to 
> E820 RAM
> ca11823 xen/setup: Only print "Freeing XXX-YYY pfn range: Z pages freed" if Z 
> > 0
> 940713b xen/p2m: An early bootup variant of set_phys_to_machine
> d509685 xen/p2m: Collapse early_alloc_p2m_middle redundant checks.
> cef4cca xen/p2m: Allow alloc_p2m_middle to call reserve_brk depending on 
> argument
> 3f3aaea xen/p2m: Move code around to allow for better re-usage.

> About nine of them deal with dom0_mem=max ballooning up right, so if you
> ignore those:

> b9e0d95 xen: mark local pages as FOREIGN in the m2p_override
> d79d595 xen: Add selfballoning memory reservation tunable.
> f62805f xen: enter/exit lazy_mmu_mode around m2p_override calls

> Try reverting any of those.

Ah i missed your email since my hostingprovider was down :-(
But anyway done a git bisect in the mean time that leads to:

[f62805f1f30a40e354bd036b4cb799863a39be4b] xen: enter/exit lazy_mmu_mode around 
m2p_override calls


> And if nothing works there then we can try to revert the ones that
> deal with 'dom0_mem=max:XX'..

> I also need to be able to reproduce this. You said you can only reproduce this
> on your Intel box - is this a fast Intel machine? It also looks like you only
> have 2GB in the machine - and reserve 1GB to the dom0.

Machine is a quad core q9400 @ 2.66mhz, not very fast .. not very slow either

> If you manually (so don't start the guest), balloon down - say to 512MB and 
> then launch
> a guest do you see this problem?


Should i use

xl  mem-max domain-id mem

or

xl  mem-set domain-id mem

for that ?


Perhaps a silly question, but why is it ballooning anyway ?
I have set dom0's memory and there is enough left to create the domain ... or 
at least there should be ...

--
Sander


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.