Re: [Xen-devel] [PATCH 5/8] xen/vmap: allow vmap() to be called during early boot
On Tue, 2020-02-04 at 11:00 +0000, George Dunlap wrote:
> On Mon, Feb 3, 2020 at 4:37 PM David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
> >
> > On Mon, 2020-02-03 at 14:00 +0000, Julien Grall wrote:
> > > Hi David,
> > >
> > > On 01/02/2020 00:33, David Woodhouse wrote:
> > > > From: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > >
> > > I am a bit concerned with this change, particularly the consequence this
> > > has for the page-tables. There is an assumption that intermediate
> > > page-tables allocated via the boot allocator will never be freed.
> > >
> > > On x86, a call to vunmap() will not free page-tables, but a subsequent
> > > call to vmap() may free them depending on the mapping size. So we would
> > > call free_domheap_pages() rather than init_heap_pages().
> > >
> > > I am not entirely sure what the full consequence is, but I think this
> > > calls for investigation and a summary written down in the commit
> > > message.
> >
> > This isn't just about page tables, right? It's about *any* allocation
> > given out by the boot allocator, being freed with free_heap_pages() ?
> >
> > Given the amount of code that has conditionals in both alloc and free
> > paths along the lines of…
> >
> >   if (system_state > SYS_STATE_boot)
> >       use xenheap
> >   else
> >       use boot allocator
> >
> > … I'm not sure I'd really trust the assumption that such a thing never
> > happens; that no pages are ever allocated from the boot allocator and
> > then freed into the heap.
> >
> > In fact it does work fine except for some esoteric corner cases,
> > because init_heap_pages() is mostly just a trivial loop over
> > free_heap_pages().
> >
> > The corner cases are if you call free_heap_pages() on boot-allocated
> > memory which matches one or more of the following criteria:
> >
> >  • Includes MFN #0,
> >
> >  • Includes the first page the heap has seen on a given node, so
> >    init_node_heap() has to be called, or
> >
> >  • High-order allocations crossing from one node to another.
>
> I was asked to forward a message relating to MFN 0 and allocations
> crossing zones from a private discussion on the security list:
>
> 8<---
>
> > I am having difficulty seeing how invalidating MFN0 would solve the
> > issue here.
> > The zone number for a specific page is calculated from the most
> > significant bit position set in its MFN. As a result, each successive
> > zone contains an order of magnitude more pages. You would need to
> > invalidate the first or last MFN in each zone.
>
> Because (unless Jan and I are reading the code wrong):
>
> * Chunks can only be merged such that they end up on order-boundaries.
> * Chunks can only be merged if they are the same order.
> * Zone boundaries are on order boundaries.
>
> So say you're freeing mfn 0x100, and mfn 0xff is free. In that loop, (1
> << order) & mfn will always be 0, so it will always only look "forward"
> for things to merge, not backwards.
>
> Suppose on the other hand, that you're freeing mfn 0x101, and 0x98
> through 0x100 are free. The loop will look "backwards" and merge with
> 0x100; but then it will look "forwards" again.
>
> Now suppose you've merged 0x100-0x1ff, and the order moves up to size
> 0x100. Now the mask becomes 0x1ff; so it can't merge with 0x200-0x2ff
> (which would cross zones); instead it looks backwards to 0x0-0xff.
>
> We don't think it's possible for things to be merged across zones unless
> it can (say) start at 0xff, and merge all the way back to 0x0; which
> can't be done if 0x0 is never on the free list.
>
> That's the idea anyway. That would explain why we've never seen it on
> x86 -- due to the way the architecture is, mfn 0 is never on the free list.
>
> --->8

Thanks. I still don't really get it. What if the zone boundary is at MFN
0x300? What prevents the buddy allocator from merging a range at
0x200-0x2FF with another from 0x300-0x3FF, creating a single range
0x200-0x400 which crosses nodes?

The MFN0 trick only works if all zone boundaries must be at an address
which is 2ⁿ, doesn't it? Is that always true?
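
For anyone following along, the two cases above can be poked at with a
small stand-alone model of the merge step. This is only a sketch and not
the Xen code: zone_of(), merged_start(), merge_crosses_zone(),
merge_crosses_boundary() and only_merges_at_mfn0_cross_zones() are
hypothetical helpers, and the model assumes that a page's zone is simply
the index of the most significant set bit of its MFN and that a free
chunk of a given order only ever merges with its same-order buddy.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed model: zone = index of the most significant set bit of the MFN. */
static unsigned int zone_of(unsigned long mfn)
{
    unsigned int zone = 0;

    while (mfn >>= 1)
        zone++;
    return zone;
}

/* First MFN of the chunk produced by merging the order-'order' chunk
 * containing 'mfn' with its buddy; always aligned to 2^(order+1). */
static unsigned long merged_start(unsigned long mfn, unsigned int order)
{
    assert(order + 1 < sizeof(unsigned long) * CHAR_BIT);
    return mfn & ~((1UL << (order + 1)) - 1);
}

/* Would that merge produce a chunk spanning two MSB-derived zones? */
static bool merge_crosses_zone(unsigned long mfn, unsigned int order)
{
    unsigned long start = merged_start(mfn, order);
    unsigned long last = start + (1UL << (order + 1)) - 1;

    return zone_of(start) != zone_of(last);
}

/* Would that merge span an arbitrary boundary (e.g. a hypothetical
 * node boundary) sitting at 'boundary'? */
static bool merge_crosses_boundary(unsigned long mfn, unsigned int order,
                                   unsigned long boundary)
{
    unsigned long start = merged_start(mfn, order);
    unsigned long last = start + (1UL << (order + 1)) - 1;

    return start < boundary && boundary <= last;
}

/* Exhaustive check over small values: in this model, a merge crosses an
 * MSB-derived zone only when the merged chunk starts at MFN 0. */
static bool only_merges_at_mfn0_cross_zones(unsigned int max_order,
                                            unsigned long max_mfn)
{
    for (unsigned int order = 0; order <= max_order; order++)
        for (unsigned long mfn = 0; mfn < max_mfn; mfn += 1UL << order)
            if (merge_crosses_zone(mfn, order) &&
                merged_start(mfn, order) != 0)
                return false;
    return true;
}
```

In this model, merging 0x100-0x1ff backwards with 0x0-0xff is flagged by
merge_crosses_zone(0x100, 8), while merging 0x200-0x2ff with 0x300-0x3ff
is not. merge_crosses_boundary(0x200, 8, 0x300), on the other hand, does
flag that second merge, which is the situation the question is about: a
boundary that is not at 2ⁿ is not protected by keeping MFN 0 off the
free list.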