
Re: [Xen-devel] [PATCH 5/8] xen/vmap: allow vmap() to be called during early boot



On Mon, Feb 3, 2020 at 4:37 PM David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
>
> On Mon, 2020-02-03 at 14:00 +0000, Julien Grall wrote:
> > Hi David,
> >
> > On 01/02/2020 00:33, David Woodhouse wrote:
> > > From: David Woodhouse <dwmw@xxxxxxxxxxxx>
> >
> > I am a bit concerned with this change, particularly the consequence this
> > has for the page-tables. There is an assumption that intermediate
> > page-tables allocated via the boot allocator will never be freed.
> >
> > On x86, a call to vunmap() will not free page-tables, but a subsequent
> > call to vmap() may free them depending on the mapping size. So we would
> > call free_domheap_pages() rather than init_heap_pages().
> >
> > I am not entirely sure what the full consequence is, but I think this
> > calls for investigation and for a summary to be written down in the
> > commit message.
>
> This isn't just about page tables, right? It's about *any* allocation
> given out by the boot allocator, being freed with free_heap_pages() ?
>
> Given the amount of code that has conditionals in both alloc and free
> paths along the lines of…
>
>   if (system_state > SYS_STATE_boot)
>       use xenheap
>   else
>       use boot allocator
>
> … I'm not sure I'd really trust the assumption that such a thing never
> happens; that no pages are ever allocated from the boot allocator and
> then freed into the heap.
>
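
Concretely, one of those call sites looks something like the sketch below.
The wrapper name here is made up purely for illustration, and the exact
boot-allocator call differs between callers and Xen versions:

    /* Illustrative sketch only: the wrapper name is invented, and the
     * exact boot-allocator call varies from one call site to another. */
    static void *alloc_page_any_time(void)
    {
        if ( system_state > SYS_STATE_boot )
            return alloc_xenheap_pages(0, 0);                  /* heap allocator */
        else
            return mfn_to_virt(mfn_x(alloc_boot_pages(1, 1))); /* boot allocator */
    }
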
> In fact it does work fine except for some esoteric corner cases,
> because init_heap_pages() is mostly just a trivial loop over
> free_heap_pages().
>
> The corner cases are if you call free_heap_pages() on boot-allocated
> memory which matches one or more of the following criteria:
>
>  • Includes MFN #0,
>
>  • Includes the first page the heap has seen on a given node, so
>    init_node_heap() has to be called, or
>
>  • Is a high-order allocation that crosses from one node to another.
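
For reference, once you take those special cases out, init_heap_pages()
really is just the trivial loop over free_heap_pages() described above --
paraphrased from xen/common/page_alloc.c, not the exact code:

    /* Paraphrase of init_heap_pages() with the MFN-0 and per-node
     * initialisation special cases stripped out. */
    static void init_heap_pages(struct page_info *pg, unsigned long nr_pages)
    {
        unsigned long i;

        for ( i = 0; i < nr_pages; i++ )
            free_heap_pages(pg + i, 0, false);   /* order 0; scrub flag simplified */
    }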

I was asked to forward a message relating to MFN 0 and allocations
crossing zones from a private discussion on the security list:

8<---

> I am having difficulty seeing how invalidating MFN0 would solve the issue
> here. The zone number for a specific page is calculated from the most
> significant bit position set in its MFN. As a result, each successive zone
> contains an order of magnitude more pages. You would need to invalidate
> the first or last MFN in each zone.

Because (unless Jan and I are reading the code wrong):

* Chunks can only be merged such that they end up on order-boundaries.
* Chunks can only be merged if they are the same order.
* Zone boundaries are on order boundaries.

So say you're freeing mfn 0x100, and mfn 0xff is free.  In that loop,
(1 << order) & mfn will always be 0, so it will only ever look "forward"
for things to merge, not backwards.

Suppose on the other hand, that you're freeing mfn 0x101, and 0x98
through 0x100 are free.  The loop will look "backwards" and merge with
0x100; but then it will look "forwards" again.

Now suppose you've merged 0x100-0x1ff, and the order moves up to size
0x100.  Now the mask (1 << order) becomes 0x100, and 0x100 & 0x100 is
non-zero; so it can't merge forwards with 0x200-0x2ff (which would cross
zones); instead it looks backwards to 0x0-0xff.

We don't think it's possible for chunks to be merged across zones unless
one can (say) start at 0xff and merge all the way back to 0x0; and that
can't happen if 0x0 is never on the free list.

That's the idea anyway.  That would explain why we've never seen it on
x86 -- due to the way the architecture is, mfn 0 is never on the free list.

--->8
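
For anyone who wants to check that reasoning against the source, the merge
loop in free_heap_pages() (xen/common/page_alloc.c) is roughly the
following.  It is heavily trimmed: buddy_is_free() is just a stand-in for
the real checks on mfn_valid(), page state, chunk order and node, and the
removal of the merged buddy from its free list is omitted:

    /* Trimmed sketch of the merge loop in free_heap_pages().
     * buddy_is_free() is a stand-in for the real validity checks. */
    while ( order < MAX_ORDER )
    {
        unsigned long mask = 1UL << order;

        if ( page_to_mfn(pg) & mask )
        {
            /* Our buddy is the predecessor block: try to merge backwards. */
            if ( !buddy_is_free(pg - mask, order) )
                break;
            pg -= mask;
        }
        else
        {
            /* Our buddy is the successor block: try to merge forwards. */
            if ( !buddy_is_free(pg + mask, order) )
                break;
        }

        order++;
    }

Since page_to_zone() is essentially flsl() of the MFN, zone boundaries sit
exactly on those power-of-two boundaries, which is why the only chunk that
could ever straddle one is a chunk reaching all the way down to mfn 0.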

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

