Re: [PATCH 1/3] mm: introduce xvmalloc() et al and use for grant table allocations
On 19/11/2020 12:33, Jan Beulich wrote:
> On 19.11.2020 11:59, Julien Grall wrote:
>> On 19/11/2020 09:46, Jan Beulich wrote:
>>> On 18.11.2020 20:10, Julien Grall wrote:
>>>> On 06/11/2020 07:11, Jan Beulich wrote:
>>>>> All of the array allocations in grant_table_init() can exceed a
>>>>> page's worth of memory, which xmalloc()-based interfaces aren't
>>>>> really suitable for after boot.
>>>>
>>>> Performance is always subjective... So while I agree that having
>>>> xvmalloc() is a good move, I am not convinced by your argument
>>>> regarding boot vs runtime. I think a better argument is that the
>>>> allocation doesn't need to be physically contiguous in memory, so
>>>> better to avoid contiguity when we can.
>>>
>>> Well... I've added a sentence.
>>
>> It would be good to clarify which one you refer to, because none of
>> them is really a problem only after boot...
>
> Given the above, I'm not sure in which way you would see this be
> clarified. Any suggestions welcome.

>>>> One thing to be aware of, though, is that the xv*() functions are
>>>> going to be less efficient, because they involve touching the
>>>> page-tables (at least until the work to map the direct map on
>>>> demand is merged). In addition, on Arm they will also use only 4K
>>>> mappings (I have a TODO to fix that). So I think we will need to
>>>> be careful about when to use xmalloc() vs xvmalloc(). It might be
>>>> worth outlining that in the documentation of xv*().
>>>
>>> The rule is quite simple and the inefficiencies you mention
>>> shouldn't matter imo: post-boot there should not be any implicit
>>> allocations of non-zero order. "Implicit" here meaning to still
>>> permit e.g. direct alloc_domheap_pages() invocations, making
>>> apparent at the call site that the aspect of memory fragmentation
>>> was (hopefully) taken into consideration. I'm actually inclined to
>>> suggest (down the road) to have _xmalloc() no longer fall back to
>>> multi-page allocations post-boot, but instead return NULL.
>>
>> One advantage of xmalloc() is that it is able to allocate from a
>> suitable xenheap area. It will not touch the page-tables and is
>> therefore useful for short-lived allocations, where the overhead
>> will be more limited compared to xvmalloc().
>
> Yet it still shouldn't be used post-boot when the size may exceed
> system page size, to avoid reporting -ENOMEM or alike when really
> there's ample but fragmented memory available.

Well, we have been pretty bad at documenting code... So some of us may
have inferred some behavior from xmalloc(). This is also why I
requested to make more explicit what the 'v' means. However...

> It's certainly against the "Xen malloc/free-style interface" comment
> at the top of the header.

... if you consider it a mistake, then why did you introduce
xvmalloc() rather than modify the implementation of xmalloc()?

> It was my understanding so far that with the removal of the
> direct-map this property would go away anyway.

The direct-map is going to disappear on x86, but there is no plan for
that on Arm so far. I am not saying I don't want to remove it; I just
want to point out that the decision should not be made solely based on
what x86 is doing (see more below).

>> So I think you would want to mention that in the sentence.
>
> Well, as you've seen further down, the comment already mentions that
> aspect.

>>> Where the inefficiencies you mention would imo matter is in
>>> (future) decisions whether to use vmalloc() et al vs xvmalloc() et
>>> al: if the size _may_ be no more than a page, the latter may want
>>> preferring.
>>
>> I am not sure I understand this... Why would we want to keep
>> vmalloc() external when xvmalloc() will be calling it for any
>> allocation over PAGE_SIZE?
>
> Why force callers knowing they'll allocate more than a page to go
> through the extra layer? If that was the plan, then we wouldn't need
> a set of new functions, but could instead tweak vmalloc() etc.

Two reasons:
 1) There are too many ways to allocate memory in Xen; this adds extra
    confusion for users.
 2) The impact of the extra check is going to be insignificant
    compared to the cost of the function. Feel free to prove me
    otherwise with numbers.
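To be clear, the layering I have in mind is nothing more than a
size-based dispatch along these lines (a sketch only, with a made-up
name; the real implementation will also need to handle alignment and
the zeroing variants):

/* Sketch; _xmalloc()/vmalloc() come from xen/xmalloc.h and xen/vmap.h. */
void *xv_alloc_bytes(size_t size)
{
    /*
     * Small requests can be served from the xenheap without touching
     * the page-tables.
     */
    if ( size <= PAGE_SIZE )
        return _xmalloc(size, SMP_CACHE_BYTES);

    /*
     * Anything larger doesn't need to be physically contiguous, so
     * route it through vmalloc() and avoid fragmenting the heap.
     */
    return vmalloc(size);
}

A check this cheap is exactly why I don't think exporting vmalloc()
separately buys the caller anything measurable.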
>> I can't see a use of vm_size() in vmalloc_type(). I can only see an
>> implicit downcast.
>
> Yes, I missed that point. But I am not sure what you are trying to
> infer...

If it wasn't clear enough, I didn't suggest fixing it in this patch. I
am only pointing out that we hardened _xmalloc() and this looks like
going backwards.

> And no, there's no overflowing calculation anywhere afaics which
> would resemble the ones in xmalloc() you refer to.

"Overflow" was probably the wrong word. It would be more of a downcast
when computing the number of pages: __vmap() takes an "unsigned int",
yet the number of pages is computed using a size_t.
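To illustrate the narrowing concretely (a simplified sketch, not the
actual vmap.c code; map_pages() is a made-up stand-in for __vmap()):

/* Made-up stand-in for __vmap(), whose page count is unsigned int. */
void *map_pages(unsigned int nr_pages);

void *vmalloc_sketch(size_t size)
{
    /* Page count computed in a size_t (PFN_UP() is from xen/pfn.h)... */
    size_t pages = PFN_UP(size);

    /*
     * ... but it silently truncates to 32 bits in the call below on a
     * 64-bit build, so a huge 'size' can wrap the page count and map
     * less memory than was asked for.
     */
    return map_pages(pages);
}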
... Thankfully we only have 2 architectures to care about... Otherwise
we would never be able to change common code without bikeshedding on
micro-optimizations. But I am really surprised this is a concern to
you, when all the functions in this code will modify the page-tables.
You dismissed this overhead in the same e-mail...

> Entirely different considerations: The goal of limiting variable (and
> parameter) types to 32 bits where possible is a generic one.

At the cost of introducing multiple implicit downcasts that, one day
or another, are going to bite us.

> Which is, if for nothing else, to avoid introducing bad precedent.

I am OK with a 32-bit internal value, but please at least check that
the downcasting is going to be harmless.

Cheers,

-- 
Julien Grall
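P.S. To be concrete about the kind of check I am asking for, something
along these lines would do (again a sketch only, reusing the made-up
map_pages() stand-in from above):

    size_t pages = PFN_UP(size);

    /* Refuse the request rather than silently truncating it. */
    if ( pages > UINT_MAX )
        return NULL;

    /* The narrowing conversion is now provably harmless. */
    return map_pages((unsigned int)pages);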