Re: [PATCH v3 06/13] mm: introduce generic lazy_mmu helpers
On 23/10/2025 21:52, David Hildenbrand wrote:
> On 15.10.25 10:27, Kevin Brodsky wrote:
>> [...]
>>
>> * madvise_*_pte_range() call arch_leave() in multiple paths, some
>>   followed by an immediate exit/rescheduling and some followed by a
>>   conditional exit. These functions assume that they are called
>>   with lazy MMU disabled and we cannot simply use pause()/resume()
>>   to address that. This patch leaves the situation unchanged by
>>   calling enable()/disable() in all cases.
>
> I'm confused, the function simply does
>
> (a) enables lazy mmu
> (b) does something on the page table
> (c) disables lazy mmu
> (d) does something expensive (split folio -> take sleepable locks,
>     flushes tlb)
> (e) go to (a)

That step is conditional: we exit right away if pte_offset_map_lock()
fails. The fundamental issue is that pause() must always be matched with
resume(), but as those functions look today there is no situation where
a pause() would always be matched with a resume().

Alternatively it should be possible to pause(), unconditionally resume()
after the expensive operations are done and then leave() right away in
case of failure. It requires restructuring and might look a bit strange,
but can be done if you think it's justified.

>
> Why would we use enable/disable instead?
>
>>
>> * x86/Xen is currently the only case where explicit handling is
>>   required for lazy MMU when context-switching. This is purely an
>>   implementation detail and using the generic lazy_mmu_mode_*
>>   functions would cause trouble when nesting support is introduced,
>>   because the generic functions must be called from the current task.
>>   For that reason we still use arch_leave() and arch_enter() there.
>
> How does this interact with patch #11?

It is a requirement for patch 11, in fact. If we called disable() when
switching out a task, then lazy_mmu_state.enabled would (most likely) be
false when scheduling it again.
By calling the arch_* helpers when context-switching, we ensure
lazy_mmu_state remains unchanged. This is consistent with what happens
on all other architectures (which don't do anything about lazy_mmu when
context-switching). lazy_mmu_state is the lazy MMU status *when the task
is scheduled*, and should be preserved on a context-switch.

>
>>
>> Note: x86 calls arch_flush_lazy_mmu_mode() unconditionally in a few
>> places, but only defines it if PARAVIRT_XXL is selected, and we are
>> removing the fallback in <linux/pgtable.h>. Add a new fallback
>> definition to <asm/pgtable.h> to keep things building.
>
> I can see a call in __kernel_map_pages() and
> arch_kmap_local_post_map()/arch_kmap_local_post_unmap().
>
> I guess that is ... harmless/irrelevant in the context of this series?

It should be. arch_flush_lazy_mmu_mode() was only used by x86 before
this series; we're adding new calls to it from the generic layer, but
existing x86 calls shouldn't be affected.

- Kevin
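
To illustrate the context-switch point, here is a toy user-space model.
It is not kernel code. struct task, hw_batching and every helper name
below are made up for illustration; only the idea of a per-task
lazy_mmu_state.enabled flag and the split between generic and arch_*
helpers comes from the discussion above. It assumes the generic helpers
update the per-task flag and then call the arch hook, whereas the arch
hooks touch only the batching mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a stand-in for the per-task lazy MMU state. */
struct task {
    bool lazy_mmu_enabled;   /* plays the role of lazy_mmu_state.enabled */
};

/* Stand-in for the arch/hypervisor batching mode (hypothetical). */
static bool hw_batching;

/* Arch-level hooks: only touch the batching mode, never the task state. */
static void arch_enter(void) { hw_batching = true; }
static void arch_leave(void) { hw_batching = false; }

/* Generic helpers: update the per-task state, then call the arch hook. */
static void generic_enable(struct task *t)
{
    t->lazy_mmu_enabled = true;
    arch_enter();
}

static void generic_disable(struct task *t)
{
    t->lazy_mmu_enabled = false;
    arch_leave();
}

/* Switch-out via the generic helper: the per-task state is cleared. */
static void switch_out_generic(struct task *t)
{
    if (t->lazy_mmu_enabled)
        generic_disable(t);
}

/* Switch-out via the arch hook only: the per-task state is preserved. */
static void switch_out_arch(struct task *t)
{
    if (t->lazy_mmu_enabled)
        arch_leave();
}

/* Switch-in: re-arm batching only if the task was in lazy MMU mode. */
static void switch_in(struct task *t)
{
    if (t->lazy_mmu_enabled)
        arch_enter();
}

/*
 * Returns true if the task is back in lazy MMU mode (state and batching)
 * once it has been switched out and scheduled again.
 */
static bool simulate_reschedule(bool use_generic_on_switch_out)
{
    struct task t = { .lazy_mmu_enabled = false };

    hw_batching = false;
    generic_enable(&t);              /* task enters lazy MMU mode */

    if (use_generic_on_switch_out)
        switch_out_generic(&t);
    else
        switch_out_arch(&t);
    assert(!hw_batching);            /* nothing is batched while off-CPU */

    switch_in(&t);                   /* task is scheduled again */
    return t.lazy_mmu_enabled && hw_batching;
}
```

In this model simulate_reschedule(false) returns true, because the
arch-only switch-out leaves the flag alone and switch-in can re-arm
batching. simulate_reschedule(true) returns false, because the generic
disable() cleared the flag and the task comes back with lazy MMU off.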