Re: [PATCH v5 08/12] mm: enable lazy_mmu sections to nest
- To: Kevin Brodsky <kevin.brodsky@xxxxxxx>, linux-mm@xxxxxxxxx
- From: "David Hildenbrand (Red Hat)" <david@xxxxxxxxxx>
- Date: Mon, 24 Nov 2025 15:09:48 +0100
- Cc: linux-kernel@xxxxxxxxxxxxxxx, Alexander Gordeev <agordeev@xxxxxxxxxxxxx>, Andreas Larsson <andreas@xxxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, Catalin Marinas <catalin.marinas@xxxxxxx>, Christophe Leroy <christophe.leroy@xxxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, "David S. Miller" <davem@xxxxxxxxxxxxx>, David Woodhouse <dwmw2@xxxxxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Jann Horn <jannh@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>, Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>, Madhavan Srinivasan <maddy@xxxxxxxxxxxxx>, Michael Ellerman <mpe@xxxxxxxxxxxxxx>, Michal Hocko <mhocko@xxxxxxxx>, Mike Rapoport <rppt@xxxxxxxxxx>, Nicholas Piggin <npiggin@xxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, "Ritesh Harjani (IBM)" <ritesh.list@xxxxxxxxx>, Ryan Roberts <ryan.roberts@xxxxxxx>, Suren Baghdasaryan <surenb@xxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Venkat Rao Bagalkote <venkat88@xxxxxxxxxxxxx>, Vlastimil Babka <vbabka@xxxxxxx>, Will Deacon <will@xxxxxxxxxx>, Yeoreum Yun <yeoreum.yun@xxxxxxx>, linux-arm-kernel@xxxxxxxxxxxxxxxxxxx, linuxppc-dev@xxxxxxxxxxxxxxxx, sparclinux@xxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, x86@xxxxxxxxxx
- Delivery-date: Mon, 24 Nov 2025 14:10:08 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 11/24/25 14:22, Kevin Brodsky wrote:
Despite recent efforts to prevent lazy_mmu sections from nesting, it
remains difficult to ensure that it never occurs - and in fact it
does occur on arm64 in certain situations (CONFIG_DEBUG_PAGEALLOC).
Commit 1ef3095b1405 ("arm64/mm: Permit lazy_mmu_mode to be nested")
made nesting tolerable on arm64, but without truly supporting it:
the inner call to leave() disables the batching optimisation before
the outer section ends.
This patch actually enables lazy_mmu sections to nest by tracking
the nesting level in task_struct, in a similar fashion to e.g.
pagefault_{enable,disable}(). This is fully handled by the generic
lazy_mmu helpers that were recently introduced.
lazy_mmu sections were not initially intended to nest, so we need to
clarify the semantics w.r.t. the arch_*_lazy_mmu_mode() callbacks.
This patch takes the following approach:
* The outermost calls to lazy_mmu_mode_{enable,disable}() trigger
calls to arch_{enter,leave}_lazy_mmu_mode() - this is unchanged.
* Nested calls to lazy_mmu_mode_{enable,disable}() are not forwarded
to the arch via arch_{enter,leave} - lazy MMU remains enabled so
the assumption is that these callbacks are not relevant. However,
existing code may rely on a call to disable() to flush any batched
state, regardless of nesting. arch_flush_lazy_mmu_mode() is
therefore called in that situation.
A separate interface was recently introduced to temporarily pause
the lazy MMU mode: lazy_mmu_mode_{pause,resume}(). pause() fully
exits the mode *regardless of the nesting level*, and resume()
restores the mode at the same nesting level.
pause()/resume() are themselves allowed to nest, so we actually
store two nesting levels in task_struct: enable_count and
pause_count. A new helper in_lazy_mmu_mode() is introduced to
determine whether we are currently in lazy MMU mode; this will be
used in subsequent patches to replace the various ways architectures
currently track whether the mode is enabled.
In summary (enable/pause represent the values *after* the call):
lazy_mmu_mode_enable()            -> arch_enter()    enable=1 pause=0
    lazy_mmu_mode_enable()        -> ø               enable=2 pause=0
        lazy_mmu_mode_pause()     -> arch_leave()    enable=2 pause=1
        lazy_mmu_mode_resume()    -> arch_enter()    enable=2 pause=0
    lazy_mmu_mode_disable()       -> arch_flush()    enable=1 pause=0
lazy_mmu_mode_disable()           -> arch_leave()    enable=0 pause=0
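For readers following along, the sequence above can be replayed with a small userspace model of the two counters. This is only a sketch under stated assumptions, not the code from the patch: struct lazy_mmu_model, enum arch_call and the last_call field are hypothetical names invented here, the arch callbacks are reduced to a recorded value, and cases the description does not spell out (interrupt context, enable() while paused) are left out. The guard on enable_count in pause()/resume() is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Which arch callback (if any) the most recent generic call triggered. */
enum arch_call { ARCH_NONE, ARCH_ENTER, ARCH_LEAVE, ARCH_FLUSH };

/* Hypothetical stand-in for the two counters kept in task_struct. */
struct lazy_mmu_model {
	unsigned int enable_count;
	unsigned int pause_count;
	enum arch_call last_call;
};

/* Lazy MMU mode is active if enabled at least once and not paused. */
static bool in_lazy_mmu_mode(const struct lazy_mmu_model *s)
{
	return s->enable_count > 0 && s->pause_count == 0;
}

static void lazy_mmu_mode_enable(struct lazy_mmu_model *s)
{
	/* Only the outermost enable() is forwarded to the arch. */
	bool outermost = (s->enable_count == 0);

	s->enable_count++;
	s->last_call = outermost ? ARCH_ENTER : ARCH_NONE;
}

static void lazy_mmu_mode_disable(struct lazy_mmu_model *s)
{
	assert(s->enable_count > 0);
	s->enable_count--;
	/* Outermost disable() leaves the mode; a nested one only flushes. */
	s->last_call = (s->enable_count == 0) ? ARCH_LEAVE : ARCH_FLUSH;
}

static void lazy_mmu_mode_pause(struct lazy_mmu_model *s)
{
	/* Assumption: there is nothing to leave if the mode was never enabled. */
	bool leave = (s->pause_count == 0 && s->enable_count > 0);

	s->pause_count++;
	s->last_call = leave ? ARCH_LEAVE : ARCH_NONE;
}

static void lazy_mmu_mode_resume(struct lazy_mmu_model *s)
{
	assert(s->pause_count > 0);
	s->pause_count--;
	/* The mode is restored only once the outermost pause() is undone. */
	s->last_call = (s->pause_count == 0 && s->enable_count > 0)
			? ARCH_ENTER : ARCH_NONE;
}
```

Calling these six functions in the order listed in the summary, starting from a zeroed struct lazy_mmu_model, should reproduce the arch callback and the enable/pause values on each row, with in_lazy_mmu_mode() returning false only between pause() and resume() and after the final disable().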
Note: in_lazy_mmu_mode() is added to <linux/sched.h> to allow arch
headers included by <linux/pgtable.h> to use it.
Signed-off-by: Kevin Brodsky <kevin.brodsky@xxxxxxx>
Nothing jumped out at me, so
Acked-by: David Hildenbrand (Red Hat) <david@xxxxxxxxxx>
Hoping we can get some more eyes to have a look.
--
Cheers
David