[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 6/9] mm/memory: convert print_bad_pte() to print_bad_page_map()
- To: David Hildenbrand <david@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx
- From: Demi Marie Obenour <demiobenour@xxxxxxxxx>
- Date: Fri, 18 Jul 2025 03:59:18 -0400
- Autocrypt: addr=demiobenour@xxxxxxxxx; keydata= xsFNBFp+A0oBEADffj6anl9/BHhUSxGTICeVl2tob7hPDdhHNgPR4C8xlYt5q49yB+l2nipd aq+4Gk6FZfqC825TKl7eRpUjMriwle4r3R0ydSIGcy4M6eb0IcxmuPYfbWpr/si88QKgyGSV Z7GeNW1UnzTdhYHuFlk8dBSmB1fzhEYEk0RcJqg4AKoq6/3/UorR+FaSuVwT7rqzGrTlscnT DlPWgRzrQ3jssesI7sZLm82E3pJSgaUoCdCOlL7MMPCJwI8JpPlBedRpe9tfVyfu3euTPLPx wcV3L/cfWPGSL4PofBtB8NUU6QwYiQ9Hzx4xOyn67zW73/G0Q2vPPRst8LBDqlxLjbtx/WLR 6h3nBc3eyuZ+q62HS1pJ5EvUT1vjyJ1ySrqtUXWQ4XlZyoEFUfpJxJoN0A9HCxmHGVckzTRl 5FMWo8TCniHynNXsBtDQbabt7aNEOaAJdE7to0AH3T/Bvwzcp0ZJtBk0EM6YeMLtotUut7h2 Bkg1b//r6bTBswMBXVJ5H44Qf0+eKeUg7whSC9qpYOzzrm7+0r9F5u3qF8ZTx55TJc2g656C 9a1P1MYVysLvkLvS4H+crmxA/i08Tc1h+x9RRvqba4lSzZ6/Tmt60DPM5Sc4R0nSm9BBff0N m0bSNRS8InXdO1Aq3362QKX2NOwcL5YaStwODNyZUqF7izjK4QARAQABzTxEZW1pIE1hcmll IE9iZW5vdXIgKGxvdmVyIG9mIGNvZGluZykgPGRlbWlvYmVub3VyQGdtYWlsLmNvbT7CwXgE EwECACIFAlp+A0oCGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4BAheAAAoJELKItV//nCLBhr8Q AK/xrb4wyi71xII2hkFBpT59ObLN+32FQT7R3lbZRjVFjc6yMUjOb1H/hJVxx+yo5gsSj5LS 9AwggioUSrcUKldfA/PKKai2mzTlUDxTcF3vKx6iMXKA6AqwAw4B57ZEJoMM6egm57TV19kz PMc879NV2nc6+elaKl+/kbVeD3qvBuEwsTe2Do3HAAdrfUG/j9erwIk6gha/Hp9yZlCnPTX+ VK+xifQqt8RtMqS5R/S8z0msJMI/ajNU03kFjOpqrYziv6OZLJ5cuKb3bZU5aoaRQRDzkFIR 6aqtFLTohTo20QywXwRa39uFaOT/0YMpNyel0kdOszFOykTEGI2u+kja35g9TkH90kkBTG+a EWttIht0Hy6YFmwjcAxisSakBuHnHuMSOiyRQLu43ej2+mDWgItLZ48Mu0C3IG1seeQDjEYP tqvyZ6bGkf2Vj+L6wLoLLIhRZxQOedqArIk/Sb2SzQYuxN44IDRt+3ZcDqsPppoKcxSyd1Ny 2tpvjYJXlfKmOYLhTWs8nwlAlSHX/c/jz/ywwf7eSvGknToo1Y0VpRtoxMaKW1nvH0OeCSVJ itfRP7YbiRVc2aNqWPCSgtqHAuVraBRbAFLKh9d2rKFB3BmynTUpc1BQLJP8+D5oNyb8Ts4x Xd3iV/uD8JLGJfYZIR7oGWFLP4uZ3tkneDfYzsFNBFp+A0oBEAC9ynZI9LU+uJkMeEJeJyQ/ 8VFkCJQPQZEsIGzOTlPnwvVna0AS86n2Z+rK7R/usYs5iJCZ55/JISWd8xD57ue0eB47bcJv VqGlObI2DEG8TwaW0O0duRhDgzMEL4t1KdRAepIESBEA/iPpI4gfUbVEIEQuqdqQyO4GAe+M kD0Hy5JH/0qgFmbaSegNTdQg5iqYjRZ3ttiswalql1/iSyv1WYeC1OAs+2BLOAT2NEggSiVO txEfgewsQtCWi8H1SoirakIfo45Hz0tk/Ad9ZWh2PvOGt97Ka85o4TLJxgJJqGEnqcFUZnJJ riwoaRIS8N2C8/nEM53jb1sH0gYddMU3QxY7dYNLIUrRKQeNkF30dK7V6JRH7pleRlf+wQcN fRAIUrNlatj9TxwivQrKnC9aIFFHEy/0mAgtrQShcMRmMgVlRoOA5B8RTulRLCmkafvwuhs6 dCxN0GNAORIVVFxjx9Vn7OqYPgwiofZ6SbEl0hgPyWBQvE85klFLZLoj7p+joDY1XNQztmfA rnJ9x+YV4igjWImINAZSlmEcYtd+xy3Li/8oeYDAqrsnrOjb+WvGhCykJk4urBog2LNtcyCj kTs7F+WeXGUo0NDhbd3Z6AyFfqeF7uJ3D5hlpX2nI9no/ugPrrTVoVZAgrrnNz0iZG2DVx46 x913pVKHl5mlYQARAQABwsFfBBgBAgAJBQJafgNKAhsMAAoJELKItV//nCLBwNIP/AiIHE8b oIqReFQyaMzxq6lE4YZCZNj65B/nkDOvodSiwfwjjVVE2V3iEzxMHbgyTCGA67+Bo/d5aQGj gn0TPtsGzelyQHipaUzEyrsceUGWYoKXYyVWKEfyh0cDfnd9diAm3VeNqchtcMpoehETH8fr RHnJdBcjf112PzQSdKC6kqU0Q196c4Vp5HDOQfNiDnTf7gZSj0BraHOByy9LEDCLhQiCmr+2 E0rW4tBtDAn2HkT9uf32ZGqJCn1O+2uVfFhGu6vPE5qkqrbSE8TG+03H8ecU2q50zgHWPdHM OBvy3EhzfAh2VmOSTcRK+tSUe/u3wdLRDPwv/DTzGI36Kgky9MsDC5gpIwNbOJP2G/q1wT1o Gkw4IXfWv2ufWiXqJ+k7HEi2N1sree7Dy9KBCqb+ca1vFhYPDJfhP75I/VnzHVssZ/rYZ9+5 1yDoUABoNdJNSGUYl+Yh9Pw9pE3Kt4EFzUlFZWbE4xKL/NPno+z4J9aWemLLszcYz/u3XnbO vUSQHSrmfOzX3cV4yfmjM5lewgSstoxGyTx2M8enslgdXhPthZlDnTnOT+C+OTsh8+m5tos8 HQjaPM01MKBiAqdPgksm1wu2DrrwUi6ChRVTUBcj6+/9IJ81H2P2gJk3Ls3AVIxIffLoY34E +MYSfkEjBz0E8CLOcAw7JIwAaeBT
- Cc: linux-mm@xxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, nvdimm@xxxxxxxxxxxxxxx, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>, Dan Williams <dan.j.williams@xxxxxxxxx>, Matthew Wilcox <willy@xxxxxxxxxxxxx>, Jan Kara <jack@xxxxxxx>, Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>, Christian Brauner <brauner@xxxxxxxxxx>, Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>, "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>, Vlastimil Babka <vbabka@xxxxxxx>, Mike Rapoport <rppt@xxxxxxxxxx>, Suren Baghdasaryan <surenb@xxxxxxxxxx>, Michal Hocko <mhocko@xxxxxxxx>, Zi Yan <ziy@xxxxxxxxxx>, Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>, Nico Pache <npache@xxxxxxxxxx>, Ryan Roberts <ryan.roberts@xxxxxxx>, Dev Jain <dev.jain@xxxxxxx>, Barry Song <baohua@xxxxxxxxxx>, Jann Horn <jannh@xxxxxxxxxx>, Pedro Falcato <pfalcato@xxxxxxx>, Hugh Dickins <hughd@xxxxxxxxxx>, Oscar Salvador <osalvador@xxxxxxx>, Lance Yang <lance.yang@xxxxxxxxx>
- Delivery-date: Fri, 18 Jul 2025 07:59:42 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 7/18/25 03:44, David Hildenbrand wrote:
> On 18.07.25 00:06, Demi Marie Obenour wrote:
>> On 7/17/25 07:52, David Hildenbrand wrote:
>>> print_bad_pte() looks like something that should actually be a WARN
>>> or similar, but historically it apparently has proven to be useful to
>>> detect corruption of page tables even on production systems -- report
>>> the issue and keep the system running to make it easier to actually detect
>>> what is going wrong (e.g., multiple such messages might shed a light).
>>>
>>> As we want to unify vm_normal_page_*() handling for PTE/PMD/PUD, we'll have
>>> to take care of print_bad_pte() as well.
>>>
>>> Let's prepare for using print_bad_pte() also for non-PTEs by adjusting the
>>> implementation and renaming the function -- we'll rename it to what
>>> we actually print: bad (page) mappings. Maybe it should be called
>>> "print_bad_table_entry()"? We'll just call it "print_bad_page_map()"
>>> because the assumption is that we are dealing with some (previously)
>>> present page table entry that got corrupted in weird ways.
>>>
>>> Whether it is a PTE or something else will usually become obvious from the
>>> page table dump or from the dumped stack. If ever required in the future,
>>> we could pass the entry level type similar to "enum rmap_level". For now,
>>> let's keep it simple.
>>>
>>> To make the function a bit more readable, factor out the ratelimit check
>>> into is_bad_page_map_ratelimited() and place the dumping of page
>>> table content into __dump_bad_page_map_pgtable(). We'll now dump
>>> information from each level in a single line, and just stop the table
>>> walk once we hit something that is not a present page table.
>>>
>>> Use print_bad_page_map() in vm_normal_page_pmd() similar to how we do it
>>> for vm_normal_page(), now that we have a function that can handle it.
>>>
>>> The report will now look something like (dumping pgd to pmd values):
>>>
>>> [ 77.943408] BUG: Bad page map in process XXX entry:80000001233f5867
>>> [ 77.944077] addr:00007fd84bb1c000 vm_flags:08100071 anon_vma: ...
>>> [ 77.945186] pgd:10a89f067 p4d:10a89f067 pud:10e5a2067 pmd:105327067
>>>
>>> Not using pgdp_get(), because that does not work properly on some arm
>>> configs where pgd_t is an array. Note that we are dumping all levels
>>> even when levels are folded for simplicity.
>>>
>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>>
>> Should this still use a WARN? If the admin sets panic-on-warn they
>> have asked for "crash if anything goes wrong" and so that is what
>> they should get. Otherwise the system will still stay up.
>
> I assume you're comment is in context of the other proposal regarding
> panicking.
Which one? I'm only subscribed to xen-devel and might have missed it.
> It's a good question whether we should WARN: likely we should convert
> the "BUG:" ... message into a WARN. On panic-on-warn you'd panic
> immediately without being able to observe any other such messages (and
> as discussed in the RFC, apparently that can be valuable for debugging,
> because a single such report is often insufficient)
>
> But as panic-on-warn is "panic on the first sight of a problem", that
> sounds right.
There are other advantages to WARN() as well:
- Automated tools (like syzkaller) know how to deal with it (though they
probably know how to deal with this too).
- It counts against kernel.warn_limit, so an administrator can choose
to have the system panic if there are a huge number of warnings.
(My completely uninformed guess is that "other such messages" would
usually be less than 100, and that once one has gotten into the
quadruple digits, the kernel is probably hopelessly confused.)
- It shows up in /sys/kernel/warn_count.
> That change should not be part of this patch, though.
Fair.
--
Sincerely,
Demi Marie Obenour (she/her/hers) Attachment:
OpenPGP_0xB288B55FFF9C22C1.asc
Description: OpenPGP public key
Attachment:
OpenPGP_signature.asc
Description: OpenPGP digital signature
|