Re: [PATCH v3 06/20] xen/riscv: add root page table allocation
On 8/5/25 12:37 PM, Jan Beulich wrote:
> On 31.07.2025 17:58, Oleksii Kurochko wrote:
>> Introduce support for allocating and initializing the root page table
>> required for RISC-V stage-2 address translation.
>>
>> To implement root page table allocation the following is introduced:
>> - p2m_get_clean_page() and p2m_alloc_root_table(), p2m_allocate_root()
>>   helpers to allocate and zero a 16 KiB root page table, as mandated by
>>   the RISC-V privileged specification for Sv32x4/Sv39x4/Sv48x4/Sv57x4
>>   modes.
>> - Update p2m_init() to initialize p2m_root_order.
>> - Add maddr_to_page() and page_to_maddr() macros for easier address
>>   manipulation.
>> - Introduce paging_ret_pages_to_domheap() to return some pages before
>>   allocate 16 KiB pages for root page table.
>> - Allocate root p2m table after p2m pool is initialized.
>> - Add construct_hgatp() to construct the hgatp register value based on
>>   p2m->root, p2m->hgatp_mode and VMID.
>
> Imo for this to be complete, freeing of the root table also wants taking
> care of. Much like imo p2m_init() would better immediately be accompanied
> by the respective teardown function. Once you start using them, you want
> to use them in pairs, after all.

I decided to leave out freeing of the root table and tearing down of the
p2m mappings because they are only needed for domain destruction, which
isn't supported at the moment, so their implementation can be deferred
until they are actually used.
>> --- a/xen/arch/riscv/include/asm/riscv_encoding.h
>> +++ b/xen/arch/riscv/include/asm/riscv_encoding.h
>> @@ -133,11 +133,13 @@
>>  #define HGATP_MODE_SV48X4 _UL(9)
>>
>>  #define HGATP32_MODE_SHIFT 31
>> +#define HGATP32_MODE_MASK _UL(0x80000000)
>>  #define HGATP32_VMID_SHIFT 22
>>  #define HGATP32_VMID_MASK _UL(0x1FC00000)
>>  #define HGATP32_PPN _UL(0x003FFFFF)
>>
>>  #define HGATP64_MODE_SHIFT 60
>> +#define HGATP64_MODE_MASK _ULL(0xF000000000000000)
>>  #define HGATP64_VMID_SHIFT 44
>>  #define HGATP64_VMID_MASK _ULL(0x03FFF00000000000)
>>  #define HGATP64_PPN _ULL(0x00000FFFFFFFFFFF)
>> @@ -170,6 +172,7 @@
>>  #define HGATP_VMID_SHIFT HGATP64_VMID_SHIFT
>>  #define HGATP_VMID_MASK HGATP64_VMID_MASK
>>  #define HGATP_MODE_SHIFT HGATP64_MODE_SHIFT
>> +#define HGATP_MODE_MASK HGATP64_MODE_MASK
>>  #else
>>  #define MSTATUS_SD MSTATUS32_SD
>>  #define SSTATUS_SD SSTATUS32_SD
>> @@ -181,8 +184,11 @@
>>  #define HGATP_VMID_SHIFT HGATP32_VMID_SHIFT
>>  #define HGATP_VMID_MASK HGATP32_VMID_MASK
>>  #define HGATP_MODE_SHIFT HGATP32_MODE_SHIFT
>> +#define HGATP_MODE_MASK HGATP32_MODE_MASK
>>  #endif
>>
>> +#define GUEST_ROOT_PAGE_TABLE_SIZE KB(16)
>
> In another context I already mentioned that imo you want to be careful
> with the use of "guest" in identifiers. It's not the guest page tables
> which have an order-2 root table, but the P2M (Xen terminology) or
> G-stage / second stage (RISC-V spec terminology) ones. As long as you're
> only doing P2M work, this may not look significant. But once you actually
> start dealing with guest page tables, it easily can end up confusing.

I took GUEST_ROOT_PAGE_TABLE to mean the G-stage root page table. But
since it is confusing even now, I'll use GSTAGE_ROOT_PAGE_TABLE_SIZE
instead.
>> --- a/xen/arch/riscv/p2m.c
>> +++ b/xen/arch/riscv/p2m.c
>> @@ -1,8 +1,86 @@
>> +#include <xen/domain_page.h>
>>  #include <xen/mm.h>
>>  #include <xen/rwlock.h>
>>  #include <xen/sched.h>
>>
>>  #include <asm/paging.h>
>> +#include <asm/p2m.h>
>> +#include <asm/riscv_encoding.h>
>> +
>> +unsigned int __read_mostly p2m_root_order;
>
> If this is to be a variable at all, it ought to be __ro_after_init, and
> hence it shouldn't be written every time p2m_init() is run. If you want
> it to remain as a variable, what's wrong with
>
> const unsigned int p2m_root_order = ilog2(GUEST_ROOT_PAGE_TABLE_SIZE) - PAGE_SHIFT;
>
> or some such? But of course equally well you could have
>
> #define P2M_ROOT_ORDER (ilog2(GUEST_ROOT_PAGE_TABLE_SIZE) - PAGE_SHIFT)

The only reason p2m_root_order was introduced as a variable is that I had
a compilation issue when defining P2M_ROOT_ORDER this way:

  #define P2M_ROOT_ORDER get_order_from_bytes(GUEST_ROOT_PAGE_TABLE_SIZE)

But I can't reproduce it anymore. Anyway, your option is better as it
should be faster.

>> +static void clear_and_clean_page(struct page_info *page)
>> +{
>> +    clear_domain_page(page_to_mfn(page));
>> +
>> +    /*
>> +     * If the IOMMU doesn't support coherent walks and the p2m tables are
>> +     * shared between the CPU and IOMMU, it is necessary to clean the
>> +     * d-cache.
>> +     */
>
> That is, ...
>
>> +    clean_dcache_va_range(page, PAGE_SIZE);
>
> ... this call really wants to be conditional?

It makes sense. I will add "if ( p2m->clean_pte )" and update the
clear_and_clean_page() declaration.

>> +}
>> +
>> +static struct page_info *p2m_allocate_root(struct domain *d)
>
> With there also being p2m_alloc_root_table() and with that being the sole
> caller of the function here, I wonder: Is having this in a separate
> function really outweighing the possible confusion of which of the two
> functions to use?

p2m_allocate_root() will also be used in further patches to allocate the
root's metadata page(s), though again from within the same function,
p2m_alloc_root_table().
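As a quick check of the order arithmetic in the P2M_ROOT_ORDER suggestion above, here is a minimal standalone sketch. It is not Xen code: ILOG2_POW2() is a hypothetical stand-in for ilog2() built on a GCC/Clang builtin, PAGE_SHIFT is assumed to be 12 (4 KiB pages), and check_root_order() exists only for illustration.

```c
#include <assert.h>

#define KB(x)      ((x) * 1024UL)
#define PAGE_SHIFT 12                  /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical stand-in for ilog2(); only valid for powers of two. */
#define ILOG2_POW2(x) ((unsigned int)__builtin_ctzl(x))

#define GSTAGE_ROOT_PAGE_TABLE_SIZE KB(16)
#define P2M_ROOT_ORDER (ILOG2_POW2(GSTAGE_ROOT_PAGE_TABLE_SIZE) - PAGE_SHIFT)
#define P2M_ROOT_PAGES (1U << P2M_ROOT_ORDER)

/* Illustration only: the order must cover the table size exactly. */
static void check_root_order(void)
{
    assert((PAGE_SIZE << P2M_ROOT_ORDER) == GSTAGE_ROOT_PAGE_TABLE_SIZE);
}
```

With these assumptions, log2(16 KiB) is 14, so the order is 14 - 12 = 2 and the root table spans four contiguous 4 KiB pages.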
Probably, to avoid confusion it makes sense to rename p2m_allocate_root()
to p2m_allocate_root_page().

>> +{
>> +    struct page_info *page;
>> +
>> +    /*
>> +     * As mentioned in the Privileged Architecture Spec (version 20240411)
>> +     * in Section 18.5.1, for the paged virtual-memory schemes (Sv32x4,
>> +     * Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB and must
>> +     * be aligned to a 16-KiB boundary.
>> +     */
>> +    page = alloc_domheap_pages(d, P2M_ROOT_ORDER, MEMF_no_owner);
>> +    if ( !page )
>> +        return NULL;
>> +
>> +    for ( unsigned int i = 0; i < P2M_ROOT_PAGES; i++ )
>> +        clear_and_clean_page(page + i);
>> +
>> +    return page;
>> +}
>> +
>> +unsigned long construct_hgatp(struct p2m_domain *p2m, uint16_t vmid)
>> +{
>> +    unsigned long ppn;
>> +
>> +    ppn = PFN_DOWN(page_to_maddr(p2m->root)) & HGATP_PPN;
>
> Why not page_to_pfn() or mfn_x(page_to_mfn())? I.e. why mix different
> groups of accessors?

No specific reason, I just missed that option.

> As to "& HGATP_PPN" - that's making an assumption that you could avoid by
> using ...
>
>> +    /* TODO: add detection of hgatp_mode instead of hard-coding it. */
>> +#if RV_STAGE1_MODE == SATP_MODE_SV39
>> +    p2m->hgatp_mode = HGATP_MODE_SV39X4;
>> +#elif RV_STAGE1_MODE == SATP_MODE_SV48
>> +    p2m->hgatp_mode = HGATP_MODE_SV48X4;
>> +#else
>> +# error "add HGATP_MODE"
>> +#endif
>> +
>> +    return ppn | MASK_INSR(p2m->hgatp_mode, HGATP_MODE_MASK) |
>> +           MASK_INSR(vmid, HGATP_VMID_MASK);
>
> ... MASK_INSR() also on "ppn".
>
> As to the writing of p2m->hgatp_mode - you don't want to do this here,
> when this is the function to calculate the value to put into hgatp. This
> field needs calculating only once, perhaps in p2m_init().

Agreed, it makes sense to move hgatp_mode detection to p2m_init().

>> +static int p2m_alloc_root_table(struct p2m_domain *p2m)
>> +{
>> +    struct domain *d = p2m->domain;
>> +    struct page_info *page;
>> +    const unsigned int nr_root_pages = P2M_ROOT_PAGES;
>
> Is this local variable really of any use?

It will be needed for one of the next patches, and to keep the change in
that later patch smaller I've decided to introduce it here.
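To illustrate the suggestion of using MASK_INSR() on all three hgatp fields, here is a minimal standalone sketch for RV64. It is not the patch's construct_hgatp(): make_hgatp() is a hypothetical helper that takes the root frame number and mode as plain arguments, the mask values are copied from the quoted riscv_encoding.h hunk, the Sv39x4 mode value of 8 is taken from the RISC-V privileged specification, and MASK_INSR() is written out the way I understand Xen defines it.

```c
#include <assert.h>
#include <stdint.h>

/* RV64 hgatp field masks, copied from the quoted riscv_encoding.h hunk. */
#define HGATP64_MODE_MASK 0xF000000000000000ULL
#define HGATP64_VMID_MASK 0x03FFF00000000000ULL
#define HGATP64_PPN       0x00000FFFFFFFFFFFULL
#define HGATP_MODE_SV39X4 8ULL

/* Move v up to the lowest set bit of mask m, then clip it to m. */
#define MASK_INSR(v, m) (((v) * ((m) & -(m))) & (m))

/* Hypothetical helper: compose an hgatp value from its three fields. */
static uint64_t make_hgatp(uint64_t root_mfn, uint64_t mode, uint16_t vmid)
{
    /* A 16 KiB aligned root table has the low two frame-number bits clear. */
    assert((root_mfn & 3) == 0);

    return MASK_INSR(root_mfn, HGATP64_PPN) |
           MASK_INSR(mode, HGATP64_MODE_MASK) |
           MASK_INSR((uint64_t)vmid, HGATP64_VMID_MASK);
}
```

Treating every field the same way means no field relies on knowing that its mask happens to start at bit 0. Note that values wider than a field are silently clipped by the mask, so range checks, if wanted, have to happen before this point.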
>> +    /*
>> +     * Return back nr_root_pages to assure the root table memory is also
>> +     * accounted against the P2M pool of the domain.
>> +     */
>> +    if ( !paging_ret_pages_to_domheap(d, nr_root_pages) )
>> +        return -ENOMEM;
>> +
>> +    page = p2m_allocate_root(d);
>> +    if ( !page )
>> +        return -ENOMEM;
>
> Hmm, and the pool is then left shrunk by 4 pages?

Yes: while they are used for the root table they shouldn't be in the P2M
pool (freelist); once the root table is freed, it makes sense to return
them. Am I missing something? Or did you mean that p2m->pages needs to be
updated?

>> --- a/xen/arch/riscv/paging.c
>> +++ b/xen/arch/riscv/paging.c
>> @@ -54,6 +54,36 @@ int paging_freelist_init(struct domain *d, unsigned long pages,
>>      return 0;
>>  }
>> +
>> +bool paging_ret_pages_to_domheap(struct domain *d, unsigned int nr_pages)
>> +{
>> +    struct page_info *page;
>> +
>> +    ASSERT(spin_is_locked(&d->arch.paging.lock));
>> +
>> +    if ( ACCESS_ONCE(d->arch.paging.total_pages) < nr_pages )
>> +        return false;
>> +
>> +    for ( unsigned int i = 0; i < nr_pages; i++ )
>> +    {
>> +        /* Return memory to domheap. */
>> +        page = page_list_remove_head(&d->arch.paging.freelist);
>> +        if( page )
>> +        {
>> +            ACCESS_ONCE(d->arch.paging.total_pages)--;
>> +            free_domheap_page(page);
>> +        }
>> +        else
>> +        {
>> +            printk(XENLOG_ERR
>> +                   "Failed to free P2M pages, P2M freelist is empty.\n");
>> +            return false;
>
> Looks pretty redundant with half of paging_freelist_init(), including the
> stray full stop in the log message.

I will then introduce a separate function (for the code inside the
for-loop) and use it both here and in paging_freelist_init().

Thanks.

~ Oleksii
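The factoring proposed in the last reply could look roughly like the toy model below. This is only a sketch under heavy assumptions: struct page, struct pool, pool_return_one_page() and pool_return_pages() are hypothetical stand-ins for Xen's page_info list, the per-domain paging state and the real helpers, and the `returned` counter stands in for free_domheap_page(). Locking and logging are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for Xen's page list and paging state (hypothetical). */
struct page { struct page *next; };

struct pool {
    struct page *freelist;
    unsigned long total_pages;
    unsigned long returned;   /* pages handed back to the "heap" */
};

/* Pop one page off the freelist and hand it back; false if it is empty. */
static bool pool_return_one_page(struct pool *p)
{
    struct page *pg = p->freelist;

    if ( !pg )
        return false;

    /* The counter and the list are expected to stay in sync. */
    assert(p->total_pages > 0);

    p->freelist = pg->next;
    p->total_pages--;
    p->returned++;            /* stands in for free_domheap_page(pg) */

    return true;
}

/* Return nr pages, refusing up front if the pool is too small. */
static bool pool_return_pages(struct pool *p, unsigned int nr)
{
    if ( p->total_pages < nr )
        return false;

    for ( unsigned int i = 0; i < nr; i++ )
        if ( !pool_return_one_page(p) )
            return false;

    return true;
}
```

The single-page helper is the part that both callers would share; the up-front size check stays in the multi-page wrapper so that a too-small pool is rejected before anything is removed from the freelist.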