Re: [Xen-devel] [RFC 12/22] xen/arm: p2m: Introduce p2m_get_entry and use it to implement __p2m_lookup
On Wed, 31 Aug 2016, Julien Grall wrote:
> > > @@ -236,28 +238,99 @@ static lpae_t *p2m_get_root_pointer(struct p2m_domain *p2m,
> > >
> > > /*
> > > * Lookup the MFN corresponding to a domain's GFN.
> > > + * Lookup mem access in the radix tree.
> > > + * The entry associated with the GFN is considered valid.
> > > + */
> > > +static p2m_access_t p2m_mem_access_radix_get(struct p2m_domain *p2m,
> > > + gfn_t gfn)
> > > +{
> > > + void *ptr;
> > > +
> > > + if ( !p2m->mem_access_enabled )
> > > + return p2m_access_rwx;
> >
> > Shouldn't this be p2m->default_access?
>
> default_access will always be p2m_access_rwx when memaccess is disabled. It
> will lead to a crash if you try to restrict permissions without memaccess.
>
> Note that this matches the behavior of __p2m_get_mem_access.
I would like to avoid having some places use p2m_access_rwx and others
use p2m->default_access. But it is true that in this context both are
fine. This was just a suggestion.
> > > + ptr = radix_tree_lookup(&p2m->mem_access_settings, gfn_x(gfn));
> > > + if ( !ptr )
> > > + return p2m_access_rwx;
> >
> > Same here?
>
> The radix tree will contain all the permission restrictions except
> p2m_access_rwx. This is because you may change the default_access whilst
> memaccess is enabled, and you don't know which pages were restricted with the
> default access.
>
> Note that this matches the behavior of p2m_mem_access_radix_set.
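For reference, the matching set side implied by that description would be
roughly as follows. This is my sketch, not the actual patch; the error
handling in particular is guessed:

    static int p2m_mem_access_radix_set(struct p2m_domain *p2m, gfn_t gfn,
                                        p2m_access_t a)
    {
        int rc;

        if ( !p2m->mem_access_enabled )
            return 0;

        /*
         * p2m_access_rwx is never stored, so removing a restriction
         * means removing the radix tree entry altogether.
         */
        if ( a == p2m_access_rwx )
        {
            radix_tree_delete(&p2m->mem_access_settings, gfn_x(gfn));
            return 0;
        }

        rc = radix_tree_insert(&p2m->mem_access_settings, gfn_x(gfn),
                               radix_tree_int_to_ptr(a));
        if ( rc == -EEXIST )
        {
            /* A setting already exists: replace it in place. */
            radix_tree_replace_slot(
                radix_tree_lookup_slot(&p2m->mem_access_settings,
                                       gfn_x(gfn)),
                radix_tree_int_to_ptr(a));
            rc = 0;
        }

        return rc;
    }

That makes the pairing with p2m_mem_access_radix_get clear: a lookup miss can
only mean rwx.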
>
> >
> >
> > > + else
> > > + return radix_tree_ptr_to_int(ptr);
> > > +}
> > > +
> > > +#define GUEST_TABLE_MAP_FAILED 0
> > > +#define GUEST_TABLE_SUPER_PAGE 1
> > > +#define GUEST_TABLE_NORMAL_PAGE 2
> > > +
> > > +static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry,
> > > + int level_shift);
> > > +
> > > +/*
> > > + * Take the currently mapped table, find the corresponding GFN entry,
> > > + * and map the next table, if available.
> >
> > It is important to write down that the function also unmaps the previous
> > table.
>
> Will do.
>
> >
> > > *
> > > - * There are no processor functions to do a stage 2 only lookup therefore we
> > > - * do a a software walk.
> > > + * Return values:
> > > + * GUEST_TABLE_MAP_FAILED: Either read_only was set and the entry
> > > + * was empty, or allocating a new page failed.
> > > + * GUEST_TABLE_NORMAL_PAGE: next level mapped normally
> > > + * GUEST_TABLE_SUPER_PAGE: The next entry points to a superpage.
> > > */
> > > -static mfn_t __p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t)
> > > +static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
> > > + lpae_t **table, unsigned int offset)
> > > {
> > > - struct p2m_domain *p2m = &d->arch.p2m;
> > > - const paddr_t paddr = pfn_to_paddr(gfn_x(gfn));
> > > - const unsigned int offsets[4] = {
> > > - zeroeth_table_offset(paddr),
> > > - first_table_offset(paddr),
> > > - second_table_offset(paddr),
> > > - third_table_offset(paddr)
> > > - };
> > > - const paddr_t masks[4] = {
> > > - ZEROETH_MASK, FIRST_MASK, SECOND_MASK, THIRD_MASK
> > > - };
> > > - lpae_t pte, *map;
> > > + lpae_t *entry;
> > > + int ret;
> > > + mfn_t mfn;
> > > +
> > > + entry = *table + offset;
> > > +
> > > + if ( !p2m_valid(*entry) )
> > > + {
> > > + if ( read_only )
> > > + return GUEST_TABLE_MAP_FAILED;
> > > +
> > > + ret = p2m_create_table(p2m, entry, /* not used */ ~0);
> > > + if ( ret )
> > > + return GUEST_TABLE_MAP_FAILED;
> > > + }
> > > +
> > > + /* The function p2m_next_level is never called at the 3rd level */
> > > + if ( p2m_mapping(*entry) )
> > > + return GUEST_TABLE_SUPER_PAGE;
> > > +
> > > + mfn = _mfn(entry->p2m.base);
> > > +
> > > + unmap_domain_page(*table);
> > > + *table = map_domain_page(mfn);
> > > +
> > > + return GUEST_TABLE_NORMAL_PAGE;
> >
> > I am a bit worried about having the same function doing the lookup and
> > creating new tables, especially given it doesn't tell you whether the
> > entry was already there or it was created: the return value is the same
> > in both cases. At the very least the return values should be different.
>
> I don't understand your worry here. Why would you care whether the table has
> been allocated or already existed?
>
> If the caller does not want to allocate a table, it can request to browse the
> entries in read-only mode (see read_only).
To avoid unintentional side effects, such as calling this function for a
lookup but passing read_only as false by mistake. But maybe we just need
to be careful enough in code reviews; after all, this function won't be
called from that many places.
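Just to illustrate what I mean by distinct return values (the
GUEST_TABLE_CREATED name below is invented for the example, it is not in the
patch):

    #define GUEST_TABLE_MAP_FAILED  0
    #define GUEST_TABLE_SUPER_PAGE  1
    #define GUEST_TABLE_NORMAL_PAGE 2
    /* Hypothetical: the table was missing and has just been allocated. */
    #define GUEST_TABLE_CREATED     3

    rc = p2m_next_level(p2m, false, &table, offsets[level]);
    /*
     * A path meant to be a pure lookup could then catch a mistakenly
     * cleared read_only flag.
     */
    ASSERT(rc != GUEST_TABLE_CREATED);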
> > > +}
> > > +
> > > +/*
> > > + * Get the details of a given gfn.
> > > + *
> > > + * If the entry is present, the associated MFN will be returned and the
> > > + * access and type filled up. The page_order will correspond to the
> > > + * order of the mapping in the page table (i.e it could be a superpage).
> > > + *
> > > + * If the entry is not present, INVALID_MFN will be returned and the
> > > + * page_order will be set according to the order of the invalid range.
> > > + */
> > > +static mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
> > > + p2m_type_t *t, p2m_access_t *a,
> > > + unsigned int *page_order)
> > > +{
> > > + paddr_t addr = pfn_to_paddr(gfn_x(gfn));
> > > + unsigned int level = 0;
> > > + lpae_t entry, *table;
> > > + int rc;
> > > mfn_t mfn = INVALID_MFN;
> > > - paddr_t mask = 0;
> > > p2m_type_t _t;
> > > - unsigned int level;
> > > +
> > > + /* Convenience aliases */
> > > + const unsigned int offsets[4] = {
> > > + zeroeth_table_offset(addr),
> > > + first_table_offset(addr),
> > > + second_table_offset(addr),
> > > + third_table_offset(addr)
> > > + };
> > >
> > > ASSERT(p2m_is_locked(p2m));
> > > BUILD_BUG_ON(THIRD_MASK != PAGE_MASK);
> > > @@ -267,46 +340,75 @@ static mfn_t __p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t)
> > >
> > > *t = p2m_invalid;
> > >
> > > - map = p2m_get_root_pointer(p2m, gfn);
> > > - if ( !map )
> > > - return INVALID_MFN;
> > > + /* XXX: Check if the mapping is lower than the mapped gfn */
> > >
> > > - ASSERT(P2M_ROOT_LEVEL < 4);
> > > -
> > > - for ( level = P2M_ROOT_LEVEL ; level < 4 ; level++ )
> > > + /* This gfn is higher than the highest the p2m map currently holds */
> > > + if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
> > > {
> > > - mask = masks[level];
> > > -
> > > - pte = map[offsets[level]];
> > > + for ( level = P2M_ROOT_LEVEL; level < 3; level++ )
> > > + {
> > > + if ( (gfn_x(gfn) & (level_masks[level] >> PAGE_SHIFT)) >
> > > + gfn_x(p2m->max_mapped_gfn) )
> > > + break;
> > > + goto out;
> >
> > I am not sure what this loop is for, but it looks wrong.
>
> As mentioned in the description of the function, if the entry is not present,
> the function will return the order of the invalid range.
>
> The loop will find the highest possible order by checking whether the base of
> the block mapping is greater than the max mapped gfn.
All right, but shouldn't the `goto out` be right after the loop?
Otherwise it is not really a loop :-)
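Something along these lines, just to sketch the control flow I mean:

    if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
    {
        /*
         * Stop at the first (i.e. highest) level whose block base is
         * still above the highest mapped gfn: its order is a valid
         * order for the invalid range.
         */
        for ( level = P2M_ROOT_LEVEL; level < 3; level++ )
            if ( (gfn_x(gfn) & (level_masks[level] >> PAGE_SHIFT)) >
                 gfn_x(p2m->max_mapped_gfn) )
                break;

        goto out;
    }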
> > > + }
> > > + }
> > >
> > > - if ( level == 3 && !p2m_table(pte) )
> > > - /* Invalid, clobber the pte */
> > > - pte.bits = 0;
> > > - if ( level == 3 || !p2m_table(pte) )
> > > - /* Done */
> > > - break;
> > > + table = p2m_get_root_pointer(p2m, gfn);
> > >
> > > - ASSERT(level < 3);
> > > + /*
> > > + * the table should always be non-NULL because the gfn is below
> > > + * p2m->max_mapped_gfn and the root table pages are always present.
> > > + */
> > > + BUG_ON(table == NULL);
> > >
> > > - /* Map for next level */
> > > - unmap_domain_page(map);
> > > - map = map_domain_page(_mfn(pte.p2m.base));
> > > + for ( level = P2M_ROOT_LEVEL; level < 3; level++ )
> > > + {
> > > + rc = p2m_next_level(p2m, true, &table, offsets[level]);
> > > + if ( rc == GUEST_TABLE_MAP_FAILED )
> > > + goto out_unmap;
> > > + else if ( rc != GUEST_TABLE_NORMAL_PAGE )
> > > + break;
> > > }
> > >
> > > - unmap_domain_page(map);
> > > + entry = table[offsets[level]];
> > >
> > > - if ( p2m_valid(pte) )
> > > + if ( p2m_valid(entry) )
> > > {
> > > - ASSERT(mask);
> > > - ASSERT(pte.p2m.type != p2m_invalid);
> > > - mfn = _mfn(paddr_to_pfn((pte.bits & PADDR_MASK & mask) |
> > > - (paddr & ~mask)));
> > > - *t = pte.p2m.type;
> > > + *t = entry.p2m.type;
> >
> > Why don't you have a check for ( t ) like you have done in the case of
> > ( a )? It would be more consistent.
>
> This is done at the beginning of the function:
>
> /* Allow t to be NULL */
> t = t ?: &_t;
>
> *t = p2m_invalid;
>
> This was kept from the implementation of __p2m_lookup because some callers
> check the type before checking if the MFN is invalid (e.g. get_page_from_gfn).
>
> I didn't follow the same principle for the access because going through the
> radix tree is expensive and most of the hot path does not care about the
> access.
Makes sense.
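So a hot path caller that only cares about the type would simply be (sketch):

    p2m_type_t t;
    mfn_t mfn;

    /*
     * The access is not needed here: pass NULL for it and skip the
     * radix tree lookup entirely. The type is always filled in.
     */
    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL);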
> > > +
> > > + if ( a )
> > > + *a = p2m_mem_access_radix_get(p2m, gfn);
> > > +
> > > + mfn = _mfn(entry.p2m.base);
> > > + /*
> > > + * The entry may point to a superpage. Find the MFN associated
> > > + * to the GFN.
> > > + */
> > > + mfn = mfn_add(mfn, gfn_x(gfn) & ((1UL << level_orders[level]) - 1));
> > > }
> > >
> > > +out_unmap:
> > > + unmap_domain_page(table);
> > > +
> > > +out:
> > > + if ( page_order )
> > > + *page_order = level_shifts[level] - PAGE_SHIFT;
> > > +
> > > return mfn;
> > > }
> > >
> > > +/*
> > > + * Lookup the MFN corresponding to a domain's GFN.
> > > + *
> > > + * There are no processor functions to do a stage 2 only lookup therefore we
> > > + * do a software walk.
> > > + */
> > > +static mfn_t __p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t)
> > > +{
> > > + return p2m_get_entry(&d->arch.p2m, gfn, t, NULL, NULL);
> > > +}
> > > +
> > > mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t)
> > > {
> > > mfn_t ret;
> > > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> > > index 05d9f82..1c5bd8b 100644
> > > --- a/xen/include/asm-arm/page.h
> > > +++ b/xen/include/asm-arm/page.h
> > > @@ -457,15 +457,19 @@ static inline int gva_to_ipa(vaddr_t va, paddr_t *paddr, unsigned int flags)
> > > #define LPAE_ENTRY_MASK (LPAE_ENTRIES - 1)
> > >
> > > #define THIRD_SHIFT (PAGE_SHIFT)
> > > +#define THIRD_ORDER 0
> > > #define THIRD_SIZE ((paddr_t)1 << THIRD_SHIFT)
> > > #define THIRD_MASK (~(THIRD_SIZE - 1))
> > > #define SECOND_SHIFT (THIRD_SHIFT + LPAE_SHIFT)
> > > +#define SECOND_ORDER (THIRD_ORDER + LPAE_SHIFT)
> > > #define SECOND_SIZE ((paddr_t)1 << SECOND_SHIFT)
> > > #define SECOND_MASK (~(SECOND_SIZE - 1))
> > > #define FIRST_SHIFT (SECOND_SHIFT + LPAE_SHIFT)
> > > +#define FIRST_ORDER (SECOND_ORDER + LPAE_SHIFT)
> > > #define FIRST_SIZE ((paddr_t)1 << FIRST_SHIFT)
> > > #define FIRST_MASK (~(FIRST_SIZE - 1))
> > > #define ZEROETH_SHIFT (FIRST_SHIFT + LPAE_SHIFT)
> > > +#define ZEROETH_ORDER (FIRST_ORDER + LPAE_SHIFT)
> > > #define ZEROETH_SIZE ((paddr_t)1 << ZEROETH_SHIFT)
> > > #define ZEROETH_MASK (~(ZEROETH_SIZE - 1))
> >
> > It might be clearer to define them by SHIFT:
> >
> > #define THIRD_ORDER (THIRD_SHIFT - PAGE_SHIFT)
> > #define SECOND_ORDER (SECOND_SHIFT - PAGE_SHIFT)
> > #define FIRST_ORDER (FIRST_SHIFT - PAGE_SHIFT)
> > #define ZEROETH_ORDER (ZEROETH_SHIFT - PAGE_SHIFT)
>
> I was following the way the other constants have been defined. The order of
> the 2nd level can be expressed in terms of the order of the 3rd level...
>
> I don't think this is clearer, but I don't mind using them.
>
> > or avoid them and just use level_shifts which is already defined. I
> > don't think they add much value.
>
> Really? It avoids spreading - {PAGE,LPAE}_SHIFT everywhere in the code, so
> the code will be cleaner.
>
> Note that I have noticed there are still a few places where (level_shifts -
> PAGE_SHIFT) is in use. However, we can replace them with level_orders.
The reason why I suggested the alternative implementation is that
"order" is not commonly used when dealing with pagetable levels (while
"mask" and "shift" are). I had a guess about what it meant, but wasn't
sure. To be sure I had to read the implementation. I think this version
is more obvious about what it does. That said, this is just a
suggestion, not a requirement.
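For what it's worth, with 4K pages (LPAE_SHIFT = 9) the four constants work
out to 0, 9, 18 and 27, e.g. a second level block maps 2^9 = 512 pages (2MB).
Either way, the per-level table could then be defined from them (assuming
level_orders is the array the patch uses elsewhere):

    /* Replaces open-coded (level_shifts[x] - PAGE_SHIFT). */
    static const unsigned int level_orders[4] =
        { ZEROETH_ORDER, FIRST_ORDER, SECOND_ORDER, THIRD_ORDER };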
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel