|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6 8/9] xen/riscv: page table handling
> > +/*
> > + * The PTE format does not contain the following bits within
> > itself;
> > + * they are created artificially to inform the Xen page table
> > + * handling algorithm. These bits should not be explicitly written
> > + * to the PTE entry.
> > + */
> > +#define PTE_SMALL BIT(10, UL)
> > +#define PTE_POPULATE BIT(11, UL)
> > +
> > +#define PTE_XWV_BITS (PTE_WRITABLE | PTE_EXECUTABLE |
> > PTE_VALID)
> > +#define PTE_XWV_MASK(x) ((x) & PTE_XWV_BITS)
> > +#define PTE_RWX_MASK(x) ((x) & (PTE_READABLE | PTE_WRITABLE |
> > PTE_EXECUTABLE))
>
> I think I commented on *_MASK macros before: They are conventionally
> constants (see e.g. PAGETABLE_ORDER_MASK that you have further down),
> not operations on an input. It's not really clear to me what the
> "mask" in this name is meant to signify as to what the macros are
> doing. I seem to vaguely recall that you indicated you'd drop all
> such helpers, in favor of using respective constants directly at use
> sites.
Regarding *_MASK I wrote about PTE_{R,W,X}_MASK ( which were dropped
becuase they were used only once ) and by MASK here I mean that only
some bits are going to be taken. For example, PTE_XWV_MASK() means that
only eXecute, Write and Valid bits will be taken. Probably EXTR (
extract ) would be better to use instead of EXTR.
>
> As a less significant (because of being a matter of personal taste to
> a fair degree) aspect: XWV is a pretty random sequence of characters.
> I for one wouldn't remember what order they need to be used in, and
> hence would always need to look this up.
I used that letter as they are used by RISC-V spec.
>
> Taken together, what about
>
> #define PTE_LEAF_MASK (PTE_WRITABLE | PTE_EXECUTABLE | PTE_VALID)
> #define PTE_ACCESS_MASK (PTE_READABLE | PTE_WRITABLE |
> PTE_EXECUTABLE)
>
> ?
Looks good to me. I will use them. Thanks for the naming and
clarification.
>
> > @@ -68,6 +109,37 @@ static inline bool pte_is_valid(pte_t p)
> > return p.pte & PTE_VALID;
> > }
> >
> > +/*
> > + * From the RISC-V spec:
> > + * Table 4.5 summarizes the encoding of the permission bits.
> > + * X W R Meaning
> > + * 0 0 0 Pointer to next level of page table.
> > + * 0 0 1 Read-only page.
> > + * 0 1 0 Reserved for future use.
> > + * 0 1 1 Read-write page.
> > + * 1 0 0 Execute-only page.
> > + * 1 0 1 Read-execute page.
> > + * 1 1 0 Reserved for future use.
> > + * 1 1 1 Read-write-execute page.
> > + */
> > +inline bool pte_is_table(const pte_t p)
> > +{
> > + return ((p.pte & (PTE_VALID |
> > + PTE_READABLE |
> > + PTE_WRITABLE |
> > + PTE_EXECUTABLE)) == PTE_VALID);
> > +}
> > +
> > +static inline bool pte_is_mapping(const pte_t p)
> > +{
> > + /* W = 1 || (X=1 && W=1) -> Reserved for future use */
> > + ASSERT((PTE_RWX_MASK(p.pte) != PTE_WRITABLE) ||
> > + (PTE_RWX_MASK(p.pte) != (PTE_WRITABLE |
> > PTE_EXECUTABLE)));
>
> I'm afraid I'm pretty unhappy with comments not matching the
> commented
> code: The comment mentions only set bits, but not clear ones.
I assumed that it would be clear that other bits should be 0 taking
into account the table above but I will update the comment to be more
precise.
> Further
> you're missing a check of the V bit - with that clear, the other bits
> can be set whichever way. Taken together (and the spec also says it
> this way): If V=1 and W=1 then R also needs to be 1.
My intention was to check in the way how it is mentioned in the table
4.5. For example,
0 1 0 Reserved for future use.
So I wanted to check that X=R=0 and W=1, I just confused myself with
that ASSERT(p) checks inside !p. I will update ASSERT() properly.
>
> Also - isn't this check equally relevant in pte_is_table()?
Missed that, it should be in pte_is_table() too.
>
> > --- a/xen/arch/riscv/include/asm/riscv_encoding.h
> > +++ b/xen/arch/riscv/include/asm/riscv_encoding.h
> > @@ -164,6 +164,7 @@
> > #define SSTATUS_SD SSTATUS64_SD
> > #define SATP_MODE SATP64_MODE
> > #define SATP_MODE_SHIFT SATP64_MODE_SHIFT
> > +#define SATP_PPN_MASK _UL(0x00000FFFFFFFFFFF)
>
> Why not SATP64_PPN on the rhs? And why no similar #define in the
> #else
> block that follows, using SATP32_PPN?
>
> > --- /dev/null
> > +++ b/xen/arch/riscv/pt.c
> > @@ -0,0 +1,423 @@
> > +#include <xen/bug.h>
> > +#include <xen/domain_page.h>
> > +#include <xen/errno.h>
> > +#include <xen/lib.h>
> > +#include <xen/mm.h>
> > +#include <xen/mm-frame.h>
>
> I don#t think you need this when you already have xen/mm.h.
>
> > +#include <xen/pfn.h>
> > +#include <xen/pmap.h>
> > +#include <xen/spinlock.h>
> > +
> > +#include <asm/flushtlb.h>
> > +#include <asm/page.h>
> > +
> > +static inline const mfn_t get_root_page(void)
> > +{
> > + paddr_t root_maddr = (csr_read(CSR_SATP) & SATP_PPN_MASK) <<
> > PAGE_SHIFT;
>
> Won't this lose bits in RV32 mode? IOW wouldn't you better avoid
> open-
> coding pfn_to_paddr() here?
Considering that PPN for RV32 mode is 22 bits then it will lose bits.
Anyway I agree that it would be better to use pfn_to_paddr().
>
> > +static int pt_update_entry(mfn_t root, unsigned long virt,
> > + mfn_t mfn, unsigned int target,
> > + unsigned int flags)
> > +{
> > + int rc;
> > + unsigned int level = HYP_PT_ROOT_LEVEL;
> > + pte_t *table;
> > + /*
> > + * The intermediate page table shouldn't be allocated when MFN
> > isn't
> > + * valid and we are not populating page table.
> > + * This means we either modify permissions or remove an entry,
> > or
> > + * inserting brand new entry.
> > + *
> > + * See the comment above pt_update() for an additional
> > explanation about
> > + * combinations of (mfn, flags).
> > + */
> > + bool alloc_tbl = !mfn_eq(mfn, INVALID_MFN) || (flags &
> > PTE_POPULATE);
> > + pte_t pte, *entry;
> > +
> > + /* convenience aliases */
> > + DECLARE_OFFSETS(offsets, virt);
> > +
> > + table = map_table(root);
> > + for ( ; level > target; level-- )
> > + {
> > + rc = pt_next_level(alloc_tbl, &table, offsets[level]);
> > + if ( rc == XEN_TABLE_MAP_FAILED )
> > + {
> > + rc = 0;
> > +
> > + /*
> > + * We are here because pt_next_level has failed to map
> > + * the intermediate page table (e.g the table does not
> > exist
> > + * and the pt shouldn't be allocated). It is a valid
> > case when
> > + * removing a mapping as it may not exist in the page
> > table.
> > + * In this case, just ignore it.
> > + */
> > + if ( flags & PTE_VALID )
> > + {
> > + printk("%s: Unable to map level %u\n", __func__,
> > level);
> > + rc = -ENOENT;
> > + }
>
> Both comment and error code assume the !populate case. What about the
> case
> where the allocation failed? That's "couldn't be allocated" and would
> better
> return back -ENOMEM (as create_table() correctly returns in that
> case).
The condition should be updated here:
if ( flags & (PTE_VALID|PTE_POPULATE) )
{
...
>
> > +static int pt_mapping_level(unsigned long vfn, mfn_t mfn, unsigned
> > long nr,
> > + unsigned int flags)
> > +{
> > + unsigned int level = 0;
> > + unsigned long mask;
> > + unsigned int i;
> > +
> > + /* Use blocking mapping unless the caller requests 4K mapping
> > */
> > + if ( unlikely(flags & PTE_SMALL) )
> > + return level;
>
> Maybe "block mapping" in the comment? "Blocking" typically has quite
> different a meaning. I'm uncertain about that terminology anyway, as
> the spec doesn't use it.
You are right that the spec doesn't define how to call bigger then 4k
mapping so I just re-use terminology from Arm here. Probably it would
be better just to reword:
/* Use a larger mapping than 4K unless the caller specifically requests
a 4K mapping */
>
> > + * If `mfn` != INVALID_MFN and flags has PTE_VALID bit set then it
> > means that
> > + * inserting will be done.
> > + */
> > +static int pt_update(unsigned long virt,
> > + mfn_t mfn,
> > + unsigned long nr_mfns,
> > + unsigned int flags)
> > +{
> > + int rc = 0;
> > + unsigned long vfn = PFN_DOWN(virt);
> > + unsigned long left = nr_mfns;
> > +
> > + const mfn_t root = get_root_page();
> > +
> > + /*
> > + * It is bad idea to have mapping both writeable and
> > + * executable.
> > + * When modifying/creating mapping (i.e PTE_VALID is set),
> > + * prevent any update if this happen.
> > + */
> > + if ( PTE_XWV_MASK(flags) == PTE_XWV_BITS )
> > + {
> > + printk("Mappings should not be both Writeable and
> > Executable.\n");
>
> I'm pretty sure I asked before that you omit full stops from log
> messages.
Yes, you asked and I think that even in this place. Just overlooked
that it was a lot of comments to the previous patch version. Sorry for
that.
>
> > +int map_pages_to_xen(unsigned long virt,
> > + mfn_t mfn,
> > + unsigned long nr_mfns,
> > + unsigned int flags)
> > +{
> > + /*
> > + * Ensure that flags has PTE_VALID bit as map_pages_to_xen()
> > is supposed
> > + * to create a mapping.
> > + *
> > + * Ensure that we have a valid MFN before proceeding.
> > + *
> > + * If the MFN is invalid, pt_update() might misinterpret the
> > operation,
> > + * treating it as either a population, a mapping destruction,
> > + * or a mapping modification.
> > + */
> > + ASSERT(!mfn_eq(mfn, INVALID_MFN) || (flags & PTE_VALID));
>
> Judging from the comment, do you mean && instead of || ?
Yes, it should be &&.
Thanks.
~ Oleksii
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |