
Re: [PATCH v2 5/8] xen/riscv: introduce asm/pmap.h header





On 23/07/2024 18:28, oleksii.kurochko@xxxxxxxxx wrote:
On Tue, 2024-07-23 at 19:25 +0200, oleksii.kurochko@xxxxxxxxx wrote:
On Tue, 2024-07-23 at 16:49 +0100, Julien Grall wrote:
Hi Oleksii,

On 23/07/2024 16:36, oleksii.kurochko@xxxxxxxxx wrote:
On Tue, 2024-07-23 at 12:02 +0200, Jan Beulich wrote:
On 23.07.2024 10:55, oleksii.kurochko@xxxxxxxxx wrote:
On Tue, 2024-07-23 at 10:36 +0200, Jan Beulich wrote:
On 23.07.2024 10:02, Oleksii Kurochko wrote:
On Mon, Jul 22, 2024 at 7:27 PM Julien Grall <julien@xxxxxxx> wrote:
On 22/07/2024 15:44, Oleksii Kurochko wrote:
      /* Map a 4k page in a fixmap entry */
      void set_fixmap(unsigned map, mfn_t mfn, unsigned int flags)
      {
          pte_t pte;

          pte = mfn_to_xen_entry(mfn, flags);
          pte.pte |= PTE_LEAF_DEFAULT;
          write_pte(&xen_fixmap[pt_index(0, FIXMAP_ADDR(map))], pte);

It would be saner to check that you are not overwriting any existing mapping, as otherwise you will probably need a TLB flush.

      }
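One way to implement the suggested check, mirroring the ASSERT-based policy Arm uses for its fixmap (a sketch only; pte_is_valid() is an assumed helper name, not taken from the series):

    void set_fixmap(unsigned map, mfn_t mfn, unsigned int flags)
    {
        pte_t *entry = &xen_fixmap[pt_index(0, FIXMAP_ADDR(map))];
        pte_t pte;

        /*
         * Catch silent overwrites: replacing a live entry would
         * additionally require a TLB flush.
         */
        ASSERT(!pte_is_valid(*entry));

        pte = mfn_to_xen_entry(mfn, flags);
        pte.pte |= PTE_LEAF_DEFAULT;
        write_pte(entry, pte);
    }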

      /* Remove a mapping from a fixmap entry */
      void clear_fixmap(unsigned map)
      {
          pte_t pte = {0};
          write_pte(&xen_fixmap[pt_index(0, FIXMAP_ADDR(map))], pte);

Don't you need a TLB flush?

Inside write_pte() there is "sfence.vma".

That's just a fence though, not a TLB flush.
  From the privileged doc:
     ```
     SFENCE.VMA is also used to invalidate entries in the
     address-translation cache associated with a hart (see Section 4.3.2).
     ...
     The SFENCE.VMA is used to flush any local hardware caches related to
     address translation. It is specified as a fence rather than a TLB flush
     to provide cleaner semantics with respect to which instructions are
     affected by the flush operation and to support a wider variety of
     dynamic caching structures and memory-management schemes. SFENCE.VMA is
     also used by higher privilege levels to synchronize page table writes
     and the address translation hardware.
     ...
     ```
I read this as: SFENCE.VMA is used not only for ordering of loads/stores, but also to flush the TLB (address-translation cache being the more general term, IIUIC).
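For reference, the local-only behaviour the spec describes maps naturally onto helpers like these (a sketch; the names are assumed, not taken from the series):

    /* Flush the whole local address-translation cache. */
    static inline void flush_tlb_local(void)
    {
        asm volatile ( "sfence.vma" ::: "memory" );
    }

    /* Flush local entries for a single virtual address (rs1 != x0). */
    static inline void flush_tlb_one_local(vaddr_t va)
    {
        asm volatile ( "sfence.vma %0" :: "r" (va) : "memory" );
    }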
I have to admit, I am a little bit concerned with calling sfence.vma in write_pte() (this may only be because I am not very familiar with RISC-V).

We have cases where multiple entries will be written in a single map_pages_to_xen() call. So wouldn't this mean that the local TLBs would be nuked for every write rather than once?
Yes, it will be nuked. That is bad from a performance point of view. I just wanted to be sure that I wouldn't forget to put sfence.vma where it is necessary, and then rework that a little bit afterwards. But it seems it would be better not to call sfence.vma in write_pte() from the start.
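That direction would look roughly like this (a sketch, assuming write_pte() becomes a plain store and the caller batches the flush):

    static inline void write_pte(pte_t *p, pte_t pte)
    {
        *p = pte;    /* plain store, no sfence.vma per write */
    }

    /* In the caller, after writing a whole batch of entries: */
    asm volatile ( "sfence.vma" ::: "memory" );    /* one local flush */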





Oh, I see. Kind of unexpected for an instruction of that name. Yet note how they talk about the local hart only. You need a wider scope TLB flush here.
Could you please clarify why a wider scope is needed?

Arm Xen flushes only the local TLB.

Which code are you looking at? set_fixmap() will propagate the TLB flush to all innershareable CPUs.
Yes, here I agree that set_fixmap() uses map_pages_to_xen(), which somewhere inside uses flush_xen_tlb_range_va() (not flush_xen_tlb_range_va_local()), so the TLB flush will happen for all innershareable CPUs.

The PMAP interface will do a local TLB flush because the interface can only be used during early boot, where there is a single CPU running.

Yes, I am looking at PMAP:

    static inline void arch_pmap_unmap(unsigned int slot)
    {
        lpae_t pte = {};

        write_pte(&xen_fixmap[slot], pte);
        flush_xen_tlb_range_va(FIXMAP_ADDR(slot), PAGE_SIZE);
    }
IIUC, originally Jan was talking about the arch_pmap_unmap() case, so that is why I decided to clarify additionally.
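A RISC-V counterpart following the same pattern might look like this (a sketch; flush_tlb_one_local() is an assumed helper, and a local-only flush suffices because PMAP runs on a single CPU during early boot):

    static inline void arch_pmap_unmap(unsigned int slot)
    {
        pte_t pte = {0};

        write_pte(&xen_fixmap[slot], pte);
        flush_tlb_one_local(FIXMAP_ADDR(slot));
    }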

Julien,
I have a question related to Arm's version of arch_pmap_map():

    static inline void arch_pmap_map(unsigned int slot, mfn_t mfn)
    {
        lpae_t *entry = &xen_fixmap[slot];
        lpae_t pte;

        ASSERT(!lpae_is_valid(*entry));

        pte = mfn_to_xen_entry(mfn, PAGE_HYPERVISOR_RW);
        pte.pt.table = 1;
        write_pte(entry, pte);
        /*
         * The new entry will be used very soon after arch_pmap_map() returns.
         * So ensure the DSB in write_pte() has completed before continuing.
         */
        isb();
    }
Is the comment above isb() correct? It says it ensures the DSB, not the ISB?

I guess you mean the comment. If so, yes it is correct. write_pte() has a dsb() just after writing the PTE. This is to guarantee that implicit memory accesses (instruction fetches or hardware translation table walks) will only happen *after* the dsb() has completed.
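For context, the shape of write_pte() being described is roughly the following (a simplified sketch, not the exact Xen implementation):

    static inline void write_pte(lpae_t *p, lpae_t pte)
    {
        asm volatile ( "dsb sy" ::: "memory" );  /* order prior page-table writes */
        *p = pte;
        asm volatile ( "dsb sy" ::: "memory" );  /* PTE visible before implicit accesses */
    }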

I am not 100% sure whether the isb() is required for arch_pmap_map(). But I took the safest approach when implementing it.

And isn't need to do TLB flush here?

It is not. On Arm, we have a strict policy that every unmap will be followed by a TLB flush, and you can't modify an existing entry (see the ASSERT a few lines above). So the TLBs will not contain an entry for the mapping.

This policy was mainly dictated by the Arm Arm, because when modifying the output address you need to follow the break-before-make sequence, which means you have to transition to an invalid mapping and flush the TLBs before the new entry is added. For simplicity, we decided not to bother with trying to implement break-before-make for the hypervisor page-tables. But we do for the P2M.
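As pseudo-steps, the break-before-make sequence described above is (a sketch, using flush_xen_tlb_range_va() as mentioned earlier in the thread):

    /* 1. Break: replace the live entry with an invalid one. */
    write_pte(entry, (lpae_t){});
    /* 2. Flush: make sure no stale translation survives in the TLBs. */
    flush_xen_tlb_range_va(va, PAGE_SIZE);
    /* 3. Make: install the new entry with the changed output address. */
    write_pte(entry, new_pte);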

Note that newer revisions of the Armv8 spec relaxed this requirement (see FEAT_BBM).

Cheers,

--
Julien Grall



 

