[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 5/8] xen/riscv: introduce asm/pmap.h header



On Tue, 2024-07-23 at 19:25 +0200, oleksii.kurochko@xxxxxxxxx wrote:
> On Tue, 2024-07-23 at 16:49 +0100, Julien Grall wrote:
> > Hi Oleksii,
> > 
> > On 23/07/2024 16:36, oleksii.kurochko@xxxxxxxxx wrote:
> > > On Tue, 2024-07-23 at 12:02 +0200, Jan Beulich wrote:
> > > > On 23.07.2024 10:55, oleksii.kurochko@xxxxxxxxx wrote:
> > > > > On Tue, 2024-07-23 at 10:36 +0200, Jan Beulich wrote:
> > > > > > On 23.07.2024 10:02, Oleksii Kurochko wrote:
> > > > > > > On Mon, Jul 22, 2024 at 7:27 PM Julien Grall
> > > > > > > <julien@xxxxxxx>
> > > > > > > wrote:
> > > > > > > > > > On 22/07/2024 15:44, Oleksii Kurochko wrote:
> > > > > > > > >      /* Map a 4k page in a fixmap entry */
> > > > > > > > >      void set_fixmap(unsigned map, mfn_t mfn,
> > > > > > > > > unsigned
> > > > > > > > > int
> > > > > > > > > flags)
> > > > > > > > >      {
> > > > > > > > >          pte_t pte;
> > > > > > > > > 
> > > > > > > > >          pte = mfn_to_xen_entry(mfn, flags);
> > > > > > > > >          pte.pte |= PTE_LEAF_DEFAULT;
> > > > > > > > >          write_pte(&xen_fixmap[pt_index(0,
> > > > > > > > > FIXMAP_ADDR(map))],
> > > > > > > > > pte);
> > > > > > > > 
> > > > > > > > It would be saner to check if you are not overwriting
> > > > > > > > any
> > > > > > > > existing
> > > > > > > > mapping as otherwise you will probably need a TLB
> > > > > > > > flush.
> > > > > > > > 
> > > > > > > > >      }
> > > > > > > > > 
> > > > > > > > >      /* Remove a mapping from a fixmap entry */
> > > > > > > > >      void clear_fixmap(unsigned map)
> > > > > > > > >      {
> > > > > > > > >          pte_t pte = {0};
> > > > > > > > >          write_pte(&xen_fixmap[pt_index(0,
> > > > > > > > > FIXMAP_ADDR(map))],
> > > > > > > > > pte);
> > > > > > > > 
> > > > > > > > Don't you need a TLB flush?
> > > > > > > > 
> > > > > > > Inside write_pte() there is "sfence.vma".
> > > > > > 
> > > > > > That's just a fence though, not a TLB flush.
> > > > >  From the privileged doc:
> > > > >     ```
> > > > >     SFENCE.VMA is also used to invalidate entries in the
> > > > >     address-translation cache associated with a hart (see
> > > > > Section
> > > > > 4.3.2).
> > > > >     ...
> > > > >     The SFENCE.VMA is used to flush any local hardware caches
> > > > > related to
> > > > >     address translation.
> > > > >     It is specified as a fence rather than a TLB flush to
> > > > > provide
> > > > > cleaner
> > > > >     semantics with respect to
> > > > >     which instructions are affected by the flush operation
> > > > > and
> > > > > to
> > > > > support a
> > > > >     wider variety of dynamic
> > > > >     caching structures and memory-management schemes.
> > > > > SFENCE.VMA is
> > > > > also
> > > > >     used by higher
> > > > >     privilege levels to synchronize page table writes and the
> > > > > address
> > > > >     translation hardware.
> > > > >     ...
> > > > >     ```
> > > > > I read this as SFENCE.VMA is used not only for ordering of
> > > > > load/stores,
> > > > > but also to flush TLB ( which is a type of more general term
> > > > > as
> > > > > address-translation cache, IIUIC ).
> > I have to admit, I am a little because concerned with calling
> > sfence.vma 
> > in write_pte() (this may only be because I am not very familiar
> > with 
> > RISC-V).
> > 
> > We have cases where multiple entry will be written in a single 
> > map_pages_to_xen() call. So wouldn't this means that the local TLBs
> > would be nuked for every write rather than once?
> Yes, it will be nuked. It is bad from perfomance point of view.
> I just wanted to be sure that I won't miss to put sfence.vma when it
> is
> necessary and then reworked that a little bit after. But it seems it
> would be better not to call sfence.vma in write_pte() just from the
> start.
> 
> 
> > 
> > 
> > > > 
> > > > Oh, I see. Kind of unexpected for an instruction of that name.
> > > > Yet
> > > > note
> > > > how they talk about the local hart only. You need a wider scope
> > > > TLB
> > > > flush here.
> > > Could you please clarify why it is needed wider?
> > > 
> > > Arm Xen flushed only local TLB.
> > 
> > Which code are you looking at? set_fixmap() will propagate the TLB
> > flush 
> > to all innershareable CPUs.
> Yes, here I agree that set_fixmap() uses map_pages_to_xen which
> somewhere inside uses flush_xen_tlb_range_va() ( not
> flush_xen_tlb_range_va() ) so TLB flush will happen for all
> innershareable CPUs.
> > 
> > The PMAP interface will do a local TLB flush because the interface
> > can 
> > only be used during early boot where there is a single CPU running.
> 
> Yes, I am looking at PMAP:
>    static inline void arch_pmap_unmap(unsigned int slot)
>    {
>        lpae_t pte = {};
>    
>        write_pte(&xen_fixmap[slot], pte);
>    
>        flush_xen_tlb_range_va(FIXMAP_ADDR(slot), PAGE_SIZE);
>    }
> IIUC, originaly Jan told about arch_pmap_unmap() case so that is why
> I
> decided to clarify additionaly.

Julien,
I have a questation related to Arm's version of arch_pmap_map():

   static inline void arch_pmap_map(unsigned int slot, mfn_t mfn)
   {
       lpae_t *entry = &xen_fixmap[slot];
       lpae_t pte;
   
       ASSERT(!lpae_is_valid(*entry));
   
       pte = mfn_to_xen_entry(mfn, PAGE_HYPERVISOR_RW);
       pte.pt.table = 1;
       write_pte(entry, pte);
       /*
        * The new entry will be used very soon after arch_pmap_map()
   returns.
        * So ensure the DSB in write_pte() has completed before
   continuing.
        */
       isb();
   }
   
Is the code above isb() is correct? is it insure the DSB not ISB?
And isn't need to do TLB flush here?

~ Oleksii


> 
> > 
> > > RISC-V Linux kernel for fixmap also uses: local_flush_tlb_page().
> > 
> > I don't know how Linux is using set_fixmap(). But what matters is
> > how
> > Xen is using set_fixmap(). We have a couple of places in Xen where
> > the 
> > fixmap needs to be accessed by all the CPUs.
> > 
> > Given this is a common interface in Xen, I think it makes sense to 
> > follow the same approach to avoid any confusion.
> Sure. Thanks for claryfying. I will flush_xen_tlb_range_va() for
> set_fixmap().
> 
> 
> ~ Oleksii




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.