Re: [PATCH] x86/pv: Flush TLB in response to paging structure changes



On 21.10.2020 15:07, Andrew Cooper wrote:
> @@ -4037,6 +4037,9 @@ long do_mmu_update(
>                          break;
>                      rc = mod_l2_entry(va, l2e_from_intpte(req.val), mfn,
>                                        cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
> +                    /* Paging structure maybe changed.  Flush linear range. */
> +                    if ( !rc )
> +                        flush_flags_all |= FLUSH_TLB;
>                      break;
>  
>                  case PGT_l3_page_table:
> @@ -4044,6 +4047,9 @@ long do_mmu_update(
>                          break;
>                      rc = mod_l3_entry(va, l3e_from_intpte(req.val), mfn,
>                                        cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
> +                    /* Paging structure maybe changed.  Flush linear range. */
> +                    if ( !rc )
> +                        flush_flags_all |= FLUSH_TLB;
>                      break;
>  
>                  case PGT_l4_page_table:
> @@ -4051,27 +4057,28 @@ long do_mmu_update(
>                          break;
>                      rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
>                                        cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
> -                    if ( !rc && pt_owner->arch.pv.xpti )
> +                    /* Paging structure maybe changed.  Flush linear range. */
> +                    if ( !rc )
>                      {
> -                        bool local_in_use = false;
> +                        bool local_in_use = mfn_eq(
> +                            pagetable_get_mfn(curr->arch.guest_table), mfn);
>  
> -                        if ( mfn_eq(pagetable_get_mfn(curr->arch.guest_table),
> -                                    mfn) )
> -                        {
> -                            local_in_use = true;
> -                            get_cpu_info()->root_pgt_changed = true;
> -                        }
> +                        flush_flags_all |= FLUSH_TLB;
> +
> +                        if ( local_in_use )
> +                            flush_flags_local |= FLUSH_TLB | FLUSH_ROOT_PGTBL;

Aiui here (and in the code consuming the flags) you rely on
flush_flags_local, when non-zero, always being a superset of
flush_flags_all. I think this is a trap for anyone later modifying
this code to fall into, and as per below I don't think it will hold
anymore anyway. Hence here I think you want to set FLUSH_TLB in
flush_flags_local unconditionally, and above for L3 and L2 you want
to set it in both variables. Or, if I'm wrong below, a comment
spelling out this invariant may help people avoid falling into the
trap.
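
For the L2/L3 cases that would be along the lines of this sketch
(L2 shown, L3 analogous; not even compile tested):

    rc = mod_l2_entry(va, l2e_from_intpte(req.val), mfn,
                      cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
    /* Paging structure may have changed.  Flush linear range. */
    if ( !rc )
    {
        flush_flags_all   |= FLUSH_TLB;
        flush_flags_local |= FLUSH_TLB;
    }
    break;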

An alternative would be to have

    flush_flags_local |= (flush_flags_all & FLUSH_TLB);

before doing the actual flush.
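
I.e. in the consuming code further down, roughly (sketch only):

    if ( flush_flags_local )
    {
        /* ... scratch_cpumask set-up as in the patch ... */

        /* The local flush must cover at least the TLB flushing
         * requested of the remote CPUs. */
        flush_flags_local |= (flush_flags_all & FLUSH_TLB);
        flush_local(flush_flags_local);
    }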

>                          /*
>                           * No need to sync if all uses of the page can be
>                           * accounted to the page lock we hold, its pinned
>                           * status, and uses on this (v)CPU.
>                           */
> -                        if ( (page->u.inuse.type_info & PGT_count_mask) >
> +                        if ( pt_owner->arch.pv.xpti &&

I assume you've moved this check here to avoid the further
non-trivial checks where possible, but you've not put it around the
setting of FLUSH_ROOT_PGTBL in flush_flags_local because setting
->root_pgt_changed is benign when !XPTI?
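
If it isn't benign, I'd have expected that setting to be guarded as
well, something like (sketch, untested):

    if ( local_in_use )
    {
        flush_flags_local |= FLUSH_TLB;
        /* Only with XPTI does the per-CPU root page table need
         * refreshing. */
        if ( pt_owner->arch.pv.xpti )
            flush_flags_local |= FLUSH_ROOT_PGTBL;
    }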

> +                             (page->u.inuse.type_info & PGT_count_mask) >
>                               (1 + !!(page->u.inuse.type_info & PGT_pinned) +
>                                
> mfn_eq(pagetable_get_mfn(curr->arch.guest_table_user),
>                                       mfn) + local_in_use) )
> -                            sync_guest = true;
> +                            flush_flags_all |= FLUSH_ROOT_PGTBL;

This also needs to specify FLUSH_TLB_GLOBAL, or else the original
XPTI behavior gets broken. Since the local CPU doesn't need the
global flush, the variable may then better be named
flush_flags_remote?

Or, if I'm wrong here, the reasoning behind dropping the global
flush in this case needs putting in the description, not least
because it would mean the change introducing it went too far.
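
To preserve the prior behavior I'd expect something along these
lines (sketch, untested):

    if ( pt_owner->arch.pv.xpti &&
         (page->u.inuse.type_info & PGT_count_mask) >
         (1 + !!(page->u.inuse.type_info & PGT_pinned) +
          mfn_eq(pagetable_get_mfn(curr->arch.guest_table_user),
                 mfn) + local_in_use) )
        /* Remote CPUs need their root page tables refreshed and
         * global mappings flushed, as before. */
        flush_flags_all |= FLUSH_ROOT_PGTBL | FLUSH_TLB_GLOBAL;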

> @@ -4173,18 +4180,36 @@ long do_mmu_update(
>      if ( va )
>          unmap_domain_page(va);
>  
> -    if ( sync_guest )
> +    /*
> +     * Flushing needs to occur for one of several reasons.
> +     *
> +     * 1) An update to an L2 or higher occurred.  This potentially changes the
> +     *    pagetable structure, requiring a flush of the linear range.
> +     * 2) An update to an L4 occurred, and XPTI is enabled.  All CPUs running
> +     *    on a copy of this L4 need refreshing.
> +     */
> +    if ( flush_flags_all || flush_flags_local )

Minor remark: At least in x86 code it is more efficient to use
| instead of || in such cases, to avoid relying on the compiler
to carry out this small optimization. It may well be that all
compilers we care about do so, but it's certainly not the case
for all the compilers I've ever worked with.
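
I.e. simply:

    if ( flush_flags_all | flush_flags_local )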

>      {
> +        cpumask_t *mask = pt_owner->dirty_cpumask;
> +
>          /*
> -         * Force other vCPU-s of the affected guest to pick up L4 entry
> -         * changes (if any).
> +         * Local flushing may be asymmetric with remote.  If there is local
> +         * flushing to do, perform it separately and omit the current CPU from
> +         * pt_owner->dirty_cpumask.
>           */
> -        unsigned int cpu = smp_processor_id();
> -        cpumask_t *mask = per_cpu(scratch_cpumask, cpu);
> +        if ( flush_flags_local )
> +        {
> +            unsigned int cpu = smp_processor_id();
> +
> +            mask = per_cpu(scratch_cpumask, cpu);
> +            cpumask_copy(mask, pt_owner->dirty_cpumask);
> +            __cpumask_clear_cpu(cpu, mask);
> +
> +            flush_local(flush_flags_local);
> +        }
>  
> -        cpumask_andnot(mask, pt_owner->dirty_cpumask, cpumask_of(cpu));

I understand you're of the opinion that cpumask_copy() +
__cpumask_clear_cpu() is more efficient than cpumask_andnot()?
(I guess I agree for high enough CPU counts.)
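
For reference, the two variants being compared:

    /* Patch: copy the full mask, then clear the local CPU's bit. */
    cpumask_copy(mask, pt_owner->dirty_cpumask);
    __cpumask_clear_cpu(cpu, mask);

    /* Previously: a single pass, but reading two source masks. */
    cpumask_andnot(mask, pt_owner->dirty_cpumask, cpumask_of(cpu));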

Jan