
Re: [Xen-devel] [PATCH v2 2/2] x86: allow Meltdown band-aid to be disabled



On 15/01/18 11:07, Jan Beulich wrote:
> First of all we don't need it on AMD systems. Additionally allow its use
> to be controlled by a command line option. For best backportability, this
> intentionally doesn't use alternative instruction patching to achieve
> the intended effect - while we likely want it, this will be a later
> follow-up.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> v2: New.
>
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1849,6 +1849,15 @@ In the case that x2apic is in use, this
>  clustered mode.  The default, given no hint from the **FADT**, is cluster
>  mode.
>  
> +### xpti
> +> `= <boolean>`
> +
> +> Default: `false` on AMD hardware
> +> Default: `true` everywhere else

This is fine for now, but any further isolation is going to have to be
unconditional, or possibly a compile-time choice; maintaining that going
forwards is going to be truly horrible.

> +
> +Override default selection of whether to isolate 64-bit PV guest page
> +tables.

This wants a

** WARNING: Not yet a complete isolation implementation, but better than
nothing. **

or similar, just to avoid giving people the impression that it is
complete.  Perhaps also a similar warning in the patch 1 commit message?

> +
>  ### xsave
>  > `= <boolean>`
>  
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1511,8 +1511,10 @@ void paravirt_ctxt_switch_to(struct vcpu
>  {
>      unsigned long cr4;
>  
> -    this_cpu(root_pgt)[root_table_offset(PERDOMAIN_VIRT_START)] =
> -        l4e_from_page(v->domain->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW);
> +    if ( this_cpu(root_pgt) )
> +        this_cpu(root_pgt)[root_table_offset(PERDOMAIN_VIRT_START)] =
> +            l4e_from_page(v->domain->arch.perdomain_l3_pg,
> +                          __PAGE_HYPERVISOR_RW);

This would be cleaner, and would generate better code, if
this_cpu(root_pgt) were pulled into a local variable.
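
E.g. something along these lines (untested sketch, reading the per-cpu
variable only once):

    root_pgentry_t *rpt = this_cpu(root_pgt);

    if ( rpt )
        rpt[root_table_offset(PERDOMAIN_VIRT_START)] =
            l4e_from_page(v->domain->arch.perdomain_l3_pg,
                          __PAGE_HYPERVISOR_RW);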

>  
>      cr4 = pv_guest_cr4_to_real_cr4(v);
>      if ( unlikely(cr4 != read_cr4()) )
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -3685,7 +3685,7 @@ long do_mmu_update(
>                      rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
>                                        cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
>                      if ( !rc )
> -                        sync_guest = true;
> +                        sync_guest = this_cpu(root_pgt);

This is quite deceptive, as what it actually means is

    sync_guest = this_cpu(root_pgt) != NULL;

It would probably be cleaner to export opt_xpti and use that, rather
than imply things based on root_pgt.  It would certainly be more
efficient code.
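
Roughly (sketch only - where exactly the declaration should live is left
aside, and the variable would need to lose its static):

    /* smpboot.c */
    int8_t __read_mostly opt_xpti = -1;
    boolean_param("xpti", opt_xpti);

    /* some suitable header */
    extern int8_t opt_xpti;

    /* do_mmu_update(), where opt_xpti has long been resolved to 0 or 1 */
    if ( !rc )
        sync_guest = opt_xpti;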

>                      break;
>  
>                  case PGT_writable_page:
> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -328,7 +328,7 @@ void start_secondary(void *unused)
>      spin_debug_disable();
>  
>      get_cpu_info()->xen_cr3 = 0;
> -    get_cpu_info()->pv_cr3 = __pa(this_cpu(root_pgt));
> +    get_cpu_info()->pv_cr3 = this_cpu(root_pgt) ? __pa(this_cpu(root_pgt)) : 0;
>  
>      load_system_tables();
>  
> @@ -734,14 +734,20 @@ static int clone_mapping(const void *ptr
>      return 0;
>  }
>  
> +static __read_mostly int8_t opt_xpti = -1;
> +boolean_param("xpti", opt_xpti);
>  DEFINE_PER_CPU(root_pgentry_t *, root_pgt);
>  
>  static int setup_cpu_root_pgt(unsigned int cpu)
>  {
> -    root_pgentry_t *rpt = alloc_xen_pagetable();
> +    root_pgentry_t *rpt;
>      unsigned int off;
>      int rc;
>  
> +    if ( !opt_xpti )
> +        return 0;
> +
> +    rpt = alloc_xen_pagetable();
>      if ( !rpt )
>          return -ENOMEM;
>  
> @@ -992,10 +998,13 @@ void __init smp_prepare_cpus(unsigned in
>  
>      stack_base[0] = stack_start;
>  
> +    if ( opt_xpti < 0 )
> +        opt_xpti = boot_cpu_data.x86_vendor != X86_VENDOR_AMD;

Newline?

>      rc = setup_cpu_root_pgt(0);
>      if ( rc )
>          panic("Error %d setting up PV root page table\n", rc);
> -    get_cpu_info()->pv_cr3 = __pa(per_cpu(root_pgt, 0));
> +    if ( per_cpu(root_pgt, 0) )
> +        get_cpu_info()->pv_cr3 = __pa(per_cpu(root_pgt, 0));
>  
>      set_nr_sockets();
>  
> @@ -1067,6 +1076,7 @@ void __init smp_prepare_boot_cpu(void)
>  #endif
>  
>      get_cpu_info()->xen_cr3 = 0;
> +    get_cpu_info()->pv_cr3 = 0;
>  }
>  
>  static void
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -46,6 +46,7 @@ restore_all_guest:
>          movabs $DIRECTMAP_VIRT_START, %rcx
>          mov   %rdi, %rax
>          and   %rsi, %rdi
> +        jz    .Lrag_keep_cr3

What is rag?  What about .L_skip_cr3_reload?
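
i.e. (illustrative only, renaming both the jz target and the label)

        jz    .L_skip_cr3_reload
        ...
.L_skip_cr3_reload: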

~Andrew

>          and   %r9, %rsi
>          add   %rcx, %rdi
>          add   %rcx, %rsi
> @@ -62,6 +63,7 @@ restore_all_guest:
>          rep movsq
>          mov   %r9, STACK_CPUINFO_FIELD(xen_cr3)(%rdx)
>          write_cr3 rax, rdi, rsi
> +.Lrag_keep_cr3:
>  
>          RESTORE_ALL
>          testw $TRAP_syscall,4(%rsp)
>
>
>

