
Re: [Xen-devel] [PATCH] x86/xen: Clear user %gs before updating segment descriptors



On Fri, Dec 7, 2018 at 3:15 PM David Woodhouse <dwmw@xxxxxxxxxxxx> wrote:
>
> During a context switch, when clearing a descriptor which is currently
> referenced by the old process's user %gs, a fault may occur if Xen
> preempts the vCPU before %gs is set for the new process.
>
> This fault actually seems to be fairly harmless; xen_failsafe_callback
> will just return to the "faulting" instruction (the random one after the
> vCPU was preempted) with the offending segment register zeroed, and then
> it'll get set again later during the context switch. But it's cleaner
> to avoid it.
>
> If the descriptor referenced by the %gs selector is being touched,
> then include a request to zero the user %gs in the multicall too.

Fine with me.
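
For the archives, the window this closes, as I read the commit message
(an annotated sketch of the sequence, not literal kernel code; "i" is
the TLS slot the old task's user %gs points at):

        savesegment(gs, flush_gs);      /* old task's user %gs selector */

        /*
         * GDT slot GDT_ENTRY_TLS_MIN + i is about to be rewritten for
         * the new task.  If Xen preempts the vCPU in this window, the
         * saved %gs can no longer be reloaded from the changed
         * descriptor; xen_failsafe_callback() zeroes %gs and resumes,
         * which is harmless but worth avoiding.  Queuing the %gs
         * zeroing ahead of the descriptor update removes the window:
         */
        MULTI_set_segment_base(mc.mc, SEGBASE_GS_USER_SEL, 0);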

>
> Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> ---
> v2: Don't accidentally remove the call to xen_mc_batch().
>
>  arch/x86/include/asm/xen/hypercall.h | 11 ++++++++
>  arch/x86/xen/enlighten_pv.c          | 40 ++++++++++++++++++++++------
>  2 files changed, 43 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
> index ef05bea7010d..e8b383b24246 100644
> --- a/arch/x86/include/asm/xen/hypercall.h
> +++ b/arch/x86/include/asm/xen/hypercall.h
> @@ -520,4 +520,15 @@ MULTI_stack_switch(struct multicall_entry *mcl,
>         trace_xen_mc_entry(mcl, 2);
>  }
>
> +static inline void
> +MULTI_set_segment_base(struct multicall_entry *mcl,
> +                      int reg, unsigned long value)
> +{
> +       mcl->op = __HYPERVISOR_set_segment_base;
> +       mcl->args[0] = reg;
> +       mcl->args[1] = value;
> +
> +       trace_xen_mc_entry(mcl, 2);
> +}
> +
>  #endif /* _ASM_X86_XEN_HYPERCALL_H */
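
The new wrapper follows the MULTI_stack_switch() pattern just above it.
For comparison, outside of a multicall batch the same request would be
the direct hypercall (a sketch for illustration, x86_64 only):

        /* Unbatched form of the same operation: */
        HYPERVISOR_set_segment_base(SEGBASE_GS_USER_SEL, 0);
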
> diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
> index 2f6787fc7106..2eb9827dab4b 100644
> --- a/arch/x86/xen/enlighten_pv.c
> +++ b/arch/x86/xen/enlighten_pv.c
> @@ -506,7 +506,7 @@ static inline bool desc_equal(const struct desc_struct *d1,
>  }
>
>  static void load_TLS_descriptor(struct thread_struct *t,
> -                               unsigned int cpu, unsigned int i)
> +                               unsigned int cpu, unsigned int i, int flush_gs)
>  {
>         struct desc_struct *shadow = &per_cpu(shadow_tls_desc, cpu).desc[i];
>         struct desc_struct *gdt;
> @@ -516,6 +516,17 @@ static void load_TLS_descriptor(struct thread_struct *t,
>         if (desc_equal(shadow, &t->tls_array[i]))
>                 return;
>
> +       /*
> +        * If the current user %gs points to a descriptor we're changing,
> +        * zero it first to avoid taking a fault if Xen preempts this
> +        * vCPU between now and the time that %gs is later loaded with
> +        * the new value.
> +        */
> +       if ((flush_gs >> 3) == GDT_ENTRY_TLS_MIN + i) {
> +               mc = __xen_mc_entry(0);
> +               MULTI_set_segment_base(mc.mc, SEGBASE_GS_USER_SEL, 0);
> +       }
> +
>         *shadow = t->tls_array[i];
>
>         gdt = get_cpu_gdt_rw(cpu);
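
The (flush_gs >> 3) test leans on the x86 selector layout; spelled out
as a stand-alone helper (purely illustrative, not something the patch
adds):

        /*
         * A selector is | index:13 | TI:1 | RPL:2 |, so sel >> 3 is the
         * GDT index, and TLS descriptor i sits at GDT_ENTRY_TLS_MIN + i.
         */
        static inline bool sel_references_tls_slot(u16 sel, unsigned int i)
        {
                return (sel >> 3) == GDT_ENTRY_TLS_MIN + i;
        }
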
> @@ -527,6 +538,8 @@ static void load_TLS_descriptor(struct thread_struct *t,
>
>  static void xen_load_tls(struct thread_struct *t, unsigned int cpu)
>  {
> +       u16 flush_gs = 0;
> +
>         /*
>          * XXX sleazy hack: If we're being called in a lazy-cpu zone
>          * and lazy gs handling is enabled, it means we're in a
> @@ -537,27 +550,38 @@ static void xen_load_tls(struct thread_struct *t, unsigned int cpu)
>          * This will go away as soon as Xen has been modified to not
>          * save/restore %gs for normal hypercalls.
>          *
> -        * On x86_64, this hack is not used for %gs, because gs points
> -        * to KERNEL_GS_BASE (and uses it for PDA references), so we
> -        * must not zero %gs on x86_64
> -        *
>          * For x86_64, we need to zero %fs, otherwise we may get an
>          * exception between the new %fs descriptor being loaded and
>          * %fs being effectively cleared at __switch_to().
> +        *
> +        * We may also need to zero %gs, if it refers to a descriptor
> +        * which we are clearing. Otherwise a Xen vCPU context switch
> +        * before it gets reloaded could also cause a fault. Since
> +        * clearing the user %gs is another hypercall, do that only if
> +        * it's necessary.
> +        *
> +        * Note: These "faults" end up in xen_failsafe_callback(),
> +        * which just returns immediately to the "faulting" instruction
> +        * (i.e. the random one after Xen preempted this vCPU) with
> +        * the offending segment register zeroed. Which is actually
> +        * a perfectly safe thing to happen anyway, as it'll be loaded
> +        * again shortly. So maybe we needn't bother?
>          */
>         if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_CPU) {
>  #ifdef CONFIG_X86_32
>                 lazy_load_gs(0);
>  #else
> +               savesegment(gs, flush_gs);
> +
>                 loadsegment(fs, 0);
>  #endif
>         }
>
>         xen_mc_batch();
>
> -       load_TLS_descriptor(t, cpu, 0);
> -       load_TLS_descriptor(t, cpu, 1);
> -       load_TLS_descriptor(t, cpu, 2);
> +       load_TLS_descriptor(t, cpu, 0, flush_gs);
> +       load_TLS_descriptor(t, cpu, 1, flush_gs);
> +       load_TLS_descriptor(t, cpu, 2, flush_gs);
>
>         xen_mc_issue(PARAVIRT_LAZY_CPU);
>  }
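
For completeness, the batching this relies on (a sketch of the existing
multicalls.h helpers, not new code): everything reserved between
xen_mc_batch() and xen_mc_issue() is sent to Xen as a single multicall,
so the optional %gs zeroing and the descriptor updates are flushed
together.

        struct multicall_space mc;

        xen_mc_batch();                 /* start (or join) a batch      */

        mc = __xen_mc_entry(0);         /* reserve one multicall slot   */
        MULTI_set_segment_base(mc.mc, SEGBASE_GS_USER_SEL, 0);

        /* ...MULTI_update_descriptor() entries for changed TLS slots... */

        xen_mc_issue(PARAVIRT_LAZY_CPU);        /* flush now, unless in */
                                                /* lazy CPU mode        */
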
> --
> 2.17.1
