
Re: [Xen-devel] [PATCH v9 06/11] x86/entry: Organise the use of MSR_SPEC_CTRL at each entry/exit point



>>> On 18.01.18 at 16:46, <andrew.cooper3@xxxxxxxxxx> wrote:
> We need to be able to either set or clear IBRS in Xen context, as well as
> restore appropriate guest values in guest context.  See the documentation in
> asm-x86/spec_ctrl_asm.h for details.
> 
> Writes to %cr3 are slower when SPEC_CTRL.IBRS is set, so the
> SPEC_CTRL_{ENTRY/EXIT}* positioning is important, and optimised for the
> expected common case of Xen not using IBRS itself.

And this expectation is because on pre-Skylake we're fine using just
retpoline, and we expect post-Skylake hardware to no longer have the
issue? It would help to spell that reasoning out here.

As to the behavior of CR3 writes - is this written down anywhere in
Intel's and/or AMD's docs? Or else, how do you know this is uniformly
the case on all past, current, and future hardware?

> @@ -99,6 +106,10 @@ UNLIKELY_END(realmode)
>  .Lvmx_vmentry_fail:
>          sti
>          SAVE_ALL
> +
> +        SPEC_CTRL_ENTRY_FROM_PV /* Req: %rsp=regs/cpuinfo Clob: acd */

I think the use of the PV variant here requires a comment.

> @@ -142,6 +146,13 @@ ENTRY(compat_restore_all_guest)
>          .popsection
>          or    $X86_EFLAGS_IF,%r11
>          mov   %r11d,UREGS_eflags(%rsp)
> +
> +        mov VCPU_arch_msr(%rbx), %rax
> +        mov VCPUMSR_spec_ctrl_raw(%rax), %eax

I understand that it's code like this which is missing from the SVM
and VMX exit paths.

> @@ -729,6 +760,9 @@ ENTRY(nmi)
>  handle_ist_exception:
>          SAVE_ALL CLAC
>  
> +        SPEC_CTRL_ENTRY_FROM_INTR /* Req: %rsp=regs, Clob: acd */
> +        /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */

Following my considerations towards alternative patching, to
eliminate as much overhead as possible from the Meltdown
band-aid in case it is being disabled, I'm rather hesitant to see any
patchable code introduced into the NMI/#MC entry paths
without the patching logic first being made safe in this regard.
Exceptions coming here aren't very frequent (except perhaps on
hardware about to die), so the path isn't performance critical.
Therefore I think we should avoid any patching here, and use
plain conditionals instead. This is in fact one of the reasons why
I didn't want to macro-ize the assembly additions done in the
Meltdown band-aid.

I do realize that this also affects the exit-to-Xen path, where I
agree conditionals are less desirable.

> --- /dev/null
> +++ b/xen/include/asm-x86/spec_ctrl_asm.h
> @@ -0,0 +1,227 @@
> +/******************************************************************************
> + * include/asm-x86/spec_ctrl.h
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Copyright (c) 2017-2018 Citrix Systems Ltd.
> + */
> +
> +#ifndef __X86_SPEC_CTRL_ASM_H__
> +#define __X86_SPEC_CTRL_ASM_H__
> +
> +#ifdef __ASSEMBLY__
> +#include <asm/msr.h>
> +
> +/*
> + * Saving and restoring MSR_SPEC_CTRL state is a little tricky.
> + *
> + * We want the guests choice of SPEC_CTRL while in guest context, and IBRS
> + * (set or clear, depending on the hardware) while running in Xen context.

To me this continues to be somewhat strange to read. One possibility
to improve it would be to move the opening parenthesis after "set or
clear", another would be to say "..., and Xen's choice (set or ...".
But you're the native speaker, so I'm open to other alternatives you
may consider even better.

> + * Therefore, a simplistic algorithm is:
> + *
> + *  - Set/clear IBRS on entry to Xen
> + *  - Set the guests' choice on exit to guest
> + *  - Leave SPEC_CTRL unchanged on exit to xen
> + *
> + * There are two complicating factors:
> + *  1) HVM guests can have direct access to the MSR, so it can change
> + *     behind Xen's back.
> + *  2) An NMI or MCE can interrupt at any point, including early in the entry
> + *     path, or late in the exit path after restoring the guest value.  This
> + *     will corrupt the guest value.
> + *
> + * Factor 1 is dealt with by relying on NMIs/MCEs being blocked immediately
> + * after VMEXIT.  The VMEXIT-specific code reads MSR_SPEC_CTRL and updates
> + * current before loading Xen's MSR_SPEC_CTRL setting.
> + *
> + * Factor 2 is harder.  We maintain a shadow_spec_ctrl value, and
> + * use_shadow_spec_ctrl boolean per cpu.  The synchronous use is:
> + *
> + *  1) Store guest value in shadow_spec_ctrl
> + *  2) Set use_shadow_spec_ctrl boolean
> + *  3) Load guest value into MSR_SPEC_CTRL
> + *  4) Exit to guest
> + *  5) Entry from guest
> + *  6) Clear use_shadow_spec_ctrl boolean

7) Load Xen value into MSR_SPEC_CTRL

(It is important to spell out that the MSR write in _both_ cases
needs to come after the write of the boolean.)

> +.macro DO_SPEC_CTRL_ENTRY_FROM_VMEXIT ibrs_val:req
> +/*
> + * Requires %rbx=current, %rsp=regs/cpuinfo
> + * Clobbers %rax, %rcx, %rdx
> + *
> + * The common case is that a guest has direct access to MSR_SPEC_CTRL, at
> + * which point we need to save the guest value before setting IBRS for Xen.
> + * Unilaterally saving the guest value is shorter and faster than checking.
> + */
> +    mov $MSR_SPEC_CTRL, %ecx
> +    rdmsr
> +
> +    /* Stash the value from hardware. */
> +    mov VCPU_arch_msr(%rbx), %rdx
> +    mov %al, VCPUMSR_spec_ctrl_raw(%rdx)

Patch 3 makes this a 32-bit field now - why still %al?

> +.macro DO_SPEC_CTRL_ENTRY maybexen:req ibrs_val:req
> +/*
> + * Requires %rsp=regs (also cpuinfo if !maybexen)
> + * Clobbers %rax, %rcx, %rdx
> + *
> + * PV guests can't update MSR_SPEC_CTRL behind Xen's back, so no need to read
> + * it back.  Entries from guest context need to clear SPEC_CTRL shadowing,
> + * while entries from Xen must leave shadowing in its current state.
> + */
> +    mov $MSR_SPEC_CTRL, %ecx
> +
> +    .if \maybexen
> +        testb $3, UREGS_cs(%rsp)
> +        jz .L\@_entry_from_xen
> +    .endif
> +
> +    /*
> +     * Clear SPEC_CTRL shadowing *before* loading Xen's value.  If entering
> +     * from a possibly-xen context, %rsp doesn't necessarily alias the cpuinfo
> +     * block so calculate the position directly.
> +     */
> +    .if \maybexen
> +        GET_STACK_END(dx)

In almost all cases (and perhaps in all ones with maybexen set) the
macro invocation sits right ahead of a GET_STACK_BASE(). If you
swapped them and allowed the register to be passed in, you could
avoid this load.

> +.macro DO_SPEC_CTRL_EXIT_TO_GUEST
> +/*
> + * Requires %eax=spec_ctrl, %rsp=regs/cpuinfo
> + * Clobbers %rax, %rcx, %rdx

%rax isn't really being clobbered anymore. The clobber annotations
on all use sites appear to be similarly stale.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

