[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self



On Mon, Dec 5, 2016 at 11:52 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Mon, Dec 05, 2016 at 01:32:43PM -0800, Andy Lutomirski wrote:
>> Aside from being excessively slow, CPUID is problematic: Linux runs
>> on a handful of CPUs that don't have CPUID.  Use IRET-to-self
>> instead.  IRET-to-self works everywhere, so it makes testing easy.
>>
>> For reference, On my laptop, IRET-to-self is ~110ns,
>> CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
>> and MOV-to-CR2 is ~42ns.
>>
>> While we're at it: sync_core() serves a very specific purpose.
>> Document it.
>>
>> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
>> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
>> ---
>>  arch/x86/include/asm/processor.h | 77 
>> ++++++++++++++++++++++++++++------------
>>  1 file changed, 55 insertions(+), 22 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/processor.h 
>> b/arch/x86/include/asm/processor.h
>> index 64fbc937d586..201a956e345f 100644
>> --- a/arch/x86/include/asm/processor.h
>> +++ b/arch/x86/include/asm/processor.h
>> @@ -590,33 +590,66 @@ static __always_inline void cpu_relax(void)
>>
>>  #define cpu_relax_lowlatency() cpu_relax()
>>
>> -/* Stop speculative execution and prefetching of modified code. */
>> +/*
>> + * This function forces the icache and prefetched instruction stream to
>> + * catch up with reality in two very specific cases:
>> + *
>> + *  a) Text was modified using one virtual address and is about to be 
>> executed
>> + *     from the same physical page at a different virtual address.
>> + *
>> + *  b) Text was modified on a different CPU, may subsequently be
>> + *     executed on this CPU, and you want to make sure the new version
>> + *     gets executed.  This generally means you're calling this in a IPI.
>> + *
>> + * If you're calling this for a different reason, you're probably doing
>> + * it wrong.
>
> "... and think hard before you call this - it is slow."
>
> I'd add that now that it is even slower than CPUID.

But only barely.  And it's hugely faster than CPUID under KVM or
similar.  And it works on all CPUs.

>
>> + */
>>  static inline void sync_core(void)
>>  {
>> -     int tmp;
>> -
>> -#ifdef CONFIG_X86_32
>>       /*
>> -      * Do a CPUID if available, otherwise do a jump.  The jump
>> -      * can conveniently enough be the jump around CPUID.
>> +      * There are quite a few ways to do this.  IRET-to-self is nice
>> +      * because it works on every CPU, at any CPL (so it's compatible
>> +      * with paravirtualization), and it never exits to a hypervisor.
>> +      * The only down sides are that it's a bit slow (it seems to be
>> +      * a bit more than 2x slower than the fastest options) and that
>> +      * it unmasks NMIs.
>
> Ewww, I hadn't thought of that angle. Aren't we going to get in all
> kinds of hard to debug issues due to that couple of cycles window of
> unmasked NMIs?
>
> We sync_core in some NMI handler and then right in the middle of it we
> get another NMI. Yeah, we have the nested NMI stuff still but I'd like
> to avoid complications if possible.

I'll counter with:

1. Why on earth would an NMI call sync_core()?

2. We have very careful and code to handle this issue, and NMIs really
do cause page faults which have exactly the same problem.

So I'm not too worried.

>
>> The "push %cs" is needed because, in
>> +      * paravirtual environments, __KERNEL_CS may not be a valid CS
>> +      * value when we do IRET directly.
>> +      *
>> +      * In case NMI unmasking or performance every becomes a problem,
>> +      * the next best option appears to be MOV-to-CR2 and an
>> +      * unconditional jump.  That sequence also works on all CPUs,
>> +      * but it will fault at CPL3.
>
> Does it really have to be non-priviledged?

Unless we want to declare lguest unsupported, delete it from the tree,
or, sigh, actually maintain it, then yes :(  native_write_cr2() would
work on Xen, but it's slow.

>
> If not, there are a couple more serializing insns:
>
> "• Privileged serializing instructions — INVD, INVEPT, INVLPG,
> INVVPID, LGDT, LIDT, LLDT, LTR, MOV (to control register, with the
> exception of MOV CR83), MOV (to debug register), WBINVD, and WRMSR"
>
> What about INVD? It is expensive too :-)

Only if you write the patch and label it:

Snickered-at-by: Andy Lutomirski <luto@xxxxxxxxxx>

>
> Can't we use, MOV %dr or so? With previously saving/restoring the reg?
>
> WBINVD could be another candidate, albeit a big hammer.
>
> WRMSR maybe too. If it faults, it's fine too because then you have the
> IRET automagically. Hell, we could even make it fault on purpose by
> writing into an invalid MSR but then we're back to the unmasking NMIs...
> :-\

I still like MOV-to-CR2 better than all of these.

--Andy

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.