
Re: [Xen-devel] [PATCH] x86/build: Use new .nops directive when available



>>> On 16.08.18 at 13:48, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 16/08/18 12:34, Jan Beulich wrote:
>>>>> On 16.08.18 at 12:42, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 16/08/18 10:55, Roger Pau Monné wrote:
>>>> On Wed, Aug 15, 2018 at 06:57:38PM +0100, Andrew Cooper wrote:
>>>>> @@ -112,6 +125,11 @@ static void __init arch_init_ideal_nops(void)
>>>>>              ideal_nops = k8_nops;
>>>>>          break;
>>>>>      }
>>>>> +
>>>>> +#ifdef HAVE_AS_NOP_DIRECTIVE
>>>>> +    if ( memcmp(ideal_nops[ASM_NOP_MAX], toolchain_nops, ASM_NOP_MAX) == 0 )
>>>>> +        toolchain_nops_are_ideal = true;
>>>>> +#endif
>>>> You are only comparing that the biggest nop instruction (9 bytes
>>>> AFAICT) generated by the assembler is what Xen believes to be the more
>>>> optimized version. What about shorter nops?
>>> They are all variations on a theme.
>>>
>>> For P6 nops, it's the 0f 1f root which is important, which takes a modrm
>>> byte.  Traditionally, it's always encoded with eax and uses redundant
>>> memory encodings for longer instructions.
>>>
>>> I can't think of any way of detecting the optimised nops if the
>>> toolchain starts using alternative registers in the encoding, but I
>>> expect this case won't happen in practice.
>> It's not just the register encoding, but also the maximum single-insn
>> length that gets generated. Recall that until not very long ago we
>> had up to 8-byte NOP insns only? The view on the mod (as in ModRM)
>> usage may vary over time, as may the view on which or how many
>> prefixes are reasonable to have.
> 
> Strictly speaking, the ORM says "encode the least-recently live
> register", because all the hint nops are still subject to reg/reg
> dependencies.
> 
> However, we definitely can't take advantage of this, nor can the
> assembler.

Well, _we_ could, at least when tail-padding patched-in insns: I very
much hope we know what we've patched in, and hence at least which
registers were used most recently. This is not readily available today,
but could be made so.

>  Compilers can't either, because the exact length of the nop
> depends on other relocations.  Furthermore, the perf improvement from
> doing this would be fractional.

True.
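
On the earlier point about only the longest NOP being compared: a check
over every length could look roughly like the sketch below. This is
illustrative only and makes assumptions the patch does not: it assumes
a per-length table of toolchain-generated NOPs is available, and
NOP_MAX, p6_nops[][] and nops_match_all_lengths() are hypothetical
stand-ins rather than the names used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NOP_MAX 9 /* stand-in for ASM_NOP_MAX */

/* P6-style NOPs, one row per length; row (len - 1) holds the len-byte NOP. */
static const unsigned char p6_nops[NOP_MAX][NOP_MAX] = {
    { 0x90 },
    { 0x66, 0x90 },
    { 0x0f, 0x1f, 0x00 },
    { 0x0f, 0x1f, 0x40, 0x00 },
    { 0x0f, 0x1f, 0x44, 0x00, 0x00 },
    { 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
    { 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
    { 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
    { 0x66, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/*
 * Hypothetical check: true only if both tables agree for every length from
 * 1 to max.  Only the first len bytes of each row are compared, so unused
 * trailing bytes in a row are ignored.
 */
static bool nops_match_all_lengths(const unsigned char ideal[][NOP_MAX],
                                   const unsigned char tool[][NOP_MAX],
                                   size_t max)
{
    assert(max <= NOP_MAX);

    for ( size_t len = 1; len <= max; len++ )
        if ( memcmp(ideal[len - 1], tool[len - 1], len) != 0 )
            return false;

    return true;
}
```

This would catch a toolchain that differs only in one of the shorter
encodings, at the cost of needing the assembler output for each length
separately rather than a single ASM_NOP_MAX-byte blob.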

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

