[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 4/5] x86/boot: Simplify pagetable manipulation loops



On 17.01.2020 21:42, Andrew Cooper wrote:
> For __page_tables_{start,end} and L3 bootmap initialisation, the logic is
> unnecesserily complicated owing to its attempt to use the LOOP instruction,
> which results in an off-by-8 memory address owing to LOOP's termination
> condition.
> 
> Rewrite both loops for improved clarity and speed.
> 
> Misc notes:
>  * TEST $IMM, MEM can't macrofuse.  The loop has 0x1200 iterations, so pull
>    the $_PAGE_PRESENT constant out into a spare register to turn the TEST into
>    its %REG, MEM form, which can macrofuse.
>  * Avoid the use of %fs-relative references.  %esi-relative is the more common
>    form in the code, and doesn't suffer an address generation overhead.
>  * Avoid LOOP.  CMP/JB isn't microcoded and faster to execute in all cases.
>  * For a 4 interation trivial loop, even compilers unroll these.  The
>    generated code size is a fraction larger, but this is init and the asm is
>    far easier to follow.
>  * Reposition the l2=>l1 bootmap construction so the asm reads in pagetable
>    level order.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
with two remarks/questions, but leaving it up to you whether
you want to adjust the code:

> --- a/xen/arch/x86/boot/head.S
> +++ b/xen/arch/x86/boot/head.S
> @@ -662,11 +662,17 @@ trampoline_setup:
>          mov     %edx,sym_fs(boot_tsc_stamp)+4
>  
>          /* Relocate pagetables to point at Xen's current location in memory. 
> */
> -        mov     $((__page_tables_end-__page_tables_start)/8),%ecx
> -1:      testl   $_PAGE_PRESENT,sym_fs(__page_tables_start)-8(,%ecx,8)
> +        mov     $_PAGE_PRESENT, %edx
> +        lea     sym_esi(__page_tables_start), %eax
> +        lea     sym_esi(__page_tables_end), %edi
> +
> +1:      testb   %dl, (%eax)  /* if page present */

When it's an immediate, using TESTB is generally helpful because
there's no (sign- or whatever-)extended immediate form of it.
When using a register, I think it would generally be better to
use native size, even if for register reads the partial register
access penalty may (today) be zero.

> @@ -701,22 +707,27 @@ trampoline_setup:
>          cmp     %edx, %ecx
>          jbe     1b
>  
> -        /* Initialize L3 boot-map page directory entries. */
> -        lea     
> __PAGE_HYPERVISOR+(L2_PAGETABLE_ENTRIES*8)*3+sym_esi(l2_bootmap),%eax
> -        mov     $4,%ecx
> -1:      mov     %eax,sym_fs(l3_bootmap)-8(,%ecx,8)
> -        sub     $(L2_PAGETABLE_ENTRIES*8),%eax
> -        loop    1b
> -
> -        /* Map the permanent trampoline page into l{1,2}_bootmap[]. */
> +        /* Map 4x l2_bootmap[] into l3_bootmap[0...3] */
> +        lea     __PAGE_HYPERVISOR + sym_esi(l2_bootmap), %eax
> +        mov     $PAGE_SIZE, %edx
> +        mov     %eax, 0  + sym_esi(l3_bootmap)
> +        add     %edx, %eax
> +        mov     %eax, 8  + sym_esi(l3_bootmap)
> +        add     %edx, %eax
> +        mov     %eax, 16 + sym_esi(l3_bootmap)
> +        add     %edx, %eax
> +        mov     %eax, 24 + sym_esi(l3_bootmap)

It took me a moment to realize the code is correct despite there
not being any mention of PAGE_SIZE between each of the MOVs. As
you don't view code size as a (primary) concern, perhaps worth
using

        add     $PAGE_SIZE, %eax

everywhere, the more that this has a special, ModR/M-less
encoding?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.