|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 4/5] x86/boot: Simplify pagetable manipulation loops
On 17.01.2020 21:42, Andrew Cooper wrote:
> For __page_tables_{start,end} and L3 bootmap initialisation, the logic is
> unnecesserily complicated owing to its attempt to use the LOOP instruction,
> which results in an off-by-8 memory address owing to LOOP's termination
> condition.
>
> Rewrite both loops for improved clarity and speed.
>
> Misc notes:
> * TEST $IMM, MEM can't macrofuse. The loop has 0x1200 iterations, so pull
> the $_PAGE_PRESENT constant out into a spare register to turn the TEST into
> its %REG, MEM form, which can macrofuse.
> * Avoid the use of %fs-relative references. %esi-relative is the more common
> form in the code, and doesn't suffer an address generation overhead.
> * Avoid LOOP. CMP/JB isn't microcoded and faster to execute in all cases.
> * For a 4 interation trivial loop, even compilers unroll these. The
> generated code size is a fraction larger, but this is init and the asm is
> far easier to follow.
> * Reposition the l2=>l1 bootmap construction so the asm reads in pagetable
> level order.
>
> No functional change.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
with two remarks/questions, but leaving it up to you whether
you want to adjust the code:
> --- a/xen/arch/x86/boot/head.S
> +++ b/xen/arch/x86/boot/head.S
> @@ -662,11 +662,17 @@ trampoline_setup:
> mov %edx,sym_fs(boot_tsc_stamp)+4
>
> /* Relocate pagetables to point at Xen's current location in memory.
> */
> - mov $((__page_tables_end-__page_tables_start)/8),%ecx
> -1: testl $_PAGE_PRESENT,sym_fs(__page_tables_start)-8(,%ecx,8)
> + mov $_PAGE_PRESENT, %edx
> + lea sym_esi(__page_tables_start), %eax
> + lea sym_esi(__page_tables_end), %edi
> +
> +1: testb %dl, (%eax) /* if page present */
When it's an immediate, using TESTB is generally helpful because
there's no (sign- or whatever-)extended immediate form of it.
When using a register, I think it would generally be better to
use native size, even if for register reads the partial register
access penalty may (today) be zero.
> @@ -701,22 +707,27 @@ trampoline_setup:
> cmp %edx, %ecx
> jbe 1b
>
> - /* Initialize L3 boot-map page directory entries. */
> - lea
> __PAGE_HYPERVISOR+(L2_PAGETABLE_ENTRIES*8)*3+sym_esi(l2_bootmap),%eax
> - mov $4,%ecx
> -1: mov %eax,sym_fs(l3_bootmap)-8(,%ecx,8)
> - sub $(L2_PAGETABLE_ENTRIES*8),%eax
> - loop 1b
> -
> - /* Map the permanent trampoline page into l{1,2}_bootmap[]. */
> + /* Map 4x l2_bootmap[] into l3_bootmap[0...3] */
> + lea __PAGE_HYPERVISOR + sym_esi(l2_bootmap), %eax
> + mov $PAGE_SIZE, %edx
> + mov %eax, 0 + sym_esi(l3_bootmap)
> + add %edx, %eax
> + mov %eax, 8 + sym_esi(l3_bootmap)
> + add %edx, %eax
> + mov %eax, 16 + sym_esi(l3_bootmap)
> + add %edx, %eax
> + mov %eax, 24 + sym_esi(l3_bootmap)
It took me a moment to realize the code is correct despite there
not being any mention of PAGE_SIZE between each of the MOVs. As
you don't view code size as a (primary) concern, perhaps worth
using
add $PAGE_SIZE, %eax
everywhere, the more that this has a special, ModR/M-less
encoding?
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |