
Re: [Xen-devel] [PATCH v2 22/23] x86: make Xen early boot code relocatable



On Mon, Jul 20, 2015 at 04:29:17PM +0200, Daniel Kiper wrote:
> Every multiboot protocol (regardless of version) compatible image must
> specify its load address (in ELF or multiboot header). Multiboot protocol
> compatible loader have to load image at specified address. However, there
> is no guarantee that the requested memory region (in case of Xen it starts
> at 1 MiB and ends at 17 MiB) where image should be loaded initially is a RAM
> and it is free (legacy BIOS platforms are merciful for Xen but I found at
> least one EFI platform on which Xen load address conflicts with EFI boot
> services; it is Dell PowerEdge R820 with latest firmware). To cope with
> that problem we must make Xen early boot code relocatable. This patch does
> that. However, it does not add multiboot2 protocol interface which is done
> in next patch.

s/next patch/"x86: add multiboot2 protocol support for relocatable image"/
> 
> This patch changes following things:
>   - default load address is changed from 1 MiB to 2 MiB; I did that because
>     initial page tables are using 2 MiB huge pages and this way required
>     updates for them are quite easy; it means that e.g. we avoid spacial
>     cases for beginning and end of required memory region if it live at
>     address not aligned to 2 MiB,
>   - %ebp register is used as a storage for Xen image base address; this way
>     we can get this value very quickly if it is needed; however, %ebp register
>     is not used directly to access a given memory region,
>   - %fs register is filled with segment descriptor which describes memory region
>     with Xen image (it could be relocated or not); it is used to access some of

'memory region with Xen image'? I am not sure I follow.

Perhaps:
segment descriptor whose base (offset 0) is the Xen image base (_start).


>     Xen data in early boot code; potentially we can use above mentioned segment
>     descriptor to access data using %ds:%esi and/or %es:%esi (e.g. movs*); however,
>     I think that it could unnecessarily obfuscate code (e.g. we need at least
>     to operations to reload a given segment descriptor) and current solution

s/to operations/two operations/
>     looks quite optimal.
> 
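
To check my understanding of the %ebp scheme and the two macros in head.S,
here is a minimal C sketch of the address arithmetic as I read it. The
SKETCH_* names and the helper functions are mine, not part of the patch, and
the __XEN_VIRT_START value is an assumption; only 0x200000 comes from the
patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; only 0x200000 appears in the patch. */
#define SKETCH_XEN_VIRT_START     0xffff82d080000000ULL
#define SKETCH_XEN_IMG_PHYS_START 0x200000ULL
#define SKETCH_XEN_IMG_OFFSET     0x200000ULL

/* Mirrors sym_offset(): link-time distance from __XEN_VIRT_START. */
static uint64_t sketch_sym_offset(uint64_t sym_virt)
{
    return sym_virt - SKETCH_XEN_VIRT_START;
}

/* Mirrors the new sym_phys(): physical address at the default load address. */
static uint64_t sketch_sym_phys(uint64_t sym_virt)
{
    return sym_virt - SKETCH_XEN_VIRT_START
           + SKETCH_XEN_IMG_PHYS_START - SKETCH_XEN_IMG_OFFSET;
}

/* What "lea sym_offset(sym)(%ebp),%reg" yields once %ebp holds the base. */
static uint32_t sketch_runtime_addr(uint32_t image_base, uint64_t sym_virt)
{
    return image_base + (uint32_t)sketch_sym_offset(sym_virt);
}

/* Example with a made-up symbol 0x201000 bytes above the virtual start. */
static void sketch_addr_example(void)
{
    uint64_t sym = SKETCH_XEN_VIRT_START + 0x201000ULL;

    assert(sketch_sym_phys(sym) == sketch_sym_offset(sym));
    assert(sketch_runtime_addr(0x4000000u, sym) == 0x4201000u);
}
```

If that reading is right, sym_phys() and sym_offset() coincide as long as
XEN_IMG_PHYS_START equals XEN_IMG_OFFSET, and only the value held in %ebp
changes when the loader places the image elsewhere.
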
> Signed-off-by: Daniel Kiper <daniel.kiper@xxxxxxxxxx>
> ---
>  xen/arch/x86/Makefile          |    6 +-
>  xen/arch/x86/Rules.mk          |    4 +
>  xen/arch/x86/boot/head.S       |  165 ++++++++++++++++++++++++++++++----------
>  xen/arch/x86/boot/trampoline.S |   11 ++-
>  xen/arch/x86/boot/wakeup.S     |    6 +-
>  xen/arch/x86/boot/x86_64.S     |   34 ++++-----
>  xen/arch/x86/setup.c           |   33 ++++----
>  xen/arch/x86/x86_64/mm.c       |    2 +-
>  xen/arch/x86/xen.lds.S         |    2 +-
>  xen/include/asm-x86/config.h   |    3 +
>  xen/include/asm-x86/page.h     |    2 +-
>  11 files changed, 182 insertions(+), 86 deletions(-)
> 
> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
> index 82c5a93..93069a8 100644
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -72,8 +72,10 @@ efi-$(x86_64) := $(shell if [ ! -r $(BASEDIR)/include/xen/compile.h -o \
>                           echo '$(TARGET).efi'; fi)
>  
>  $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
> -     ./boot/mkelf32 $(TARGET)-syms $(TARGET) 0x100000 \
> -     `$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
> +#    THIS IS UGLY HACK! PLEASE DO NOT COMPLAIN. I WILL FIX IT IN NEXT RELEASE.

OK :-)

> +     ./boot/mkelf32 $(TARGET)-syms $(TARGET) $(XEN_IMG_PHYS_START) 0xffff82d081000000
> +#    ./boot/mkelf32 $(TARGET)-syms $(TARGET) 0x100000 \
> +#    `$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
>  
>  
>  ALL_OBJS := $(BASEDIR)/arch/x86/boot/built_in.o $(BASEDIR)/arch/x86/efi/built_in.o $(ALL_OBJS)
> diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
> index 4a04a8a..7ccb8a0 100644
> --- a/xen/arch/x86/Rules.mk
> +++ b/xen/arch/x86/Rules.mk
> @@ -15,6 +15,10 @@ HAS_GDBSX := y
>  HAS_PDX := y
>  xenoprof := y
>  
> +XEN_IMG_PHYS_START = 0x200000
> +
> +CFLAGS += -DXEN_IMG_PHYS_START=$(XEN_IMG_PHYS_START)
> +
>  CFLAGS += -I$(BASEDIR)/include 
>  CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-generic
>  CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-default
> diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
> index 3f1054d..d484f68 100644
> --- a/xen/arch/x86/boot/head.S
> +++ b/xen/arch/x86/boot/head.S
> @@ -12,13 +12,15 @@
>          .text
>          .code32
>  
> -#define sym_phys(sym)     ((sym) - __XEN_VIRT_START)
> +#define sym_phys(sym)     ((sym) - __XEN_VIRT_START + XEN_IMG_PHYS_START - XEN_IMG_OFFSET)
> +#define sym_offset(sym)   ((sym) - __XEN_VIRT_START)
>  
>  #define BOOT_CS32        0x0008
>  #define BOOT_CS64        0x0010
>  #define BOOT_DS          0x0018
>  #define BOOT_PSEUDORM_CS 0x0020
>  #define BOOT_PSEUDORM_DS 0x0028
> +#define BOOT_FS          0x0030
>  
>  #define MB2_HT(name)      (MULTIBOOT2_HEADER_TAG_##name)
>  #define MB2_TT(name)      (MULTIBOOT2_TAG_TYPE_##name)
> @@ -105,12 +107,13 @@ multiboot1_header_end:
>  
>          .word   0
>  gdt_boot_descr:
> -        .word   6*8-1
> -        .long   sym_phys(trampoline_gdt)
> +        .word   7*8-1
> +gdt_boot_descr_addr:
> +        .long   sym_offset(trampoline_gdt)
>          .long   0 /* Needed for 64-bit lgdt */
>  
>  cs32_switch_addr:
> -        .long   sym_phys(cs32_switch)
> +        .long   sym_offset(cs32_switch)
>          .word   BOOT_CS32
>  
>  .Lbad_cpu_msg: .asciz "ERR: Not a 64-bit CPU!"
> @@ -120,13 +123,13 @@ cs32_switch_addr:
>          .section .init.text, "ax", @progbits
>  
>  bad_cpu:
> -        mov     $(sym_phys(.Lbad_cpu_msg)),%esi # Error message
> +        lea     sym_offset(.Lbad_cpu_msg)(%ebp),%esi # Error message
>          jmp     print_err
>  not_multiboot:
> -        mov     $(sym_phys(.Lbad_ldr_msg)),%esi # Error message
> +        lea     sym_offset(.Lbad_ldr_msg)(%ebp),%esi # Error message
>          jmp     print_err
>  mb2_too_old:
> -        mov     $(sym_phys(.Lbad_mb2_ldr)),%esi # Error message
> +        lea     sym_offset(.Lbad_mb2_ldr)(%ebp),%esi # Error message
>  print_err:
>          mov     $0xB8000,%edi  # VGA framebuffer
>  1:      mov     (%esi),%bl
> @@ -151,6 +154,9 @@ print_err:
>  __efi64_start:
>          cld
>  
> +        /* Load default Xen image base address. */
> +        mov     $sym_phys(__image_base__),%ebp
> +
>          /* Check for Multiboot2 bootloader. */
>          cmp     $MULTIBOOT2_BOOTLOADER_MAGIC,%eax
>          je      efi_multiboot2_proto
> @@ -235,9 +241,11 @@ x86_32_switch:
>          cli
>  
>          /* Initialise GDT. */
> +        add     %ebp,gdt_boot_descr_addr(%rip)
>          lgdt    gdt_boot_descr(%rip)
>  
>          /* Reload code selector. */
> +        add     %ebp,cs32_switch_addr(%rip)
>          ljmpl   *cs32_switch_addr(%rip)
>  
>          .code32
> @@ -263,12 +271,8 @@ __start:
>          cld
>          cli
>  
> -        /* Initialise GDT and basic data segments. */
> -        lgdt    %cs:sym_phys(gdt_boot_descr)
> -        mov     $BOOT_DS,%ecx
> -        mov     %ecx,%ds
> -        mov     %ecx,%es
> -        mov     %ecx,%ss
> +        /* Load default Xen image base address. */
> +        mov     $sym_phys(__image_base__),%ebp
>  
>          /* Bootloaders may set multiboot{1,2}.mem_lower to a nonzero value. */
>          xor     %edx,%edx
> @@ -319,6 +323,19 @@ multiboot2_proto:
>          jmp     0b
>  
>  trampoline_bios_setup:
> +        mov     %ebp,%esi
> +
> +        /* Initialise GDT and basic data segments. */
> +        add     %ebp,sym_offset(gdt_boot_descr_addr)(%esi)
> +        lgdt    sym_offset(gdt_boot_descr)(%esi)
> +
> +        mov     $BOOT_DS,%ecx
> +        mov     %ecx,%ds
> +        mov     %ecx,%es
> +        mov     %ecx,%fs
> +        mov     %ecx,%gs
> +        mov     %ecx,%ss
> +


The non-EFI boot path is now:

start
 \- __start
     \- multiboot2_proto
     |    jmp trampoline_bios_setup
     |
     \-and if not MB2: jmp trampoline_bios_setup.


Here you set up the GDT and reload %ds, but earlier in this call chain
we already access memory through %ds, via:

__start+27>:        testb  $0x1,(%rbx)
__start+30>:        cmovne 0x4(%rbx),%edx

which is OK (MB1 says that %ds must cover the full 4GB).
But I wonder why the __start code reloaded the segments so early.
Was the bootloader not setting the proper segments?

Let me double-check what SYSLINUX's mboot.c32 does. Perhaps
it did something odd in the past.

>          /* Set up trampoline segment 64k below EBDA */
>          movzwl  0x40e,%ecx          /* EBDA segment */
>          cmp     $0xa000,%ecx        /* sanity check (high) */
> @@ -340,33 +357,58 @@ trampoline_bios_setup:
>          cmovb   %edx,%ecx           /* and use the smaller */
>  
>  trampoline_setup:

Would it make sense to add:

/* Gets called from EFI (from x86_32_switch) and legacy (see above) boot loaders. */

> +        mov     %ebp,%esi
> +
> +        /* Initialize 0-15 bits of BOOT_FS segment descriptor base address. */
> +        mov     %ebp,%edx
> +        shl     $16,%edx
> +        or      %edx,(sym_offset(trampoline_gdt)+BOOT_FS)(%esi)
> +
> +        /* Initialize 16-23 bits of BOOT_FS segment descriptor base address. */
> +        mov     %ebp,%edx
> +        shr     $16,%edx
> +        and     $0x000000ff,%edx
> +        or      %edx,(sym_offset(trampoline_gdt)+BOOT_FS+4)(%esi)
> +
> +        /* Initialize 24-31 bits of BOOT_FS segment descriptor base address. */
> +        mov     %ebp,%edx
> +        and     $0xff000000,%edx
> +        or      %edx,(sym_offset(trampoline_gdt)+BOOT_FS+4)(%esi)
> +
> +        /* Initialize %fs and later use it to access Xen data if possible. */
> +        mov     $BOOT_FS,%edx
> +        mov     %edx,%fs
> +

We just modified the GDT. Should we reload it (lgdt)?
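
For reference, this is how I read the three "or" operations above, as a
minimal C sketch. sk_set_desc_base() and sk_get_desc_base() are hypothetical
helper names of mine, not anything in the patch; the template value is the
0x0030 entry added to trampoline.S.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: OR a 32-bit base into an 8-byte segment descriptor
 * whose base fields are still zero, as the three "or" instructions do. */
static uint64_t sk_set_desc_base(uint64_t desc, uint32_t base)
{
    uint32_t lo = (uint32_t)desc;         /* descriptor bytes 0-3 */
    uint32_t hi = (uint32_t)(desc >> 32); /* descriptor bytes 4-7 */

    lo |= base << 16;            /* base bits 0-15  -> lo bits 16-31 */
    hi |= (base >> 16) & 0xffu;  /* base bits 16-23 -> hi bits 0-7   */
    hi |= base & 0xff000000u;    /* base bits 24-31 -> hi bits 24-31 */

    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: read the base back out of a descriptor. */
static uint32_t sk_get_desc_base(uint64_t desc)
{
    uint32_t lo = (uint32_t)desc;
    uint32_t hi = (uint32_t)(desc >> 32);

    return (lo >> 16) | ((hi & 0xffu) << 16) | (hi & 0xff000000u);
}

/* Example: the 0x0030 template from trampoline.S with a 2 MiB base. */
static void sk_desc_example(void)
{
    uint64_t d = sk_set_desc_base(0x00c0920000001000ULL, 0x200000u);

    assert(d == 0x00c0922000001000ULL);
    assert(sk_get_desc_base(d) == 0x200000u);
}
```

Since the base is ORed in place, this only gives the right result if the
base fields of the template are still zero when trampoline_setup runs.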

>          /* Reserve 64kb for the trampoline. */
>          sub     $0x1000,%ecx
>  
>          /* From arch/x86/smpboot.c: start_eip had better be page-aligned! */
>          xor     %cl, %cl
>          shl     $4, %ecx
> -        mov     %ecx,sym_phys(trampoline_phys)
> +        mov     %ecx,%fs:sym_offset(trampoline_phys)
> +
> +        /* Save Xen image base address for later use. */
> +        mov     %ebp,%fs:sym_offset(xen_img_base_phys_addr)
>  
>          /* Save the Multiboot info struct (after relocation) for later use. */
> -        mov     $sym_phys(cpu0_stack)+1024,%esp
> +        lea     (sym_offset(cpu0_stack)+1024)(%ebp),%esp
>          push    %eax                /* Multiboot magic. */
>          push    %ebx                /* Multiboot information address. */
>          push    %ecx                /* Boot trampoline address. */
>          call    reloc
>          add     $12,%esp            /* Remove reloc() args from stack. */
> -        mov     %eax,sym_phys(multiboot_ptr)
> +        mov     %eax,%fs:sym_offset(multiboot_ptr)
>  
>          /*
>           * Do not zero BSS on EFI platform here.
>           * It was initialized earlier.
>           */
> -        cmpb    $1,sym_phys(skip_realmode)
> +        cmpb    $1,%fs:sym_offset(skip_realmode)
>          je      1f
>  
>          /* Initialize BSS (no nasty surprises!). */
> -        mov     $sym_phys(__bss_start),%edi
> -        mov     $sym_phys(__bss_end),%ecx
> +        lea     sym_offset(__bss_start)(%ebp),%edi
> +        lea     sym_offset(__bss_end)(%ebp),%ecx
>          sub     %edi,%ecx
>          shr     $2,%ecx
>          xor     %eax,%eax
> @@ -381,8 +423,8 @@ trampoline_setup:
>          jbe     1f
>          mov     $0x80000001,%eax
>          cpuid
> -1:      mov     %edx,sym_phys(cpuid_ext_features)
> -        mov     %edx,sym_phys(boot_cpu_data)+CPUINFO_FEATURE_OFFSET(X86_FEATURE_LM)
> +1:      mov     %edx,%fs:sym_offset(cpuid_ext_features)
> +        mov     %edx,%fs:(sym_offset(boot_cpu_data)+CPUINFO_FEATURE_OFFSET(X86_FEATURE_LM))
>  
>          /* Check for availability of long mode. */
>          bt      $X86_FEATURE_LM & 0x1f,%edx
> @@ -390,72 +432,111 @@ trampoline_setup:
>  
>          /* Stash TSC to calculate a good approximation of time-since-boot */
>          rdtsc
> -        mov     %eax,sym_phys(boot_tsc_stamp)
> -        mov     %edx,sym_phys(boot_tsc_stamp+4)
> +        mov     %eax,%fs:sym_offset(boot_tsc_stamp)
> +        mov     %edx,%fs:sym_offset(boot_tsc_stamp+4)
>  
> -        /* Initialise L2 boot-map page table entries (16MB). */
> -        mov     $sym_phys(l2_bootmap),%edx
> -        mov     $PAGE_HYPERVISOR|_PAGE_PSE,%eax
> -        mov     $8,%ecx
> +        /* Update frame addreses in page tables. */
> +        lea     sym_offset(__page_tables_start)(%ebp),%edx
> +        mov     $((__page_tables_end-__page_tables_start)/8),%ecx
> +1:      testl   $_PAGE_PRESENT,(%edx)
> +        jz      2f
> +        add     %ebp,(%edx)
> +2:      add     $8,%edx
> +        loop    1b
> +
> +        /* Initialise L2 boot-map page table entries (14MB). */
> +        lea     sym_offset(l2_bootmap)(%ebp),%edx
> +        lea     sym_offset(start)(%ebp),%eax
> +        and     $~((1<<L2_PAGETABLE_SHIFT)-1),%eax
> +        mov     %eax,%ebx
> +        shr     $(L2_PAGETABLE_SHIFT-3),%ebx
> +        and     $(L2_PAGETABLE_ENTRIES*4*8-1),%ebx
> +        add     %ebx,%edx
> +        add     $(PAGE_HYPERVISOR|_PAGE_PSE),%eax
> +        mov     $7,%ecx
>  1:      mov     %eax,(%edx)
>          add     $8,%edx
>          add     $(1<<L2_PAGETABLE_SHIFT),%eax
>          loop    1b
> +
>          /* Initialise L3 boot-map page directory entry. */
> -        mov     $sym_phys(l2_bootmap)+__PAGE_HYPERVISOR,%eax
> -        mov     %eax,sym_phys(l3_bootmap) + 0*8
> +        lea     (sym_offset(l2_bootmap)+__PAGE_HYPERVISOR)(%ebp),%eax
> +        lea     sym_offset(l3_bootmap)(%ebp),%ebx
> +        mov     $4,%ecx
> +1:      mov     %eax,(%ebx)
> +        add     $8,%ebx
> +        add     $(L2_PAGETABLE_ENTRIES*8),%eax
> +        loop    1b
> +
> +        /* Initialise L2 direct map page table entries (14MB). */
> +        lea     sym_offset(l2_identmap)(%ebp),%edx
> +        lea     sym_offset(start)(%ebp),%eax
> +        and     $~((1<<L2_PAGETABLE_SHIFT)-1),%eax
> +        mov     %eax,%ebx
> +        shr     $(L2_PAGETABLE_SHIFT-3),%ebx
> +        and     $(L2_PAGETABLE_ENTRIES*4*8-1),%ebx
> +        add     %ebx,%edx
> +        add     $(PAGE_HYPERVISOR|_PAGE_PSE),%eax
> +        mov     $7,%ecx
> +1:      mov     %eax,(%edx)
> +        add     $8,%edx
> +        add     $(1<<L2_PAGETABLE_SHIFT),%eax
> +        loop    1b
> +
>          /* Hook 4kB mappings of first 2MB of memory into L2. */
> -        mov     $sym_phys(l1_identmap)+__PAGE_HYPERVISOR,%edi
> -        mov     %edi,sym_phys(l2_xenmap)
> -        mov     %edi,sym_phys(l2_bootmap)
> +        lea     (sym_offset(l1_identmap)+__PAGE_HYPERVISOR)(%ebp),%edi
> +        mov     %edi,%fs:sym_offset(l2_bootmap)

But not to l2_xenmap?
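
Separately, to check my reading of the new L2 boot-map and direct-map loops
above, here is a minimal C sketch of the slot computation and the seven 2 MiB
entries. All SK_* names are mine, SK_FLAGS is an illustrative value and not
Xen's actual PAGE_HYPERVISOR|_PAGE_PSE, and the bounds check is something I
added that the assembly does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_L2_SHIFT   21      /* 2 MiB superpages */
#define SK_L2_ENTRIES 512     /* entries per L2 table */
#define SK_L2_TABLES  4       /* the map is 4 * L2_PAGETABLE_ENTRIES long */
#define SK_FLAGS      0xe3ULL /* illustrative flags only (assumption) */

/* Index of the first L2 entry used for an image starting at start_phys. */
static size_t sk_l2_first_slot(uint32_t start_phys)
{
    uint32_t aligned = start_phys & ~((1u << SK_L2_SHIFT) - 1);
    uint32_t byte_off = (aligned >> (SK_L2_SHIFT - 3)) &
                        (SK_L2_ENTRIES * SK_L2_TABLES * 8 - 1);

    return byte_off / 8;
}

/* Write seven consecutive 2 MiB mappings (14 MiB) starting at that slot. */
static void sk_fill_l2(uint64_t *l2, uint32_t start_phys)
{
    uint64_t addr = start_phys & ~((1u << SK_L2_SHIFT) - 1);
    size_t slot = sk_l2_first_slot(start_phys);

    for (size_t i = 0; i < 7; i++) {
        if (slot + i >= (size_t)SK_L2_ENTRIES * SK_L2_TABLES)
            break; /* added guard; not present in the assembly */
        l2[slot + i] = addr | SK_FLAGS;
        addr += 1ULL << SK_L2_SHIFT;
    }
}

/* Example: default 2 MiB base lands in slot 1, leaving slot 0 untouched. */
static void sk_l2_example(void)
{
    uint64_t l2[SK_L2_ENTRIES * SK_L2_TABLES] = {0};

    sk_fill_l2(l2, 0x200000u);
    assert(l2[0] == 0 && l2[1] == 0x2000e3ULL && l2[7] == 0xe000e3ULL);
}
```

If this matches the intent, slot 0 is left for the 4kB l1_identmap hook with
the default base, which is why I am asking about l2_xenmap above.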

>  
>          /* Apply relocations to bootstrap trampoline. */
> -        mov     sym_phys(trampoline_phys),%edx
> -        mov     $sym_phys(__trampoline_rel_start),%edi
> +        mov     %fs:sym_offset(trampoline_phys),%edx
> +        lea     sym_offset(__trampoline_rel_start)(%ebp),%edi
> +        lea     sym_offset(__trampoline_rel_stop)(%ebp),%esi
>  1:
>          mov     (%edi),%eax
>          add     %edx,(%edi,%eax)
>          add     $4,%edi
> -        cmp     $sym_phys(__trampoline_rel_stop),%edi
> +        cmp     %esi,%edi
>          jb      1b
>  
>          /* Patch in the trampoline segment. */
>          shr     $4,%edx
> -        mov     $sym_phys(__trampoline_seg_start),%edi
> +        lea     sym_offset(__trampoline_seg_start)(%ebp),%edi
> +        lea     sym_offset(__trampoline_seg_stop)(%ebp),%esi
>  1:
>          mov     (%edi),%eax
>          mov     %dx,(%edi,%eax)
>          add     $4,%edi
> -        cmp     $sym_phys(__trampoline_seg_stop),%edi
> +        cmp     %esi,%edi
>          jb      1b
>  
>          /* Do not parse command line on EFI platform here. */
> -        cmpb    $1,sym_phys(skip_realmode)
> +        cmpb    $1,%fs:sym_offset(skip_realmode)
>          je      1f
>  
>          /* Bail if there is no command line to parse. */
> -        mov     sym_phys(multiboot_ptr),%ebx
> +        mov     %fs:sym_offset(multiboot_ptr),%ebx
>          testl   $MBI_CMDLINE,MB_flags(%ebx)
>          jz      1f
>  
>          cmpl    $0,MB_cmdline(%ebx)
>          jz      1f
>  
> -        pushl   $sym_phys(early_boot_opts)
> +        lea     sym_offset(early_boot_opts)(%ebp),%eax
> +        push    %eax
>          pushl   MB_cmdline(%ebx)
>          call    cmdline_parse_early
>          add     $8,%esp             /* Remove cmdline_parse_early() args from stack. */
>  
>  1:
>          /* Switch to low-memory stack.  */
> -        mov     sym_phys(trampoline_phys),%edi
> +        mov     %fs:sym_offset(trampoline_phys),%edi
>          lea     0x10000(%edi),%esp
>          lea     trampoline_boot_cpu_entry-trampoline_start(%edi),%eax
>          pushl   $BOOT_CS32
>          push    %eax
>  
>          /* Copy bootstrap trampoline to low memory, below 1MB. */
> -        mov     $sym_phys(trampoline_start),%esi
> +        lea     sym_offset(trampoline_start)(%ebp),%esi
>          mov     $trampoline_end - trampoline_start,%ecx
>          rep     movsb
>  
> diff --git a/xen/arch/x86/boot/trampoline.S b/xen/arch/x86/boot/trampoline.S
> index 3c2714d..a8909ce 100644
> --- a/xen/arch/x86/boot/trampoline.S
> +++ b/xen/arch/x86/boot/trampoline.S
> @@ -52,12 +52,20 @@ trampoline_gdt:
>          /* 0x0028: real-mode data @ BOOT_TRAMPOLINE */
>          .long   0x0000ffff
>          .long   0x00009200
> +        /*
> +         * 0x0030: ring 0 Xen data, 16 MiB size, base
> +         * address is initialized during runtime.

s/initialized/computed/
> +         */
> +        .quad   0x00c0920000001000
>  
>          .pushsection .trampoline_rel, "a"
>          .long   trampoline_gdt + BOOT_PSEUDORM_CS + 2 - .
>          .long   trampoline_gdt + BOOT_PSEUDORM_DS + 2 - .
>          .popsection
>  
> +GLOBAL(xen_img_base_phys_addr)
> +        .long   0
> +
>  GLOBAL(cpuid_ext_features)
>          .long   0
>  
> @@ -82,7 +90,8 @@ trampoline_protmode_entry:
>          mov     %ecx,%cr4
>  
>          /* Load pagetable base register. */
> -        mov     $sym_phys(idle_pg_table),%eax
> +        mov     bootsym_rel(xen_img_base_phys_addr,4,%eax)
> +        lea     sym_offset(idle_pg_table)(%eax),%eax
>          add     bootsym_rel(trampoline_xen_phys_start,4,%eax)
>          mov     %eax,%cr3
>  
> diff --git a/xen/arch/x86/boot/wakeup.S b/xen/arch/x86/boot/wakeup.S
> index 08ea9b2..ff80f7f 100644
> --- a/xen/arch/x86/boot/wakeup.S
> +++ b/xen/arch/x86/boot/wakeup.S
> @@ -119,8 +119,10 @@ wakeup_32:
>          mov     %eax, %ss
>          mov     $bootsym_rel(wakeup_stack, 4, %esp)
>  
> +        mov     bootsym_rel(xen_img_base_phys_addr, 4, %ebx)
> +
>          # check saved magic again
> -        mov     $sym_phys(saved_magic), %eax
> +        lea     sym_offset(saved_magic)(%ebx), %eax
>          add     bootsym_rel(trampoline_xen_phys_start, 4, %eax)
>          mov     (%eax), %eax
>          cmp     $0x9abcdef0, %eax
> @@ -133,7 +135,7 @@ wakeup_32:
>          mov     %ecx, %cr4
>  
>          /* Load pagetable base register */
> -        mov     $sym_phys(idle_pg_table),%eax
> +        lea     sym_offset(idle_pg_table)(%ebx),%eax
>          add     bootsym_rel(trampoline_xen_phys_start,4,%eax)
>          mov     %eax,%cr3
>  
> diff --git a/xen/arch/x86/boot/x86_64.S b/xen/arch/x86/boot/x86_64.S
> index c8bf9d0..ae4bebd 100644
> --- a/xen/arch/x86/boot/x86_64.S
> +++ b/xen/arch/x86/boot/x86_64.S
> @@ -81,7 +81,6 @@ GLOBAL(boot_cpu_compat_gdt_table)
>          .quad 0x0000910000000000     /* per-CPU entry (limit == cpu)      */
>          .align PAGE_SIZE, 0
>  
> -GLOBAL(__page_tables_start)
>  /*
>   * Mapping of first 2 megabytes of memory. This is mapped with 4kB mappings
>   * to avoid type conflicts with fixed-range MTRRs covering the lowest megabyte
> @@ -101,21 +100,18 @@ GLOBAL(l1_identmap)
>          .endr
>          .size l1_identmap, . - l1_identmap
>  
> -/* Mapping of first 16 megabytes of memory. */

Wouldn't it be better to just update the comment?

> +GLOBAL(__page_tables_start)
> +
>  GLOBAL(l2_identmap)

And perhaps explain how this page is being updated at runtime?

> -        .quad sym_phys(l1_identmap) + __PAGE_HYPERVISOR
> -        pfn = 0
> -        .rept 7
> -        pfn = pfn + (1 << PAGETABLE_ORDER)
> -        .quad (pfn << PAGE_SHIFT) | PAGE_HYPERVISOR | _PAGE_PSE
> -        .endr
> -        .fill 4 * L2_PAGETABLE_ENTRIES - 8, 8, 0
> +        .quad sym_offset(l1_identmap) + __PAGE_HYPERVISOR
> +        .fill 4 * L2_PAGETABLE_ENTRIES - 1, 8, 0
>          .size l2_identmap, . - l2_identmap
>  
>  GLOBAL(l2_xenmap)
> -        idx = 0
> -        .rept 8
> -        .quad sym_phys(__image_base__) + (idx << L2_PAGETABLE_SHIFT) + (PAGE_HYPERVISOR | _PAGE_PSE)
> +        .quad 0
> +        idx = 1
> +        .rept 7
> +        .quad sym_offset(__image_base__) + (idx << L2_PAGETABLE_SHIFT) + (PAGE_HYPERVISOR | _PAGE_PSE)
>          idx = idx + 1
>          .endr
>          .fill L2_PAGETABLE_ENTRIES - 8, 8, 0
> @@ -125,7 +121,7 @@ l2_fixmap:
>          idx = 0
>          .rept L2_PAGETABLE_ENTRIES
>          .if idx == l2_table_offset(FIXADDR_TOP - 1)
> -        .quad sym_phys(l1_fixmap) + __PAGE_HYPERVISOR
> +        .quad sym_offset(l1_fixmap) + __PAGE_HYPERVISOR
>          .else
>          .quad 0
>          .endif
> @@ -136,7 +132,7 @@ l2_fixmap:
>  GLOBAL(l3_identmap)
>          idx = 0
>          .rept 4
> -        .quad sym_phys(l2_identmap) + (idx << PAGE_SHIFT) + __PAGE_HYPERVISOR
> +        .quad sym_offset(l2_identmap) + (idx << PAGE_SHIFT) + __PAGE_HYPERVISOR
>          idx = idx + 1
>          .endr
>          .fill L3_PAGETABLE_ENTRIES - 4, 8, 0
> @@ -146,9 +142,9 @@ l3_xenmap:
>          idx = 0
>          .rept L3_PAGETABLE_ENTRIES
>          .if idx == l3_table_offset(XEN_VIRT_START)
> -        .quad sym_phys(l2_xenmap) + __PAGE_HYPERVISOR
> +        .quad sym_offset(l2_xenmap) + __PAGE_HYPERVISOR
>          .elseif idx == l3_table_offset(FIXADDR_TOP - 1)
> -        .quad sym_phys(l2_fixmap) + __PAGE_HYPERVISOR
> +        .quad sym_offset(l2_fixmap) + __PAGE_HYPERVISOR
>          .else
>          .quad 0
>          .endif
> @@ -158,13 +154,13 @@ l3_xenmap:
>  
>  /* Top-level master (and idle-domain) page directory. */
>  GLOBAL(idle_pg_table)
> -        .quad sym_phys(l3_bootmap) + __PAGE_HYPERVISOR
> +        .quad sym_offset(l3_bootmap) + __PAGE_HYPERVISOR
>          idx = 1
>          .rept L4_PAGETABLE_ENTRIES - 1
>          .if idx == l4_table_offset(DIRECTMAP_VIRT_START)
> -        .quad sym_phys(l3_identmap) + __PAGE_HYPERVISOR
> +        .quad sym_offset(l3_identmap) + __PAGE_HYPERVISOR
>          .elseif idx == l4_table_offset(XEN_VIRT_START)
> -        .quad sym_phys(l3_xenmap) + __PAGE_HYPERVISOR
> +        .quad sym_offset(l3_xenmap) + __PAGE_HYPERVISOR
>          .else
>          .quad 0
>          .endif

> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 8bec67f..8172520 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -291,9 +291,6 @@ static void *__init bootstrap_map(const module_t *mod)
>      if ( start >= end )
>          return NULL;
>  
> -    if ( end <= BOOTSTRAP_MAP_BASE )
> -        return (void *)(unsigned long)start;
> -
>      ret = (void *)(map_cur + (unsigned long)(start & mask));
>      start &= ~mask;
>      end = (end + mask) & ~mask;
> @@ -641,6 +638,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  
>      printk("Command line: %s\n", cmdline);
>  
> +    printk("Xen image base address: 0x%08lx\n",
> +           xen_phys_start ? xen_phys_start : (unsigned long)xen_img_base_phys_addr);
> +
>      printk("Video information:\n");
>  
>      /* Print VGA display mode information. */
> @@ -835,10 +835,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>          uint64_t s, e, mask = (1UL << L2_PAGETABLE_SHIFT) - 1;
>          uint64_t end, limit = ARRAY_SIZE(l2_identmap) << L2_PAGETABLE_SHIFT;
>  
> -        /* Superpage-aligned chunks from BOOTSTRAP_MAP_BASE. */
>          s = (boot_e820.map[i].addr + mask) & ~mask;
>          e = (boot_e820.map[i].addr + boot_e820.map[i].size) & ~mask;
> -        s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
>          if ( (boot_e820.map[i].type != E820_RAM) || (s >= e) )
>              continue;
>  
> @@ -876,7 +874,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>              /* Select relocation address. */
>              e = end - reloc_size;
>              xen_phys_start = e;
> -            bootsym(trampoline_xen_phys_start) = e;
> +            bootsym(trampoline_xen_phys_start) = e - xen_img_base_phys_addr;
>  
>              /*
>               * Perform relocation to new physical address.
> @@ -886,7 +884,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>               */
>              load_start = (unsigned long)_start - XEN_VIRT_START;
>              barrier();
> -            move_memory(e + load_start, load_start, _end - _start, 1);
> +            move_memory(e + load_start, load_start + xen_img_base_phys_addr, _end - _start, 1);
>  
>              /* Walk initial pagetables, relocating page directory entries. */
>              pl4e = __va(__pa(idle_pg_table));
> @@ -895,27 +893,27 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>                  if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
>                      continue;
>                  *pl4e = l4e_from_intpte(l4e_get_intpte(*pl4e) +
> -                                        xen_phys_start);
> +                                        xen_phys_start - xen_img_base_phys_addr);
>                  pl3e = l4e_to_l3e(*pl4e);
>                  for ( j = 0; j < L3_PAGETABLE_ENTRIES; j++, pl3e++ )
>                  {
>                      /* Not present, 1GB mapping, or already relocated? */
>                      if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) ||
>                           (l3e_get_flags(*pl3e) & _PAGE_PSE) ||
> -                         (l3e_get_pfn(*pl3e) > 0x1000) )
> +                         (l3e_get_pfn(*pl3e) > PFN_DOWN(xen_phys_start)) )
>                          continue;
>                      *pl3e = l3e_from_intpte(l3e_get_intpte(*pl3e) +
> -                                            xen_phys_start);
> +                                            xen_phys_start - xen_img_base_phys_addr);
>                      pl2e = l3e_to_l2e(*pl3e);
>                      for ( k = 0; k < L2_PAGETABLE_ENTRIES; k++, pl2e++ )
>                      {
>                          /* Not present, PSE, or already relocated? */
>                          if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) ||
>                               (l2e_get_flags(*pl2e) & _PAGE_PSE) ||
> -                             (l2e_get_pfn(*pl2e) > 0x1000) )
> +                             (l2e_get_pfn(*pl2e) > PFN_DOWN(xen_phys_start)) )
>                              continue;
>                          *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) +
> -                                                xen_phys_start);
> +                                                xen_phys_start - xen_img_base_phys_addr);
>                      }
>                  }
>              }
> @@ -926,10 +924,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>                                     PAGE_HYPERVISOR_RWX | _PAGE_PSE);
>              for ( i = 1; i < L2_PAGETABLE_ENTRIES; i++, pl2e++ )
>              {
> -                if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
> +                if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) || (l2e_get_pfn(*pl2e) > PFN_DOWN(xen_phys_start)))

Could this be split in two lines?
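
While here: the rule this l2_xenmap loop applies per entry, as I understand
it, sketched in C. sk_relocate_pte() and the SK_* constants are hypothetical
names of mine, the address mask is an assumption, and the base values in the
example are made up.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT   12
#define SK_PAGE_PRESENT 0x1ULL
#define SK_PADDR_MASK   0x000ffffffffff000ULL /* assumed address bits */
#define SK_PFN_DOWN(x)  ((x) >> SK_PAGE_SHIFT)

/*
 * Move one page-table entry from the initial image base (old_base) to the
 * relocation target (new_base). Entries that are not present, or whose
 * frame number is already above the target's, are left unchanged.
 */
static uint64_t sk_relocate_pte(uint64_t pte, uint64_t new_base,
                                uint64_t old_base)
{
    uint64_t pfn = (pte & SK_PADDR_MASK) >> SK_PAGE_SHIFT;

    if (!(pte & SK_PAGE_PRESENT))
        return pte;
    if (pfn > SK_PFN_DOWN(new_base))
        return pte; /* treated as already relocated */

    return pte + (new_base - old_base);
}

/* Example: image moved from 2 MiB to 2 GiB (made-up numbers). */
static void sk_reloc_example(void)
{
    uint64_t moved = sk_relocate_pte(0x400063ULL, 0x80000000ULL, 0x200000ULL);

    assert(moved == 0x80200063ULL);
    /* A second pass leaves the entry alone. */
    assert(sk_relocate_pte(moved, 0x80000000ULL, 0x200000ULL) == moved);
}
```

The L3 and generic L2 loops earlier in this hunk additionally skip PSE
entries, which the sketch does not model.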

>                      continue;
>                  *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) +
> -                                        xen_phys_start);
> +                                        xen_phys_start - xen_img_base_phys_addr);
>              }
>  
>              /* Re-sync the stack and then switch to relocated pagetables. */
> @@ -998,6 +996,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  
>      if ( !xen_phys_start )
>          panic("Not enough memory to relocate Xen.");
> +
> +    printk("New Xen image base address: 0x%08lx\n", xen_phys_start);
> +
>      reserve_e820_ram(&boot_e820, __pa(&_start), __pa(&_end));
>  
>      /* Late kexec reservation (dynamic start address). */
> @@ -1070,14 +1071,12 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  
>          set_pdx_range(s >> PAGE_SHIFT, e >> PAGE_SHIFT);
>  
> -        /* Need to create mappings above BOOTSTRAP_MAP_BASE. */
> -        map_s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
> +        map_s = s;
>          map_e = min_t(uint64_t, e,
>                        ARRAY_SIZE(l2_identmap) << L2_PAGETABLE_SHIFT);
>  
>          /* Pass mapped memory to allocator /before/ creating new mappings. */
>          init_boot_pages(s, min(map_s, e));
> -        s = map_s;
>          if ( s < map_e )
>          {
>              uint64_t mask = (1UL << L2_PAGETABLE_SHIFT) - 1;
> diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
> index 98310f3..baa6461 100644
> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -44,7 +44,7 @@ unsigned int __read_mostly m2p_compat_vstart = __HYPERVISOR_COMPAT_VIRT_START;
>  
>  /* Enough page directories to map into the bottom 1GB. */

>  l3_pgentry_t __section(".bss.page_aligned") l3_bootmap[L3_PAGETABLE_ENTRIES];
> -l2_pgentry_t __section(".bss.page_aligned") l2_bootmap[L2_PAGETABLE_ENTRIES];
> +l2_pgentry_t __section(".bss.page_aligned") l2_bootmap[4 * L2_PAGETABLE_ENTRIES];

16KB? I am confused.
>  
>  l2_pgentry_t *compat_idle_pg_table_l2;
>  
> diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
> index a399615..b666a3f 100644
> --- a/xen/arch/x86/xen.lds.S
> +++ b/xen/arch/x86/xen.lds.S
> @@ -38,7 +38,7 @@ SECTIONS
>    . = __XEN_VIRT_START;
>    __image_base__ = .;
>  #endif
> -  . = __XEN_VIRT_START + MB(1);
> +  . = __XEN_VIRT_START + XEN_IMG_OFFSET;
>    _start = .;
>    .text : {
>          _stext = .;            /* Text and read-only data */
> diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
> index 3e9be83..6d21cb7 100644
> --- a/xen/include/asm-x86/config.h
> +++ b/xen/include/asm-x86/config.h
> @@ -114,6 +114,7 @@ extern unsigned long trampoline_phys;
>                   trampoline_phys-__pa(trampoline_start)))
>  extern char trampoline_start[], trampoline_end[];
>  extern char trampoline_realmode_entry[];
> +extern unsigned int xen_img_base_phys_addr;
>  extern unsigned int trampoline_xen_phys_start;
>  extern unsigned char trampoline_cpu_started;
>  extern char wakeup_start[];
> @@ -280,6 +281,8 @@ extern unsigned char boot_edid_info[128];
>  #endif
>  #define DIRECTMAP_VIRT_END      (DIRECTMAP_VIRT_START + DIRECTMAP_SIZE)
>  
> +#define XEN_IMG_OFFSET          0x200000
> +
>  #ifndef __ASSEMBLY__
>  
>  /* This is not a fixed value, just a lower limit. */
> diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
> index 87b3341..27481ac 100644
> --- a/xen/include/asm-x86/page.h
> +++ b/xen/include/asm-x86/page.h
> @@ -283,7 +283,7 @@ extern root_pgentry_t idle_pg_table[ROOT_PAGETABLE_ENTRIES];
>  extern l2_pgentry_t  *compat_idle_pg_table_l2;
>  extern unsigned int   m2p_compat_vstart;
>  extern l2_pgentry_t l2_xenmap[L2_PAGETABLE_ENTRIES],
> -    l2_bootmap[L2_PAGETABLE_ENTRIES];
> +    l2_bootmap[4*L2_PAGETABLE_ENTRIES];

Why do we need to expand this to 16kB?

>  extern l3_pgentry_t l3_bootmap[L3_PAGETABLE_ENTRIES];
>  extern l2_pgentry_t l2_identmap[4*L2_PAGETABLE_ENTRIES];
>  extern l1_pgentry_t l1_identmap[L1_PAGETABLE_ENTRIES],
> -- 
> 1.7.10.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel



 

