Re: [Xen-devel] [PATCH v2 22/23] x86: make Xen early boot code relocatable
On Tue, Aug 11, 2015 at 12:48:06PM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Jul 20, 2015 at 04:29:17PM +0200, Daniel Kiper wrote:
> > Every multiboot protocol (regardless of version) compatible image must
> > specify its load address (in the ELF or multiboot header). A multiboot
> > protocol compatible loader has to load the image at the specified
> > address. However, there is no guarantee that the requested memory region
> > (in the case of Xen it starts at 1 MiB and ends at 17 MiB) where the
> > image should initially be loaded is RAM and that it is free (legacy BIOS
> > platforms are merciful to Xen, but I found at least one EFI platform on
> > which the Xen load address conflicts with EFI boot services; it is a
> > Dell PowerEdge R820 with the latest firmware). To cope with that problem
> > we must make the Xen early boot code relocatable. This patch does that.
> > However, it does not add the multiboot2 protocol interface, which is
> > done in next patch.
>
> s/next patch/"x86: add multiboot2 protocol support for relocatable image."
>
> > This patch changes the following things:
> >   - the default load address is changed from 1 MiB to 2 MiB; I did that
> >     because the initial page tables use 2 MiB huge pages, and this way
> >     the required updates to them are quite easy; it means that e.g. we
> >     avoid special cases for the beginning and end of the required memory
> >     region if it lives at an address not aligned to 2 MiB,
> >   - the %ebp register is used as storage for the Xen image base address;
> >     this way we can get this value very quickly if it is needed;
> >     however, the %ebp register is not used directly to access a given
> >     memory region,
> >   - the %fs register is filled with a segment descriptor which describes
> >     the memory region with the Xen image (it could be relocated or not);
> >     it is used to access some of
>
> 'memory region with Xen image'? Not sure I follow?
>
> Perhaps:
> segment descriptor which starts (0) at Xen image base (_start).
> > Xen data in early boot code; potentially we can use the above mentioned
> > segment descriptor to access data using %ds:%esi and/or %es:%esi
> > (e.g. movs*); however, I think that it could unnecessarily obfuscate
> > the code (e.g. we need at least to operations to reload a given segment
> > descriptor) and the current solution
>
> s/to/two/ ?
>
> > looks quite optimal.
> >
> > Signed-off-by: Daniel Kiper <daniel.kiper@xxxxxxxxxx>

[...]

> > diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
> > index 3f1054d..d484f68 100644
> > --- a/xen/arch/x86/boot/head.S
> > +++ b/xen/arch/x86/boot/head.S

[...]

> >  trampoline_bios_setup:
> > +        mov     %ebp,%esi
> > +
> > +        /* Initialise GDT and basic data segments. */
> > +        add     %ebp,sym_offset(gdt_boot_descr_addr)(%esi)
> > +        lgdt    sym_offset(gdt_boot_descr)(%esi)
> > +
> > +        mov     $BOOT_DS,%ecx
> > +        mov     %ecx,%ds
> > +        mov     %ecx,%es
> > +        mov     %ecx,%fs
> > +        mov     %ecx,%gs
> > +        mov     %ecx,%ss
> > +
>
> The non-EFI boot path is now:
>
> start
>   \- __start
>       \- multiboot2_proto
>       |        jmp trampoline_bios_setup
>       |
>       \- and if not MB2: jmp trampoline_bios_setup.
>
> In here you tweak the GDT and reload the %ds - but during
> this call chain we do touch the %ds - via:
>
> __start+27>: testb  $0x1,(%rbx)
> __start+30>: cmovne 0x4(%rbx),%edx
>
> which is OK (as MB1 says that the %ds has to cover up to 4GB).
> But I wonder why the __start code had the segments reloaded so early?
> Was the bootloader not setting the proper segments?

This is a very good question. I asked myself about that at least once.
Sadly, I cannot find a real explanation.

> Let me double-check what SYSLINUX's mboot.c32 does. Perhaps
> it had done something odd in the past.

Good idea!
> >          /* Set up trampoline segment 64k below EBDA */
> >          movzwl  0x40e,%ecx          /* EBDA segment */
> >          cmp     $0xa000,%ecx        /* sanity check (high) */
> > @@ -340,33 +357,58 @@ trampoline_bios_setup:
> >          cmovb   %edx,%ecx           /* and use the smaller */
> >
> >  trampoline_setup:
>
> Would it make sense to add:
>
> /* Gets called from EFI (from x86_32_switch) and legacy (see above) boot
>    loaders. */
>
> > +        mov     %ebp,%esi
> > +
> > +        /* Initialize 0-15 bits of BOOT_FS segment descriptor base address. */
> > +        mov     %ebp,%edx
> > +        shl     $16,%edx
> > +        or      %edx,(sym_offset(trampoline_gdt)+BOOT_FS)(%esi)
> > +
> > +        /* Initialize 16-23 bits of BOOT_FS segment descriptor base address. */
> > +        mov     %ebp,%edx
> > +        shr     $16,%edx
> > +        and     $0x000000ff,%edx
> > +        or      %edx,(sym_offset(trampoline_gdt)+BOOT_FS+4)(%esi)
> > +
> > +        /* Initialize 24-31 bits of BOOT_FS segment descriptor base address. */
> > +        mov     %ebp,%edx
> > +        and     $0xff000000,%edx
> > +        or      %edx,(sym_offset(trampoline_gdt)+BOOT_FS+4)(%esi)
> > +
> > +        /* Initialize %fs and later use it to access Xen data if possible. */
> > +        mov     $BOOT_FS,%edx
> > +        mov     %edx,%fs
> > +
>
> We just modified the GDT. Should we reload it (lgdt?)?

I do not think it is needed. The Intel 64 and IA-32 Architectures Software
Developer's Manual, Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z
says:

  LGDT ... Loads the values in the source operand into the global descriptor
  table register (GDTR)...

... and ...

  MOV ... If the destination operand is a segment register (DS, ES, FS, GS,
  or SS), the source operand must be a valid segment selector. In protected
  mode, moving a segment selector into a segment register automatically
  causes the segment descriptor information associated with that segment
  selector to be loaded into the hidden (shadow) part of the segment
  register. While loading this information, the segment selector and
  segment descriptor information is validated (see the "Operation"
  algorithm below).
  The segment descriptor data is obtained from the GDT or LDT entry for the
  specified segment selector.

Additionally, the Intel 64 and IA-32 Architectures Software Developer's
Manual, Volume 3 (3A, 3B & 3C): System Programming Guide, section 3.4.3,
Segment Registers, says:

  Every segment register has a "visible" part and a "hidden" part. (The
  hidden part is sometimes referred to as a "descriptor cache" or a "shadow
  register.") When a segment selector is loaded into the visible part of a
  segment register, the processor also loads the hidden part of the segment
  register with the base address, segment limit, and access control
  information from the segment descriptor pointed to by the segment
  selector. The information cached in the segment register (visible and
  hidden) allows the processor to translate addresses without taking extra
  bus cycles to read the base address and limit from the segment
  descriptor. In systems in which multiple processors have access to the
  same descriptor tables, it is the responsibility of software to reload
  the segment registers when the descriptor tables are modified (side note:
  a GDTR reload is not required! Probably the same applies to UP systems
  and to a CPU updating its own active GDT). If this is not done, an old
  segment descriptor cached in a segment register might be used after its
  memory-resident version has been modified.

AIUI, only the GDT address and limit are loaded into the GDTR when lgdt is
executed. A segment descriptor is loaded directly from memory into a
segment register (its hidden part) only when a relevant load instruction is
used (e.g. mov %edx,%fs). Though the GDT contents could probably be stored
in the CPU cache, nothing suggests that CPU caching interferes in one way
or another with a segment descriptor load. So it looks like every change in
the active GDT is immediately visible to a subsequent segment descriptor
load instruction. I was not able to find anything which invalidates the
above. So everything suggests that we do not need an extra lgdt.
> >          /* Reserve 64kb for the trampoline. */
> >          sub     $0x1000,%ecx

[...]

> >          /* Hook 4kB mappings of first 2MB of memory into L2. */
> > -        mov     $sym_phys(l1_identmap)+__PAGE_HYPERVISOR,%edi
> > -        mov     %edi,sym_phys(l2_xenmap)
> > -        mov     %edi,sym_phys(l2_bootmap)
> > +        lea     (sym_offset(l1_identmap)+__PAGE_HYPERVISOR)(%ebp),%edi
> > +        mov     %edi,%fs:sym_offset(l2_bootmap)
>
> But not to l2_xenmap?

No, l?_xenmap maps Xen code and data only.

> >          /* Apply relocations to bootstrap trampoline. */
> > -        mov     sym_phys(trampoline_phys),%edx
> > -        mov     $sym_phys(__trampoline_rel_start),%edi
> > +        mov     %fs:sym_offset(trampoline_phys),%edx
> > +        lea     sym_offset(__trampoline_rel_start)(%ebp),%edi
> > +        lea     sym_offset(__trampoline_rel_stop)(%ebp),%esi

[...]

> > diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
> > index 87b3341..27481ac 100644
> > --- a/xen/include/asm-x86/page.h
> > +++ b/xen/include/asm-x86/page.h
> > @@ -283,7 +283,7 @@ extern root_pgentry_t idle_pg_table[ROOT_PAGETABLE_ENTRIES];
> >  extern l2_pgentry_t *compat_idle_pg_table_l2;
> >  extern unsigned int m2p_compat_vstart;
> >  extern l2_pgentry_t l2_xenmap[L2_PAGETABLE_ENTRIES],
> > -    l2_bootmap[L2_PAGETABLE_ENTRIES];
> > +    l2_bootmap[4*L2_PAGETABLE_ENTRIES];
>
> ? Why do we need to expand this to be 16kB?

TBH, we need 8 KiB in the worst case. The worst case is when the next GiB
starts (e.g. 1 GiB ends and 2 GiB starts) in the middle of the Xen image.
In this situation we must hook up the lower l2_bootmap table to the lower
l3_bootmap entry, the higher l2_bootmap table to the higher l3_bootmap
entry, and finally fill the relevant l2_bootmap tables in the proper way.
Sadly, this method requires more calculations. To avoid that I have added
3 l2_bootmap tables and simply hook them up one after another to the
relevant l3_bootmap entries. However, if you wish we can reduce the number
of l2_bootmap tables to two. This way the code will be more complicated but
we will save about 8 KiB.
Daniel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel