[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v11 07/13] x86: add multiboot2 protocol support for EFI platforms



On 1/11/17 1:47 PM, Daniel Kiper wrote:
> On Tue, Jan 10, 2017 at 02:51:27PM -0600, Doug Goldstein wrote:
>> On 1/9/17 7:37 PM, Doug Goldstein wrote:
>>> On 12/5/16 4:25 PM, Daniel Kiper wrote:
>>
>>>> diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
>>>> index 62c010e..dc857d8 100644
>>>> --- a/xen/arch/x86/efi/efi-boot.h
>>>> +++ b/xen/arch/x86/efi/efi-boot.h
>>>> @@ -146,6 +146,8 @@ static void __init 
>>>> efi_arch_process_memory_map(EFI_SYSTEM_TABLE *SystemTable,
>>>>  {
>>>>      struct e820entry *e;
>>>>      unsigned int i;
>>>> +    /* Check for extra mem for mbi data if Xen is loaded via multiboot2 
>>>> protocol. */
>>>> +    UINTN extra_mem = efi_enabled(EFI_LOADER) ? 0 : (64 << 10);
>>>
>>> Just wondering where the constant came from? And if there should be a
>>> little bit of information about it. To me its just weird to shift 64.
>>
>> Its the size of the stack used in the assembly code.
> 
> No, it is trampoline region size.

trampoline + stack in head.S We take the address where we're going to
copy the trampoline and set the stack to 0x10000 past it.


> 
>>>>      /* Populate E820 table and check trampoline area availability. */
>>>>      e = e820map - 1;
>>>> @@ -168,7 +170,8 @@ static void __init 
>>>> efi_arch_process_memory_map(EFI_SYSTEM_TABLE *SystemTable,
>>>>              /* fall through */
>>>>          case EfiConventionalMemory:
>>>>              if ( !trampoline_phys && desc->PhysicalStart + len <= 
>>>> 0x100000 &&
>>>> -                 len >= cfg.size && desc->PhysicalStart + len > cfg.addr )
>>>> +                 len >= cfg.size + extra_mem &&
>>>> +                 desc->PhysicalStart + len > cfg.addr )
>>>>                  cfg.addr = (desc->PhysicalStart + len - cfg.size) & 
>>>> PAGE_MASK;
>>>
>>> So this is where the current series blows up and fails on real hardware.
>>
>> Honestly this was my misunderstanding and this shouldn't ever be used to
>> get memory for the trampoline. This also has the bug in it that it needs
>> to be:
>>
>> ASSERT(cfg.size > 0);
>> cfg.addr = (desc->PhysicalStart + len - (cfg.size + extra_mem) & PAGE_MASK;
> 
> As I said earlier. This extra_mem stuff is (maybe) wrong and should be fixed
> in one way or another. Hmmm... It looks OK. I will double check it because
> I do not looked at this code long time and maybe I am missing something.

cfg.size needs to be the size of the trampolines + stack.


>>> No where in the EFI + MB2 code path is cfg.size ever initialized. Its
>>> only initialized in the straight EFI case. The result is that cfg.addr
>>> is set to the section immediately following this. Took a bit to
>>> trackdown because I checked for memory overlaps with where Xen was
>>> loaded and where it was relocated to but forgot to check for overlaps
>>> with the trampoline code. This is the address where the trampoline jumps
>>> are copied.
>>>
>>> Personally I'd like to see an ASSERT added or the code swizzled around
>>> in such a way that its not possible to get into a bad state. But this is
>>> probably another patch series.
>>>
>>>>              /* fall through */
>>>>          case EfiLoaderCode:
>>>> @@ -210,12 +213,14 @@ static void *__init 
>>>> efi_arch_allocate_mmap_buffer(UINTN map_size)
>>>>
>>>>  static void __init efi_arch_pre_exit_boot(void)
>>>>  {
>>>> -    if ( !trampoline_phys )
>>>> -    {
>>>> -        if ( !cfg.addr )
>>>> -            blexit(L"No memory for trampoline");
>>>> +    if ( trampoline_phys )
>>>> +        return;
>>>> +
>>>> +    if ( !cfg.addr )
>>>> +        blexit(L"No memory for trampoline");
>>>> +
>>>> +    if ( efi_enabled(EFI_LOADER) )
>>>>          relocate_trampoline(cfg.addr);
>>
>> Why is this call even here anymore? Its called in
>> efi_arch_memory_setup() already. If it was unable to allocate memory
>> under the 1mb region its just going to trample over ANY conventional
>> memory region that might be in use.
> 
> Trampoline is relocated in xen/arch/x86/boot/head.S.

For the MB2/MB case. But for the straight EFI case its called in
efi_arch_memory_setup() and then you've added the wrapper of
efi_enabled(EFI_LOADER) which in theory would have it called again in
the straight EFI case if trampoline_phys isn't set and cfg.addr is set.


> 
>>>> -    }
>>>>  }
>>>>
>>>>  static void __init noreturn efi_arch_post_exit_boot(void)
>>>> @@ -653,6 +658,43 @@ static bool_t __init 
>>>> efi_arch_use_config_file(EFI_SYSTEM_TABLE *SystemTable)
>>>>
>>>>  static void efi_arch_flush_dcache_area(const void *vaddr, UINTN size) { }
>>>>
>>>> +paddr_t __init efi_multiboot2(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
>>>> *SystemTable)
>>>> +{
>>>> +    EFI_GRAPHICS_OUTPUT_PROTOCOL *gop;
>>>> +    UINTN cols, gop_mode = ~0, rows;
>>>> +
>>>> +    __set_bit(EFI_BOOT, &efi_flags);
>>>> +    __set_bit(EFI_RS, &efi_flags);
>>>> +
>>>> +    efi_init(ImageHandle, SystemTable);
>>>> +
>>>> +    efi_console_set_mode();
>>>> +
>>>> +    if ( StdOut->QueryMode(StdOut, StdOut->Mode->Mode,
>>>> +                           &cols, &rows) == EFI_SUCCESS )
>>>> +        efi_arch_console_init(cols, rows);
>>>> +
>>>> +    gop = efi_get_gop();
>>>> +
>>>> +    if ( gop )
>>>> +        gop_mode = efi_find_gop_mode(gop, 0, 0, 0);
>>>> +
>>>> +    efi_arch_edd();
>>>> +    efi_arch_cpu();
>>>> +
>>>> +    efi_tables();
>>>> +    setup_efi_pci();
>>>> +    efi_variables();
>>>
>>> This is probably where you missed the call to "efi_arch_memory_setup();"
>>> that gave me hell above.
>>
>> Well it turns out that calling "efi_arch_memory_setup()" isn't correct
>> because it also messes with the page tables AND also does the trampoline
> 
> Yep.
> 
>> relocation. Which after this call finishes then it does the trampoline
>> relocations in assembly. The code currently makes the assumption it can
> 
> I am not sure what do you mean here.
> 
>> use any conventional memory range below 1mb (and unfortunately does the
>> math incorrectly and instead uses the region following the conventional
> 
> Which code? Which math?

The code where cfg.size = 0 and the extra_mem was missing.

> 
>> memory region). You need to use AllocatePages() otherwise you are
>> trampling memory that might have been allocated by the bootloader or any
> 
> Bootloader code/data should be dead here.

Correct. Unfortunately on my Lenovo laptop and my Intel NUCs I can't
currently call ExitBootServices and a timer that iPXE has wired up has
some memory reserved down there and it was getting trampled. The real
answer is that we need to fix up stock Xen to be able to always call EBS.

> 
>> multiboot modules (e.g. tboot) prior to Xen.
> 
> How come?
> 
>> I'm just going to write a patch to fix the issues that I've found thus
>> far. I believe at this point there is something wrong with the page
>> tables because setting cr0 for the non-0 CPU in
>> trampoline_protmode_entry causes the machine to reboot.
> 
> I have a feeling that problem lays somewhere else. Probably far before
> trampoline_protmode_entry.
> 
>> I've also found where we have the possibility to call
>> relocate_trampoline() twice in the EFI case. Its protected by a check to
>> trampoline_phys but I'm not sure why we even have the code there to be
>> able to do so.
> 
> Consider case when boot services use memory below 1 MiB.
> 
> Thank you for testing my patch series.
> 
> Daniel
> 

So do we really need to go down below 1MiB? We're never going into
16-bit mode. Unless the other CPUs are starting up in 16-bit mode.

I'll keep whacking away at this until it lands on any piece of hardware
I've got.

-- 
Doug Goldstein

Attachment: signature.asc
Description: OpenPGP digital signature

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.