[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 2/5] x86/boot: Split bootsym() into four types of relocations



On Fri, 2019-08-30 at 17:10 +0200, Jan Beulich wrote:
> On 21.08.2019 18:35, David Woodhouse wrote:
> > --- a/xen/arch/x86/boot/head.S
> > +++ b/xen/arch/x86/boot/head.S
> > @@ -699,14 +699,30 @@ trampoline_setup:
> >          cmp     $sym_offs(__trampoline_rel_stop),%edi
> >          jb      1b
> >  
> > -        /* Patch in the trampoline segment. */
> > +        mov     $sym_offs(__trampoline32_rel_start),%edi
> > +1:
> > +        mov     %fs:(%edi),%eax
> > +        add     %edx,%fs:(%edi,%eax)
> > +        add     $4,%edi
> > +        cmp     $sym_offs(__trampoline32_rel_stop),%edi
> > +        jb      1b
> > +
> > +        mov     $sym_offs(__bootsym_rel_start),%edi
> > +1:
> > +        mov     %fs:(%edi),%eax
> > +        add     %edx,%fs:(%edi,%eax)
> > +        add     $4,%edi
> > +        cmp     $sym_offs(__bootsym_rel_stop),%edi
> > +        jb      1b
> 
> With the smaller sets now - are we risking misbehavior if one
> of the relocation sets ends up empty? This wasn't reasonable to
> expect before, but I think it would be nice to have a build-time
> check rather than a hard to debug crash in case this happens.

Or just code it differently as a while() instead of a do{}while() so
that it actually copes with a zero-length section. 

> > --- a/xen/arch/x86/boot/trampoline.S
> > +++ b/xen/arch/x86/boot/trampoline.S
> > @@ -16,21 +16,62 @@
> >   * not guaranteed to persist.
> >   */
> >  
> > -/* NB. bootsym() is only usable in real mode, or via BOOT_PSEUDORM_DS. */
> > +/*
> > + * There are four sets of relocations:
> > + *
> > + * bootsym():     Boot-time code relocated to low memory and run only once.
> > + *                Only usable at boot; in real mode or via 
> > BOOT_PSEUDORM_DS.
> > + * bootdatasym(): Boot-time BIOS-discovered data, relocated back up to Xen
> > + *                image after discovery.
> > + * trampsym():    Permanent trampoline code relocated into low memory for 
> > AP
> > + *                startup and wakeup.
> > + * tramp32sym():  32-bit trampoline code which at boot can be used directly
> > + *                from the Xen image in memory, but which will need to be
> > + *                relocated into low (well, into *mapped*) memory in order
> > + *                to be used for AP startup.
> > + */
> >  #undef bootsym
> >  #define bootsym(s) ((s)-trampoline_start)
> >  
> >  #define bootsym_rel(sym, off, opnd...)     \
> >          bootsym(sym),##opnd;               \
> >  111:;                                      \
> > -        .pushsection .trampoline_rel, "a"; \
> > +        .pushsection .bootsym_rel, "a";    \
> >          .long 111b - (off) - .;            \
> >          .popsection
> >  
> >  #define bootsym_segrel(sym, off)           \
> >          $0,$bootsym(sym);                  \
> >  111:;                                      \
> > -        .pushsection .trampoline_seg, "a"; \
> > +        .pushsection .bootsym_seg, "a";    \
> > +        .long 111b - (off) - .;            \
> > +        .popsection
> > +
> > +#define bootdatasym(s) ((s)-trampoline_start)
> > +#define bootdatasym_rel(sym, off, opnd...) \
> > +        bootdatasym(sym),##opnd;           \
> > +111:;                                      \
> > +        .pushsection .bootdatasym_rel, "a";\
> > +        .long 111b - (off) - .;            \
> > +        .popsection
> > +
> > +#undef trampsym
> 
> Why this and ...
> 
> > +#define trampsym(s) ((s)-trampoline_start)
> > +
> > +#define trampsym_rel(sym, off, opnd...)    \
> > +        trampsym(sym),##opnd;              \
> > +111:;                                      \
> > +        .pushsection .trampsym_rel, "a";   \
> > +        .long 111b - (off) - .;            \
> > +        .popsection
> > +
> > +#undef tramp32sym
> 
> ... this #undef? You have none ahead of the bootdatasym #define-s,
> and (other than for bootsym) there's not conflicting C level one
> afaics.
> 
> > +#define tramp32sym(s) ((s)-trampoline_start)
> > +
> > +#define tramp32sym_rel(sym, off, opnd...)  \
> > +        tramp32sym(sym),##opnd;            \
> > +111:;                                      \
> > +        .pushsection .tramp32sym_rel, "a"; \
> >          .long 111b - (off) - .;            \
> >          .popsection
> 
> After your reply to my comment regarding the redundancy here I've
> checked (in your git branch) how things end up. Am I mistaken, or
> are the trampsym and tramp32sym #define-s entirely identical
> (except for the relocations section name)? Even between the others
> there's little enough difference, so it continues to be unclear to
> me why you think it's better to have four instances of about the
> same (not entirely trivial) thing.

The distinction is that in a no-real-mode boot tramp32 is used in place
in the Xen image at the physical address it happened to be loaded at,
and then *again* later in the AP/wakeup path. In the latter case it
needs to be moved to low memory (or we need to put the physical
location into idle_pg_table which seemed to be harder, as discussed).

So tramp32 symbols get relocated *twice*, while the plain tramp symbols
don't, but actually we could probably ditch the distinction and treat
them all the same, which would reduce the four categories to three.

I'll take a look.

I suppose we could also combine bootsym and bootdatasym, and copy that
*whole* section back up to the Xen image; both code and data. But I'm
inclined to prefer keeping them separate and only copying the data back
up.

> > @@ -48,16 +89,19 @@
> >  GLOBAL(trampoline_realmode_entry)
> >          mov     %cs,%ax
> >          mov     %ax,%ds
> > -        movb    $0xA5,bootsym(trampoline_cpu_started)
> > +        movb    $0xA5,trampsym(trampoline_cpu_started)
> >          cld
> >          cli
> > -        lidt    bootsym(idt_48)
> > -        lgdt    bootsym(gdt_48)
> > +        lidt    trampsym(idt_48)
> > +        lgdt    trampsym(gdt_48)
> >          mov     $1,%bl                    # EBX != 0 indicates we are an AP
> >          xor     %ax, %ax
> >          inc     %ax
> >          lmsw    %ax                       # CR0.PE = 1 (enter protected 
> > mode)
> > -        ljmpl   $BOOT_CS32,$bootsym_rel(trampoline_protmode_entry,6)
> > +        ljmpl   $BOOT_CS32,$tramp32sym_rel(trampoline_protmode_entry,6)
> > +
> > +GLOBAL(trampoline_cpu_started)
> > +        .byte   0
> 
> The movement of this item here seems unrelated to this change; it's
> also not mentioned in the description.

Andy's already moved that elsewhere anyway; I'll undo that as I rebase.

> > @@ -115,10 +115,10 @@ static void __init relocate_trampoline(unsigned long 
> > phys)
> >            trampoline_ptr < __trampoline_rel_stop;
> >            ++trampoline_ptr )
> >          *(u32 *)(*trampoline_ptr + (long)trampoline_ptr) += phys;
> > -    for ( trampoline_ptr = __trampoline_seg_start;
> > -          trampoline_ptr < __trampoline_seg_stop;
> > +    for ( trampoline_ptr = __trampoline32_rel_start;
> > +          trampoline_ptr < __trampoline32_rel_stop;
> >            ++trampoline_ptr )
> > -        *(u16 *)(*trampoline_ptr + (long)trampoline_ptr) = phys >> 4;
> > +        *(u32 *)(*trampoline_ptr + (long)trampoline_ptr) += phys;
> >  }
> 
> Seeing this and adding in the comment about the redundant tramp*sym
> macros I wonder why the relocations can't be put together in a single
> section, and there be just a single loop here. (I realize this
> entire function gets deleted from here later on, but anyway.)

Yeah, I think it's worth the harmless double-relocation in the non-EFI
case to treat everything (well tramp vs. tramp32) the same there. I'll
do that.

Thanks.

Attachment: smime.p7s
Description: S/MIME cryptographic signature

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.