[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 3/4] libelf: rewrite symtab/strtab loading



>>> On 29.02.16 at 11:57, <roger.pau@xxxxxxxxxx> wrote:
> El 29/2/16 a les 10:31, Jan Beulich ha escrit:
>>>>> On 26.02.16 at 18:02, <roger.pau@xxxxxxxxxx> wrote:
>>> The layout is as follows (I should add this to the patch itself as a
>>> comment, since I guess this is still quite confusing):
>>>
>>> +------------------------+
>>> |                        |
>>> | KERNEL                 |
>>> |                        |
>>> +------------------------+ pend
>>> | ALIGNMENT              |
>>> +------------------------+ bsd_symtab_pstart
>>> |                        |
>>> | size                   | bsd_symtab_pend - bsd_symtab_pstart
>>> |                        |
>>> +------------------------+
>>> |                        |
>>> | ELF header             |
>>> |                        |
>>> +------------------------+
>>> |                        |
>>> | ELF section header 0   | Dummy section header
>>> |                        |
>>> +------------------------+
>>> |                        |
>>> | ELF section header 1   | SYMTAB section header
>>> |                        |
>>> +------------------------+
>>> |                        |
>>> | ELF section header 2   | STRTAB section header
>>> |                        |
>>> +------------------------+
>>> |                        |
>>> | SYMTAB                 |
>>> |                        |
>>> +------------------------+
>>> |                        |
>>> | STRTAB                 |
>>> |                        |
>>> +------------------------+ bsd_symtab_pend
>>>
>>> There are always only tree sections because that's all we need, section
>>> 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is
>>> used to describe the STRTAB.
>> 
>> All fine, but this still doesn't clarify how the kernel learns where
>> bsd_symtab_pstart is.
> 
> The BSDs linker scripts places an "end" symbol after all the loaded
> sections:
> 
> https://svnweb.freebsd.org/base/head/sys/conf/ldscript.amd64?revision=284870&; 
> view=co

That's fine. But how do they know it is legitimate to even touch what
follows, not to speak of assign meaning to the value found there?
And are there no alignment/padding considerations necessary?

>>> According to the ELF spec there can only be
>>> one SYMTAB, so that's all we need.
>> 
>> Did you perhaps overlook the spec saying "... but this restriction
>> may be relaxed in the future", plus the fact that we're talking
>> about an executable file here (which commonly have two symbol
>> tables - .dynsym and .symtab), not an object one? (This isn't to
>> say that you need to make the code handle multiple ones, if you
>> know that *BSD will only ever have one.)
> 
> DYNSYM is just a subset of SYMTAB, BSDs prefer (it's not a strict
> requirement) the SYMTAB because of the in-kernel debugger. Also DYNSYM
> is already loaded by default because it's covered by the program headers.
> 
> I can add support for loading multiple SYMTABs/STRTABs, but shouldn't we
> wait until the spec is updated? As I read the spec ATM, an ELF file with
> multiple SYMTABs is invalid. I assume that if the ELF format ever allows
> more than one SYMTAB, the version is going to be bumped at least (or
> some other field is going to be used to signal this change).

As said I don't see the need for you to add multiple symbol table
support. I only meant to point out that the potential exists.

>>>>>      sz = elf_round_up(elf, sz);
>>>>>  
>>>>> -    /* Space for the symbol and string tables. */
>>>>> +    /* Space for the symbol and string table. */
>>>>>      for ( i = 0; i < elf_shdr_count(elf); i++ )
>>>>>      {
>>>>>          shdr = elf_shdr_by_index(elf, i);
>>>>>          if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
>>>>>              /* input has an insane section header count field */
>>>>>              break;
>>>>> -        type = elf_uval(elf, shdr, sh_type);
>>>>> -        if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) )
>>>>> -            sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
>>>>> +
>>>>> +        if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB )
>>>>> +            continue;
>>>>> +
>>>>> +        sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
>>>>> +        shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link));
>>>>> +
>>>>> +        if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
>>>>> +            /* input has an insane section header count field */
>>>>> +            break;
>>>>> +
>>>>> +        if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB )
>>>>> +            /* Invalid symtab -> strtab link */
>>>>> +            break;
>>>>
>>>> This is not sufficient - what if sh_link is out of bounds, but the
>>>> bogusly accessed field happens to hold SHT_STRTAB? (Oddly
>>>> enough you have at least an SHN_UNDEF check in the second
>>>> loop below.)
>>>
>>> The out-of-bounds access would be detected by elf_access_ok (if it's out
>>> of the scope of the ELF file, which is all we care about). In fact the
>>> elf_access_ok above could be removed because elf_uval already performs
>>> out-of-bounds checks. The result is definitely no worse that what we are
>>> doing ATM.
>> 
>> No, the out of bounds check should be more strict than just
>> considering the whole image: The image is broken if the link
>> points to a non-existing section.
> 
> Ah, do you mean I should mark the image as broken if either of the
> checks fail?

Perhaps, but my main point continue to be that there is a check
missing here.

>>>>> @@ -186,79 +199,169 @@ void elf_parse_bsdsyms(struct elf_binary *elf, 
>>>>> uint64_t pstart)
>>>>>  
>>>>>  static void elf_load_bsdsyms(struct elf_binary *elf)
>>>>>  {
>>>>> -    ELF_HANDLE_DECL(elf_ehdr) sym_ehdr;
>>>>> -    unsigned long sz;
>>>>> -    elf_ptrval maxva;
>>>>> -    elf_ptrval symbase;
>>>>> -    elf_ptrval symtab_addr;
>>>>> -    ELF_HANDLE_DECL(elf_shdr) shdr;
>>>>> -    unsigned i, type;
>>>>> +    /*
>>>>> +     * Header that is placed at the end of the kernel and allows
>>>>> +     * the OS to find where the symtab and strtab have been loaded.
>>>>> +     * It mimics a valid ELF file header, although it only contains
>>>>> +     * a symtab and a strtab section.
>>>>> +     *
>>>>> +     * NB: according to the ELF spec there's only ONE symtab per ELF
>>>>> +     * file, and accordingly we will only load the corresponding
>>>>> +     * strtab, so we only need three section headers in our fake ELF
>>>>> +     * header (first section header is always a dummy).
>>>>> +     */
>>>>> +    struct {
>>>>> +        uint32_t size;
>>>>> +        struct {
>>>>> +            elf_ehdr header;
>>>>> +            elf_shdr section[3];
>>>>> +        } __attribute__((packed)) elf_header;
>>>>> +    } __attribute__((packed)) header;
>>>>
>>>> If this is placed at the end, how can the OS reasonably use it
>>>> without knowing that there are exactly 3 section header
>>>> entries? I.e. how would it find the base of this structure?
>>>
>>> See the commend below, the base is found at the end of the kernel, and
>>> then the ELF header contains the right pointers to the sections headers
>>> (by appropriately setting e_shoff).
>> 
>> Thing is - I can't see which "comment below" you try to refer me to.
> 
> I was trying to point you to the big diagram/layout that I've posted at
> the start of the email.
> 
> It doesn't need to know there are exactly 3 sections, this is fetched
> form the ELF header e_shnum field. And the ELF header is fetched using
> the "end" symbol as a reference to pend.

Ah, okay, so the ABI is _not_ for the kernel to start looking from
the end, but to start looking from its own image end.

>>>>> +    /* Zero the dummy section. */
>>>>
>>>> Dummy? And anyway I think you mean "section header" here.
>>>
>>> Yes, the right comment would be: zero the dummy section header.
>> 
>> But then still - why "dummy"? The 0-th section header has a purpose
>> beyond being a filler, even if that purpose doesn't matter for the
>> intentions you have here.
> 
> OK, what about if I replace the comment with:
> 
> /* Zero the undefined section header. */
> 
> I think that's more inline with the specification. I could also fill it
> with specific values if you think it's clearer:
> 
> elf_store_field_bitness(..., sh_name, 0);
> elf_store_field_bitness(..., sh_type, SHT_NULL);
> elf_store_field_bitness(..., sh_flags, 0);
> [...]
> 
> Which is equivalent to zeroing it.

Just zeroing is fine afaic.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.