[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 8/8] libelf: safety: Document safety principles in header file



Jan Beulich writes ("Re: [PATCH 8/8] libelf: safety: Document safety principles 
in header file"):
> On 09.12.16 at 16:44, <ian.jackson@xxxxxxxxxxxxx> wrote:
> > + *  - Stack local buffer variables containing information derived from
> > + *    the image (including structs, or byte buffers) must be
> > + *    completely zeroed, using elf_memset_unchecked (and an
> > + *    accompanying elf_iter_ok_counted) on entry to the function (or
> > + *    somewhere very obviously near there).
> > + *
> > + *        Rationale: This avoids uninitialised stack data being used
> > + *        as input to any of the loader.
> 
> What distinguishes a "buffer variable" from a plain variable? I.e.
> where is the boundary here as to which ones need zeroing?

The hazard is an undiscovered uninitialised variable.  "Undiscovered"
meaning static analysis tools won't find it (eg Coverity will find
many if not most normal uninitialised variables).

This is the biggest risk for variables which can be partially
initialised: ie, structs and arrays.  It is more difficult to reason
about variables which can be partially initialised.

> Also,
> I don't think zeroing is needed when a variable gets fully written
> over (like in a structure assignment). Do you perhaps want to
> limit the above to "directly derived from the image"?

Structure assignment does not necessarily write over the whole
variable.  It may leave the padding un-overwritten.

> > + *  - All integer variables should be unsigned.
> > + *
> > + *        Rationale: this avoids signed integer overflow (which has
> > + *        undefined behaviour in C, and if spotted by the compiler can
> > + *        cause it to generate bad code).
> 
> There are certainly cases where one explicitly needs negative
> values, and hence signed integer types. Therefore I don't think
> we can outright exclude this. Perhaps limit by "... variables with
> non-negative value range ..."?

There are AFAICT the following signed variables in libelf: some error
codes; the c parameter to elf_memset_safe; the `nr' parameters to
elf_xen_feature_set/get.

I think perhaps the exceptions should be mentioned.  The error codes
are unfortunate but they are safe and too hard to change, and should
get a comment.  The c and nr parameters should be made unsigned.  The
open-coded ints for some function return values should be changed to
elf_errorstatus.

I will update the series to improve this.

Anyway, it is not necessary to deal with actually negative
quantities.  If it should become necessary in the future then:

    + *  - If it is desirable to breach these rules, there should be a
    + *    comment explaining why this is OK.

> > + *  - Do not add code which is not necessary for libelf to function
> > + *    with correct input, or to avoid being vulnerable to incorrect
> > + *    input.  Do not add additional functionally-unnecessary checks
> > + *    for diagnosing problems with the image, or validating sanity of
> > + *    the input ELF.
...
> While I can understand your reasoning, I continue to think that
> libelf in a ELF loader, and hence should check certain things. The
> best example I currently know of are e_[ps]hentsize, which the
> code uses without checking the lower valid bound.

You don't give a counterargument to my reasoning:

> > + *        Rationale: Redundant checks have almost zero benefit because
> > + *        1. we do not expect ELF-generating tools to generate invalid
> > + *        ELFs so these checks' failure paths will very likely never
> > + *        be executed anywhere, and 2. anyone debugging such a
> > + *        hypothetical bad ELF will have a variety of tools available
> > + *        which will do a much better job of analysing it.

I don't find "libelf in a ELF loader" a convincing counterargument.

In practical terms, not checking e_[ps]hentsize means that
hypothetical broken images with too-small e_[ps]hentsize will result
in either a confusing error reported later by some other bit of libelf
(for example, due to an out-of-bounds access setting the `broken'
flag), or a badly-constructed guest.  This is IMO a quite tolerable
consequence.  Following a confusing error message the afflicted user
is very likely to do something sensible, like feeding their image to a
disassembler, which would immediately diagnose the problem.  Even
if the result is a badly-constructed guest, the symptoms are very
likely to induce an appropriate debugging response, and not likely to
cause great inconvenience (compared to a specific diagnosis).

Especially, since (and AFIACT you are not disputing this) we do not
expect that such a check's failure path would ever be executed
anywhere.  The expected benefit of the extra check is negligible.

The cost of the extra check is the risk of making mistakes, resulting
in security vulnerabilities.  Security vulnerabilties have a high
cost.  So even if the probability of making such a mistake is fairly
low, the expected cost of the extra checks is definitely
non-negligible.

> Overall I'm certainly not meaning to veto any of this, but I think it
> would be better for some other REST maintainer (agreeing with
> your way of thinking) to ack this.

Thanks anyway for your careful review and attention to detail.

George, since you seem to be reading this thread: what do you think
about this disagreement about the right direction to go ?  I think
that the fact that the other approach resulted in us committing a
vulnerability to staging is a strong indication that we need better
defence against (inevitable) human error.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.