[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v9 2/5] xen: introduce SYMBOLS_SUBTRACT and SYMBOLS_COMPARE



I would particularly welcome the opinion of hypervisor maintainers on
my type safety point, below.


Stefano Stabellini writes ("[Xen-devel] [PATCH v9 2/5] xen: introduce 
SYMBOLS_SUBTRACT and SYMBOLS_COMPARE"):
> +/*
> + * Calculate (end - start), where start and/or end are linker symbols,
> + * returning a ptrdiff_t. The size is in units of start's referent.
> + */
> +#define SYMBOLS_SUBTRACT(end, start) ({                                      
>  \
> +    (ptrdiff_t)((uintptr_t)(end) - (uintptr_t)(start)) / sizeof(*start);     
>  \
> +})

I'm afraid I have several problems with the details of these macros.

Firstly, I really don't like the lack a type check, which was present
in my proposal.  For example, given:
     extern struct wombat _foos_start[];
     extern char _end[];
and your macro definition, the two expressions
      SYMBOLS_SUBTRACT(_foos_start, _end)
     -SYMBOLS_SUBTRACT(_end, _foos_start)
produce different values because one is divided by sizeof(struct
wombat) and the other by sizeof(char).  This is hardly desirable.


Secondly, I find the argument ordering extremely confusing.  With your
macros DIFFERENCE(start,end) is negative but COMPARE(start,end) is
positive.  I suggest that you call the macro DIFFERENCE and have
DIFFERENCE(start,end) be positive.


Thirdly, in an earlier exchange:

> On Thu, 7 Feb 2019, Ian Jackson wrote:
> > FAOD, I think you should expect people to declare the linker symbols
> > either as I suggested:
> > 
> >      extern const struct wombat _wombats_start[];
> >      extern const struct abstract_symbol _wombats_end[];
> > 
> > (or along the lines of Jan's suggestion, but frankly I think that is
> > going to be too hard to sort out now.)
> 
> Yes, they are already declared this way, I would prefer to avoid
> changing the declaration as part of this series.

Are you sure things are not currently declared like this
     extern const struct wombat _wombats_start[];
     extern const struct wombat _wombats_end[];
?  Looking at your v9 series, that seems to be the case.

I think using the `abstract_symbol' version (or Jan's refinement of
it) is important because it arranges that accidental raw pointer
comparisons, which are UB, are a type error.

IOW this approach moves and centralises the burden of remembering (on
pain of miscompilation) that something is special about these
pointers, to the point where the linker symbols are introduced into
the C environment.  That is one place, and one where it is easier to
remember because something odd is already going on.

Unless you want to propose some other approach which will reliably
find these kind of linker symbol pointer comparison rule violations
and ensure that they are not deployed in production code ?


This type difference is why my proposal had two macros.

If both start and end have the same type, only one macro is necessary.
It should be one which checks the types are identical and returns a
signed value in units of the type.  That can be used for comparisons.
If desired, we could enhance the macro so that the compiler can prove
that the division can be removed when the result is compared for
inequality with 0.

But even with two types, it may be that there is only one macro needed
because the vast majority of use sites are formulaic.  You said
earlier:

> Most of the call sites only do things like (_end - _start) or (p >
> _end). I wanted to bring up cases that are not trivial.

When designing a general scheme for macros like this it is best to
consider the usual case and make it straightforward to use, and
bulletproof.  Ie presenting unusual cases as your examples is not
helpful to the design process for a macro.

IMO the situation you describe in the snipped I quote is what the
purpose of SYMBOL_DIFFERENCE is.  For the two examples you give,
always one of the arguments is _end.  So if we make the type of _end
be struct abstract_symbol[], SYMBOL_DIFFERENCE can (i) check that _end
is of that type (ii) return an answer in units of its other (first)
argument.

For pointers derived from _start, it is actually straightforwardly
legal to compare them with _start, or subtract _start from them.  So
no macrology is needed in that case.


Stepping back a bit, it is indeed the second symbol referrring to the
same memory object that triggers the compiler bug: the compiler
wrongly assumes that two extern declarations must refer to two
different objects.  Making the two declarations have different types
will simply prevent *all* triggers for this bug, because raw
comparisons or arithmetic between differently typed pointers is a
compile error.  Then all that is needed is to embed the correct usage
pattern into a macro (or, a sufficiently general set of correct usage
patterns that ad-hoc approaches are rare).


You also write:

> We have a couple of cases where we are "punning wildly between pointers
> and integers", for instance:
> 
> xen/arch/arm/arm64/livepatch.c:arch_livepatch_apply
> xen/arch/arm/setup.c:start_xen  line 772
> xen/arch/x86/setup.c:__start_xen  line 1382
> 
> I think it is OK to manually cast to (uintptr_t) in those cases as you
> suggest.

I haven't looked at these, but IMO whatever technique is used comes
with a proof obligation.

In the case of a macro, the proof obligation attaches to the author of
the macro.  The proof should show not only that correct use of the
macro will result in correct code; but there should also be discussion
of the risks of incorrect use of the macro - such incorrect use should
be prevented if that's reasonably possible.

In the case of an ad-hoc technique, the proof obligation comes with
the author of the exciting code.


Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.