[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL



>>> On 18.01.19 at 02:24, <sstabellini@xxxxxxxxxx> wrote:
> On Thu, 17 Jan 2019, Jan Beulich wrote:
>> >>> On 17.01.19 at 01:37, <sstabellini@xxxxxxxxxx> wrote:
>> > On Wed, 16 Jan 2019, Jan Beulich wrote:
>> >> In any event - since intermediate variables merely hide the
>> >> casting from the compiler, but they don't remove the casts, the
>> >> solution involving casts is better imo, for incurring less overhead.
>> > 
>> > This is where I completely disagree. The intermediate variables are not
>> > hiding casts from the compiler. There were never any pointers in this
>> > case.  The linker creates "symbols", not pointers, completely invisible
>> > from C land. Assembly uses these symbols to initialize variables. We
>> > expose these assembly variables as integer to C lands. LD scripts and
>> > assembly have their own terminology and rules: neither "_start" nor
>> > "start" are pointers at any point in time. The operations done in var.S
>> > is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
>> > happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
>> > really a win-win.
>> 
>> Well, that's a position one can take. But we have to settle on another
>> aspect then first: Does what is not done in C underly C's rules? I
>> thought you were of the opinion that what comes from linker scripts
>> does. In which case what comes from assembly files ought to, too.
>> (FAOD my implication is: If the answer is yes, both approaches
>> violate C's rules. If the answer is no, no change is needed at all.)
> 
> Great question, that is the core of the issue. Also, let me premise that
> I agree on the comments you made on the patches (I dislike "start_"
> too), and I can address them if we agree to continue down this path.
> 
> But no, I do not think that what is done outside of C-land should follow
> C rules. But I do not agree with your conclusion that in that case there
> is no difference between the approaches. Let's get more into the
> details.
> 
> 
> 1) SYMBOL_HIDE returning pointer type
> 
> Let's take _start and _end as an example. _start is born as a linker
> symbol, and it becomes a C pointer when we do:
> 
>   extern char _start[], _end[]
> 
> Now it is a pointer (actually I should say an array, but let's pretend
> they are the same thing for this discussion).
> 
> When we do:
> 
>   SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start)
> 
> We are still subtracting pointers: the pointers returned by SYMBOL_HIDE.
> We cannot prove that they are pointers to the same object or subsequence
> objects in memory, so it is undefined behavior, which is not allowed.

Stop. No. We very much can prove they are - _end points at
one past the last element of _start[]. It is the compiler which
can't prove the opposite, and hence it can't leverage
undefined behavior for optimization purposes.

> 3) var.S + start_ as unsigned long
> 
> With this approach, _start is born as a linker symbol. It is never
> exported to C, so from C point of view, it doesn't exist. There is
> another variable named "start_" defined in assembly and initialized to
> _start. Now we go into C land with:
> 
>   extern uintptr_t start_, end_
> 
> start_ and end_ are uintptr_t from the beginning from C point of view.
> They have never been pointers or in any way connected to _start. They
> are "clean".
> 
> When we do:
> 
>   _end - _start
> 
> it is a subtraction between uintptr_t, which is allowed. When we do:
> 
>     for ( call = (const initcall_t *)initcall_start_;
>           (uintptr_t)call < presmp_initcall_end_;
> 
> The comparison is still between uintptr_t types, and the value of "call"
> still comes from an unsigned long initially. There is never a comparison
> between dubious pointers. (Interger to pointer conversions and pointer
> to integer conversions are allowed by MISRA with some limitations, but I
> am double-checking.) Even:
> 
>    (uintptr_t)random_pointer < presmp_initcall_end_
> 
> would be acceptable because presmp_initcall_end_ is an integer and has
> always been an integer from C point of view.

Well, as said - this is one of the possible positions to take. Personally
I see no difference between the helper symbols defined in
assembly sources, or in C sources the object files for which are never
made part of potential whole program optimization. Using C files for
this is still in conflict with the supposed undefined behavior, but I
think you agree that C and assembly files could be set up such that
the resulting binary data is identical. In which case it is bogus to call
one satisfactory, but not the other.

> However, there are still a couple of issued not correctly solved by v8
> of the series. For starters: 
> 
>         apply_alternatives((struct alt_instr *)alt_instructions_,
>                            (struct alt_instr *)alt_instructions_end_);
> 
> I can see how the pointers comparisons in apply_alternatives could be
> considered wrong given the way the pointers are initialized:
> 
>     for ( a = base = start; a < end; a++ )
>     {
> 
> start and end come from alt_instructions_ and alt_instructions_end_. It
> doesn't matter that alt_instructions_ and alt_instructions_end_ are
> "special", they could be perfectly normal integers and we would still
> have the same problem: we cannot prove that "start" and "end" point to
> the same object or subsequent objects in memory.
> 
> The way to fix it is by changing the parameters of apply_alternatives to
> interger types, making comparison between integers, and only using
> pointers to access the data.

You know my position on casts from integer to pointer types, especially
ones taking a type out of thin air. This applies to your addition to the
apply_alternatives() construct as well as the alternative of adding such
in order to access memory. The quote from the standard that I gave
makes such casts not provably (by the compiler) defined behavior as
well, so it all boils down to the same distinction as pointed out above in
the first part of my reply here: _We_ can prove it, but the compiler
can't. Hence we're still depending on whose proof is necessary to
eliminate MISRA's undefined behavior concerns.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.