[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones



>>> On 03.10.18 at 20:38, <andrew.cooper3@xxxxxxxxxx> wrote:
> Reviewing just the code generation at this point.

Thanks for having found the time.

> See the Linux source code for ASM_CALL_CONSTRAINT.  There is a potential
> code generation issue if you've got a call instruction inside an asm
> block if you don't list the stack pointer as a clobbered output.

I'll look into this, but to be honest I don't always trust the Linux
folks in such regards. Plus doesn't this consideration come a little
late, seeing that we already have inline asm()-s inserting CALLs
(see e.g. asm-x86/guest/hypercall.h)?

> Next, with Clang, there seems to be some a bug causing the function
> pointer to be spilled onto the stack
> 
> ffff82d08026e990 <foo2>:
> ffff82d08026e990:       50                      push   %rax
> ffff82d08026e991:       48 8b 05 40 bc 20 00    mov    0x20bc40(%rip),%rax   
>      # ffff82d08047a5d8 <hvm_funcs+0x130>
> ffff82d08026e998:       48 89 04 24             mov    %rax,(%rsp)
> ffff82d08026e99c:       ff 15 36 bc 20 00       callq  *0x20bc36(%rip)       
>  # ffff82d08047a5d8 <hvm_funcs+0x130>
> ffff82d08026e9a2:       31 c0                   xor    %eax,%eax
> ffff82d08026e9a4:       59                      pop    %rcx
> ffff82d08026e9a5:       c3                      retq   
> ffff82d08026e9a6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
> ffff82d08026e9ad:       00 00 00 
> 
> 
> I'm not quite sure what is going on here, and the binary does boot, but
> the code gen is definitely not correct.

What's not correct here? There's just some pointless extra code
the compiler produces.

>  Given this and the GCC bugs
> you've found leading to the NO_ARG infrastructure, how about dropping
> all the compatibility hacks, and making the infrastructure fall back to
> a regular compiler-inserted function pointer call?

Well, if there were multiple issues left, I'd view this as an option.
But there's just one ugly workaround in this version, in
"x86/genapic: patch indirect calls to direct ones", and that even affects
gcc 8. So on the whole I'd rather not go this route.

> Next, the ASM'd calls aren't SYSV-ABI compliant.
> 
> extern void bar(void);
> 
> int foo1(void)
> {
>     hvm_funcs.wbinvd_intercept();
>     return 0;
> }
> 
> int foo2(void)
> {
>     alternative_vcall(hvm_funcs.wbinvd_intercept);
>     return 0;
> }
> 
> int bar1(void)
> {
>     bar();
>     return 0;
> }
> 
> ffff82d08026e1e0 <foo1>:
> ffff82d08026e1e0:       48 83 ec 08             sub    $0x8,%rsp
> ffff82d08026e1e4:       48 8b 05 c5 49 1d 00    mov    0x1d49c5(%rip),%rax   
>      # ffff82d080442bb0 <hvm_funcs+0x130>
> ffff82d08026e1eb:       e8 30 2d 0f 00          callq  ffff82d080360f20 
> <__x86_indirect_thunk_rax>
> ffff82d08026e1f0:       31 c0                   xor    %eax,%eax
> ffff82d08026e1f2:       48 83 c4 08             add    $0x8,%rsp
> ffff82d08026e1f6:       c3                      retq   
> ffff82d08026e1f7:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
> ffff82d08026e1fe:       00 00 
> 
> ffff82d08026e200 <foo2>:
> ffff82d08026e200:       ff 15 aa 49 1d 00       callq  *0x1d49aa(%rip)       
>  # ffff82d080442bb0 <hvm_funcs+0x130>
> ffff82d08026e206:       31 c0                   xor    %eax,%eax
> ffff82d08026e208:       c3                      retq   
> ffff82d08026e209:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
> 
> ffff82d08026e210 <bar1>:
> ffff82d08026e210:       48 83 ec 08             sub    $0x8,%rsp
> ffff82d08026e214:       e8 17 18 01 00          callq  ffff82d08027fa30 <bar>
> ffff82d08026e219:       31 c0                   xor    %eax,%eax
> ffff82d08026e21b:       48 83 c4 08             add    $0x8,%rsp
> ffff82d08026e21f:       c3                      retq   
> 
> foo2 which uses alternative_vcall() should be subtracting 8 from the
> stack pointer before the emitted call instruction.  I can't find any set
> of constraints which causes the stack to be set up correctly.

I think I have an idea how to arrange for this (adding an artificial
extra constraint referencing a local variable requiring 16-byte
alignment; ideally the variable would itself be zero size), but I've
yet to try it out. The question though is if we really need this
ABI compliance here. Recall that for years we've been running
with mis-aligned stacks, and only EFI runtime call issues required
us to address this. There are no EFI runtime calls behind any of
the (to be) patched calls.

I was in fact considering the opposite thing: Use (on new enough
gcc) the command line option to reduce stack alignment to 8 bytes
in general, enforcing 16-byte alignment only where we really need
it.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.