[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 02/14] xen: arm: tweak arm64 stack frame layout
On Wed, 2013-04-10 at 12:57 +0100, Ian Campbell wrote: > On Mon, 2013-04-08 at 14:31 +0100, Stefano Stabellini wrote: > > On Tue, 12 Mar 2013, Ian Campbell wrote: > > > From: Ian Campbell <ian.campbell@xxxxxxxxxx> > > > > > > Correct definition of UREGS_kernel_sizeof and use it. > > > > > > Correct adjustment of stack on entry and exit. > > > > > > Add 64-bit versions of the build time checks for stack pointer alignment > > > correctness when pushing the stack frames. > > > > > > Lastly, correct the padding in the stack frames to properly align the > > > inner and > > > outer frames and also avoid an unnecessary 64bit padding field. > > > > > > Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx> > > > --- > > > xen/arch/arm/arm64/asm-offsets.c | 2 +- > > > xen/arch/arm/arm64/entry.S | 9 +++++---- > > > xen/arch/arm/domain.c | 2 ++ > > > xen/arch/arm/traps.c | 7 +++++++ > > > xen/include/asm-arm/arm64/processor.h | 7 +++---- > > > 5 files changed, 18 insertions(+), 9 deletions(-) > > > > > > diff --git a/xen/arch/arm/arm64/asm-offsets.c > > > b/xen/arch/arm/arm64/asm-offsets.c > > > index 7949e3e..7544082 100644 > > > --- a/xen/arch/arm/arm64/asm-offsets.c > > > +++ b/xen/arch/arm/arm64/asm-offsets.c > > > @@ -39,7 +39,7 @@ void __dummy__(void) > > > OFFSET(UREGS_SP_el1, struct cpu_user_regs, sp_el1); > > > OFFSET(UREGS_ELR_el1, struct cpu_user_regs, elr_el1); > > > > > > - OFFSET(UREGS_kernel_sizeof, struct cpu_user_regs, cpsr); > > > + OFFSET(UREGS_kernel_sizeof, struct cpu_user_regs, spsr_el1); > > > DEFINE(UREGS_user_sizeof, sizeof(struct cpu_user_regs)); > > > BLANK(); > > > > > > diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S > > > index 9d38088..a5fd608 100644 > > > --- a/xen/arch/arm/arm64/entry.S > > > +++ b/xen/arch/arm/arm64/entry.S > > > @@ -34,7 +34,7 @@ lr .req x30 // link register > > > mrs x22, SP_el0 > > > str x22, [x21] > > > > > > - add x21, sp, #UREGS_ELR_el1 > > > + add x21, sp, #UREGS_SP_el1 > > > mrs x22, SP_el1 > > > mrs x23, ELR_el1 > > > stp x22, x23, [x21] > > > > Honestly I think it would be much cleaner and easier to read if we saved > > sp_el1 and elr_el1 separately. > > I think stp is supposed to be faster, but I'm running on a model so I'm > not sure. Also there are weird quadword alignment constraints on the By "weird" I really mean "double", i.e. sp must always be 16 byte aligned. > stack and stp is atomic from that PoV. Actually we aren't actually changing sp in this particular code so that doesn't really matter. But because the UREGS_* constants here are too large for the immediate of a str or stp we need to precalculate the offset, the choices are smth like: add x21, sp, #UREGS_ELR_el1 mrs x22, ELR_el1 str x22, [x21] add x21, sp, #UREGS_SP_el1 mrs x22, SP_el1 str x22, [x21] which is certainly clearer but does two adds (an extra arith instruction) and an extra str for two extra instructions on a hot path. So its worse even if stp had no actual benefit from doing a 16 byte store instead of 2*8 bytes, and I'm pretty sure stp will turn out to have such benefits. (yes, we could use different registers for the two sets of adds+stores and arrange for better pipelining etc, but still). The other option is: add x21, sp, #UREGS_SP_el1 mrs x22, SP_el1 mrs x23, ELR_el1 str x22, [x21] str x23, [x21,#8] which avoids the extra add but still has the extra str and the #8 offset on the second one is about as opaque as the stp thing IMHO, so it's slower for no gain in readability. I think what we have now (the store pair) is the best option. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |