Re: [Xen-devel] [PATCH v2 11/15] xen/arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support

On 12/02/18 17:20, Volodymyr Babchuk wrote:


On 12.02.18 19:12, Julien Grall wrote:
On 12/02/18 16:55, Volodymyr Babchuk wrote:
Hi Julien,

Hi Volodymyr,

On 08.02.18 21:21, Julien Grall wrote:
Add the detection and runtime code for ARM_SMCCC_ARCH_WORKAROUND_1.

Signed-off-by: Julien Grall <julien.grall@xxxxxxx>

     Changes in v2:
         - Patch added
  xen/arch/arm/arm64/bpi.S    | 12 ++++++++++++
  xen/arch/arm/cpuerrata.c    | 32 +++++++++++++++++++++++++++++++-
  xen/include/asm-arm/smccc.h |  1 +
  3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/arm64/bpi.S b/xen/arch/arm/arm64/bpi.S
index 4b7f1dc21f..ef237de7bd 100644
--- a/xen/arch/arm/arm64/bpi.S
+++ b/xen/arch/arm/arm64/bpi.S
@@ -16,6 +16,8 @@
   * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#include <asm/smccc.h>
  .macro ventry target
      .rept 31
@@ -81,6 +83,16 @@ ENTRY(__psci_hyp_bp_inval_start)
      add     sp, sp, #(8 * 18)
+    sub     sp, sp, #(8 * 4)
+    stp     x2, x3, [sp, #(8 * 0)]
+    stp     x0, x1, [sp, #(8 * 2)]
+    ldp     x2, x3, [sp, #(8 * 0)]
+    ldp     x0, x1, [sp, #(8 * 2)]
+    add     sp, sp, #(8 * 4)

This code confuses me. You allocate 32 bytes on the stack, save x0-x3 there, then load ARM_SMCCC_ARCH_WORKAROUND_1_FID into w0 and restore x0-x3, overwriting the value just written into w0. Am I missing something?

The call to ARM_SMCCC_ARCH_WORKAROUND_1 does not return any value. Even if it did, this code is executed on exception entry, before jumping into the trap helper, so all the saved registers have to be restored.

I believe you missed the smc instruction in the code above.

Whoops yes. I will fix it.
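
For reference, a sketch of how the sequence would read with the missing instruction in place. The mov and smc lines are reconstructed from the discussion above, not copied from the patch, and the sketch assumes ARM_SMCCC_ARCH_WORKAROUND_1_FID from <asm/smccc.h> expands to an immediate that mov can encode:

```
    sub     sp, sp, #(8 * 4)            /* reserve 32 bytes for x0-x3 */
    stp     x2, x3, [sp, #(8 * 0)]      /* save x2/x3 at sp + 0 */
    stp     x0, x1, [sp, #(8 * 2)]      /* save x0/x1 at sp + 16 */
    mov     w0, #ARM_SMCCC_ARCH_WORKAROUND_1_FID   /* function ID (assumed form) */
    smc     #0                          /* the call that was missing */
    ldp     x2, x3, [sp, #(8 * 0)]      /* restore x2/x3 */
    ldp     x0, x1, [sp, #(8 * 2)]      /* restore x0/x1, discarding the FID in w0 */
    add     sp, sp, #(8 * 4)            /* release the 32 bytes */
```

With the smc in place, restoring x0 over the function ID is intentional: the guest's register state must be unchanged when the trap helper runs.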

Btw, you can use something like stp    x0, x1, [sp, #-16]! to avoid manual adjustment of sp. This will save you two instructions.
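
A sketch of what this suggestion could look like, using pre-indexed stores and post-indexed loads so the separate sub and add disappear. It is an illustration only, not code from the patch; the mov and smc lines are assumed from the discussion, as is the use of ARM_SMCCC_ARCH_WORKAROUND_1_FID as a mov immediate:

```
    stp     x2, x3, [sp, #-16]!         /* sp -= 16, then save x2/x3 */
    stp     x0, x1, [sp, #-16]!         /* sp -= 16, then save x0/x1 */
    mov     w0, #ARM_SMCCC_ARCH_WORKAROUND_1_FID   /* function ID (assumed form) */
    smc     #0
    ldp     x0, x1, [sp], #16           /* restore x0/x1, then sp += 16 */
    ldp     x2, x3, [sp], #16           /* restore x2/x3, then sp += 16 */
```

This is six instructions instead of eight, and sp stays 16-byte aligned throughout. The cost is that sp is written four times instead of twice.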

It was pointed out on Linux Arm that updating sp once *might* be faster on some uarch.

So is this code targeted at some specific uarch? If so, I would like to see a comment describing why you chose this approach.

I can't confirm whether this will improve uarch A, B, C or Z. I just followed the suggestion on Linux Arm (see [1]) and my personal preference for how to write assembly code. The two approaches are quite similar, so why would I choose the other way around?


[1] https://www.spinics.net/lists/arm-kernel/msg626659.html

Julien Grall
