
Re: [Xen-devel] [PATCH 4/5] xen/arm: optee: handle share buffer translation error



Hi Volodymyr,

On 9/11/19 7:32 PM, Volodymyr Babchuk wrote:

Julien Grall writes:

Hi Volodymyr,

On 8/23/19 7:48 PM, Volodymyr Babchuk wrote:
There is a possible case when OP-TEE asks the guest to allocate a shared
buffer, but Xen for some reason can't translate the buffer's addresses. In
this situation we should do two things:

1. Tell the guest to free the allocated buffer, so there will be no
memory leak in the guest.

2. Tell OP-TEE that the buffer allocation failed.

To ask the guest to free the allocated buffer we should do the same
thing as OP-TEE does: issue an RPC request. This is done by filling the
request buffer (luckily, we can reuse the same buffer that OP-TEE used
to issue the original request) and then returning to the guest with a
special return code.

Then we need to handle the next call from the guest in a special way:
as the RPC was issued by Xen, not by OP-TEE, it should be handled by
Xen. Basically, this is the mechanism to preempt the OP-TEE mediator.

The same mechanism can be used in the future to preempt the mediator
during translation of large (>512 pages) shared buffers.
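
For concreteness, a rough sketch of the resulting dispatch on the
guest's next call (dispatch_rpc_return() is a hypothetical name for the
existing OP-TEE return path; handle_xen_rpc_return() is what this patch
adds):

    switch ( call->state )
    {
    case OPTEEM_CALL_NORMAL:
        /* The RPC was issued by OP-TEE: forward the reply to OP-TEE. */
        dispatch_rpc_return(ctx, regs, call);
        break;
    case OPTEEM_CALL_XEN_RPC:
        /* The RPC was issued by Xen: consume the reply in the mediator. */
        handle_xen_rpc_return(ctx, regs, call, shm_rpc);
        break;
    }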

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx>
---
   xen/arch/arm/tee/optee.c | 167 +++++++++++++++++++++++++++++++--------
   1 file changed, 136 insertions(+), 31 deletions(-)

diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 3ce6e7fa55..4eebc60b62 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -96,6 +96,11 @@
                                 OPTEE_SMC_SEC_CAP_UNREGISTERED_SHM | \
                                 OPTEE_SMC_SEC_CAP_DYNAMIC_SHM)
   +enum optee_call_state {
+    OPTEEM_CALL_NORMAL = 0,

enums always start counting at 0. Also, looking at the code, it does
not seem that you need to know the value. Right?
Yep. This is a bad habit. Will remove.
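
For reference, a minimal illustration of the point: in C the first
enumerator defaults to 0, so the explicit initializer can simply be
dropped:

    /* The first enumerator is 0 by default, so "= 0" is redundant. */
    enum optee_call_state {
        OPTEEM_CALL_NORMAL,     /* implicitly 0 */
        OPTEEM_CALL_XEN_RPC,    /* implicitly 1 */
    };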


+    OPTEEM_CALL_XEN_RPC,

I am a bit confused: the enum is called optee_call_state but all the
enum values are prefixed with OPTEEM_CALL_. Why the discrepancy?
Because I'm bad at naming things :)

OPTEEM_CALL_STATE_XEN_RPC looks too long. But you are right, so I'll
rename the enum values. Unless, you have a better idea for this.

My point was not about adding _STATE to the enum values but the fact that you call the enum optee_call_state yet prefix the values with OPTEEM (note the extra M in the latter).

So my only request here is to call the enum opteem_call_state or prefix all the enum values with OPTEE_; both options are sketched below.
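
To make the two options concrete (a sketch, either variant would do):

    /* Option 1: rename the enum to match the existing value prefix. */
    enum opteem_call_state {
        OPTEEM_CALL_NORMAL,
        OPTEEM_CALL_XEN_RPC,
    };

    /* Option 2: keep the enum name, drop the extra M from the values. */
    enum optee_call_state {
        OPTEE_CALL_NORMAL,
        OPTEE_CALL_XEN_RPC,
    };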



+};
+
   static unsigned int __read_mostly max_optee_threads;
     /*
@@ -112,6 +117,9 @@ struct optee_std_call {
       paddr_t guest_arg_ipa;
       int optee_thread_id;
       int rpc_op;
+    /* Saved buffer type for the last buffer allocate request */

Looking at the code, it feels to me that you are saving the buffer type
for the current command and not the last one. Did I miss anything?
Yes, right. Will rename.

+    unsigned int rpc_buffer_type;
+    enum optee_call_state state;
       uint64_t rpc_data_cookie;
       bool in_flight;
       register_t rpc_params[2];
@@ -299,6 +307,7 @@ static struct optee_std_call *allocate_std_call(struct optee_domain *ctx)
         call->optee_thread_id = -1;
       call->in_flight = true;
+    call->state = OPTEEM_CALL_NORMAL;
         spin_lock(&ctx->lock);
       list_add_tail(&call->list, &ctx->call_list);
@@ -1075,6 +1084,10 @@ static int handle_rpc_return(struct optee_domain *ctx,
               ret = -ERESTART;
           }
+        /* Save the buffer type in case we will want to free it */
+        if ( shm_rpc->xen_arg->cmd == OPTEE_RPC_CMD_SHM_ALLOC )
+            call->rpc_buffer_type = shm_rpc->xen_arg->params[0].u.value.a;
+
           unmap_domain_page(shm_rpc->xen_arg);
       }
   @@ -1239,18 +1252,102 @@ err:
       return;
   }
   +/*
+ * Prepare an RPC request to free the shared buffer in the same way as
+ * OP-TEE does.
+ *
+ * Return values:
+ *  true  - successfully prepared RPC request
+ *  false - there was an error
+ */
+static bool issue_rpc_cmd_free(struct optee_domain *ctx,
+                               struct cpu_user_regs *regs,
+                               struct optee_std_call *call,
+                               struct shm_rpc *shm_rpc,
+                               uint64_t cookie)
+{
+    register_t r1, r2;
+
+    /* In case the guest forgets to update it with a meaningful value */
+    shm_rpc->xen_arg->ret = TEEC_ERROR_GENERIC;
+    shm_rpc->xen_arg->cmd = OPTEE_RPC_CMD_SHM_FREE;
+    shm_rpc->xen_arg->num_params = 1;
+    shm_rpc->xen_arg->params[0].attr = OPTEE_MSG_ATTR_TYPE_VALUE_INPUT;
+    shm_rpc->xen_arg->params[0].u.value.a = call->rpc_buffer_type;
+    shm_rpc->xen_arg->params[0].u.value.b = cookie;
+
+    if ( access_guest_memory_by_ipa(current->domain,
+                                    gfn_to_gaddr(shm_rpc->gfn),
+                                    shm_rpc->xen_arg,
+                                    OPTEE_MSG_GET_ARG_SIZE(1),
+                                    true) )
+    {
+        /*
+         * Well, this is quite bad. We have error in error path.
+         * This can happen only if guest behaves badly, so all
+         * we can do is to return error to OP-TEE and leave
+         * guest's memory leaked.

Could you expand a bit more what you mean by "guest's memory leaked"?
There will be a memory leak somewhere in the guest. Yes, it looks
like the comment is misleading...

What I mean is that OP-TEE requests the guest to allocate some
memory. The guest does not know when OP-TEE finishes using this memory,
so the guest will free the memory only at OP-TEE's request. We can't
emulate this request in the current circumstances, so the guest will
keep part of its own memory reserved for OP-TEE indefinitely.

What is the state of the page from Xen's PoV?
From Xen's point of view all will be perfectly fine.

I.e. is there any reference
taken by the OP-TEE mediator? Will the page be freed once the guest is
destroyed?...
As I said, it has nothing to do with the page as Xen sees it. The
mediator will call put_page() prior to entering this function. So, no
Xen resources are used.

That makes sense, thank you for the explanation. Please update the comment accordingly; a possible rewording is sketched below.
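
A possible rewording, capturing the points above (a suggestion only):

    /*
     * Well, this is quite bad: an error in the error path. This can
     * happen only if the guest behaves badly, so all we can do is
     * return an error to OP-TEE. The guest will keep the buffer
     * reserved for OP-TEE indefinitely, but Xen holds no reference on
     * the pages at this point (put_page() was already called), so
     * nothing is leaked on the Xen side.
     */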



+         */
+        shm_rpc->xen_arg->ret = TEEC_ERROR_GENERIC;
+        shm_rpc->xen_arg->num_params = 0;
+
+        return false;
+    }
+
+    uint64_to_regpair(&r1, &r2, shm_rpc->cookie);
+
+    call->state = OPTEEM_CALL_XEN_RPC;
+    call->rpc_op = OPTEE_SMC_RPC_FUNC_CMD;
+    call->rpc_params[0] = r1;
+    call->rpc_params[1] = r2;
+    call->optee_thread_id = get_user_reg(regs, 3);
+
+    set_user_reg(regs, 0, OPTEE_SMC_RETURN_RPC_CMD);
+    set_user_reg(regs, 1, r1);
+    set_user_reg(regs, 2, r2);
+
+    return true;
+}
+
+/* Handles return from Xen-issued RPC */
+static void handle_xen_rpc_return(struct optee_domain *ctx,
+                                  struct cpu_user_regs *regs,
+                                  struct optee_std_call *call,
+                                  struct shm_rpc *shm_rpc)
+{
+    call->state = OPTEEM_CALL_NORMAL;
+
+    /*
+     * Right now we have only one reason to be here: we asked the guest
+     * to free the shared buffer and it did. Now we can tell OP-TEE that
+     * the buffer allocation failed.
+     */

Should we add an ASSERT to ensure the command is the one we expect?
It is strange that it is missing, actually. Looks like I forgot to add
it. But, looking at xen-error-handling, maybe BUG_ON() would be better?

The documentation in xen-error-handling needs some update. IIRC George had a patch for updating the documentation on the mailing list.

BUG_ON() (and BUG()) should only be used for errors the hypervisor can't recover from. I am actually slowly going through the tree and removing the ones that are in the guest path, as some could be triggered on a new revision of the architecture :(.

In this case, we are in the guest path and in an error case. If something has been missed, the guest may trigger the BUG_ON(). While this is only a DoS, it is still not desirable.

So there are three solutions:
   1) Crash the guest
   2) Add an ASSERT()
   3) Print a warning

This is an error path, so 2) might be less desirable if we don't have full coverage of the code in debug mode. For illustration, 1) and 3) could be combined as sketched below.
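
A rough sketch, combining options 1) and 3); gprintk() and
domain_crash() are existing Xen helpers, but the exact condition and
its placement in handle_xen_rpc_return() are only an assumption:

    /*
     * The only RPC Xen issues so far is OPTEE_RPC_CMD_SHM_FREE. If the
     * guest comes back with anything else, warn and crash the offending
     * guest instead of using BUG_ON(), as this error path is
     * guest-triggerable.
     */
    if ( shm_rpc->xen_arg->cmd != OPTEE_RPC_CMD_SHM_FREE )
    {
        gprintk(XENLOG_ERR,
                "optee: unexpected RPC command %u on return from Xen-issued RPC\n",
                shm_rpc->xen_arg->cmd);
        domain_crash(current->domain);
        return;
    }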

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

