[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 09/14] livepatch: Add per-function applied/reverted state tracking marker

On 22. Aug 2019, at 12:29, Julien Grall <julien.grall@xxxxxxx> wrote:

Hi Pawel,

Hi Julien,

Thank for looking at this!

On 21/08/2019 09:19, Pawel Wieczorkiewicz wrote:
Livepatch only tracks an entire payload applied/reverted state. But,
with an option to supply the apply_payload() and/or revert_payload()
functions as optional hooks, it becomes possible to intermix the
execution of the original apply_payload()/revert_payload() functions
with their dynamically supplied counterparts.
It is important then to track the current state of every function
being patched and prevent situations of unintentional double-apply
or unapplied revert.
To support that, it is necessary to extend public interface of the
livepatch. The struct livepatch_func gets additional field holding
the applied/reverted state marker.
To reflect the livepatch payload ABI change, bump the version flag
The above solution only applies to x86 architecture for now.
Signed-off-by: Pawel Wieczorkiewicz <wipawel@xxxxxxxxx>
Reviewed-by: Andra-Irina Paraschiv <andraprs@xxxxxxxxxx>
Reviewed-by: Bjoern Doebel <doebel@xxxxxxxxx>
Reviewed-by: Martin Pohlack <mpohlack@xxxxxxxxx>
 xen/arch/x86/livepatch.c    | 20 +++++++++++++++++++-
 xen/common/livepatch.c      | 35 +++++++++++++++++++++++++++++++++++
 xen/include/public/sysctl.h | 11 ++++++++++-
 xen/include/xen/livepatch.h |  2 +-
 4 files changed, 65 insertions(+), 3 deletions(-)
diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c
index 436ee40fe1..76fa91a082 100644
--- a/xen/arch/x86/livepatch.c
+++ b/xen/arch/x86/livepatch.c
@@ -61,6 +61,14 @@ void noinline arch_livepatch_apply(struct livepatch_func *func)
     if ( !len )
 +    /* If the apply action has been already executed on this function, do nothing... */
+    if ( func->applied == LIVEPATCH_FUNC_APPLIED )
+    {
+        printk(XENLOG_WARNING LIVEPATCH "%s: %s has been already applied before\n",
+                __func__, func->name);
+        return;
+    }

Does this need to in arch specific code?

As a matter of fact it does not.
Let me enable this feature for Arm as well and make this code a common inline.

     memcpy(func->opaque, old_ptr, len);
     if ( func->new_addr )
@@ -77,15 +85,25 @@ void noinline arch_livepatch_apply(struct livepatch_func *func)
         add_nops(insn, len);
       memcpy(old_ptr, insn, len);
+    func->applied = LIVEPATCH_FUNC_APPLIED;
  * "noinline" to cause control flow change and thus invalidate I$ and
  * cause refetch after modification.
-void noinline arch_livepatch_revert(const struct livepatch_func *func)
+void noinline arch_livepatch_revert(struct livepatch_func *func)
+    /* If the apply action hasn't been executed on this function, do nothing... */
+    if ( !func->old_addr || func->applied == LIVEPATCH_FUNC_NOT_APPLIED )
+    {
+        printk(XENLOG_WARNING LIVEPATCH "%s: %s has not been applied before\n",
+                __func__, func->name);
+        return;
+    }
     memcpy(func->old_addr, func->opaque, livepatch_insn_len(func));
+    func->applied = LIVEPATCH_FUNC_NOT_APPLIED;
diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
index 585ec9819a..090a48977b 100644
--- a/xen/common/livepatch.c
+++ b/xen/common/livepatch.c
@@ -1242,6 +1242,29 @@ static inline void revert_payload_tail(struct payload *data)
     data->state = LIVEPATCH_STATE_CHECKED;
+ * Check if an action has applied the same state to all payload's functions consistently.
+ */
+static inline bool was_action_consistent(const struct payload *data, livepatch_func_state_t expected_state)
+    int i;
+    for ( i = 0; i < data->nfuncs; i++ )
+    {
+        struct livepatch_func *f = &(data->funcs[i]);
+        if ( f->applied != expected_state )

Per the definition of livepath_func, the field "applied" only exists for x86. So this will not compiled on Arm.

I will enable Arm too.

+        {
+            printk(XENLOG_ERR LIVEPATCH "%s: Payload has a function: '%s' with inconsistent applied state.\n",
+                   data->name, f->name ?: "noname");
+            return false;
+        }
+    }
+    return true;
  * This function is executed having all other CPUs with no deep stack (we may
  * have cpu_idle on it) and IRQs disabled.
@@ -1268,6 +1291,9 @@ static void livepatch_do_action(void)
             rc = apply_payload(data);
 +        if ( !was_action_consistent(data, rc ? LIVEPATCH_FUNC_NOT_APPLIED : LIVEPATCH_FUNC_APPLIED) )

Regardless the compilation issue above, none of the common code will set the state. So for Arm, this would likely mean the function return false.

True, I should have disabled this code for Arm too.
But now, let me just make it common for all archs.

+            panic("livepatch: partially applied payload '%s'!\n", data->name);

I can see at least a case where the panic can be reached here. Per the changes in arch_livepatch_apply(), f->applied will only be set to LIVEPATCH_FUNC_APPLIED if livepatch_insn_len() is not 0.

I don't know whether livepatch_insn_len() can ever return 0 as my livepatch knowledge is limited. But the code live little doubt this is a theoritical possibility that after this patch will turn into crashing Xen.

I suppose the livepatch_insn_len() returning 0 (i.e. func->new_size being 0, which is valid AFAICS) is the to-be-implemented feature of noping out code:
From the livepatch doc:
## TODO Goals

 * NOP out the code sequence if `new_size` is zero.
So, in theory it can be 0 (it has to be introduced to the livepatch.funcs). Thus, you’re right.

I will fix that, but marking such function (new_size==0) as applied.


More generally, I am not very comfortable to see panic() in the middle of the code. Could you explain why panic is the best solution over reverting the work?

Yes. Production-ready hotpatches must not contain inconsistent hooks or functions-to-be-applied/reverted.
The goal here is to detect such hotpatches and fail hard immediately highlighting the fact that such hotpatch
is broken.

The inconsistent application of a hotpatch (some function applied, some reverted while other left behined) leaves
the system in a very bad state. I think the best what we could do here is panic() to enable post-mortem analysis
of what went wrong and avoid leaking such system into production.

Mis-detecting such conditions and leaving such inconsistent system running is a very bad idea. 

My question applies for all the other panic() below.

         if ( rc == 0 )
@@ -1282,6 +1308,9 @@ static void livepatch_do_action(void)
             rc = revert_payload(data);
 +        if ( !was_action_consistent(data, rc ? LIVEPATCH_FUNC_APPLIED : LIVEPATCH_FUNC_NOT_APPLIED) )
+            panic("livepatch: partially reverted payload '%s'!\n", data->name);
         if ( rc == 0 )
@@ -1304,6 +1333,9 @@ static void livepatch_do_action(void)
                 other->rc = revert_payload(other);
   +            if ( !was_action_consistent(other, rc ? LIVEPATCH_FUNC_APPLIED : LIVEPATCH_FUNC_NOT_APPLIED) )
+                panic("livepatch: partially reverted payload '%s'!\n", other->name);
             if ( other->rc == 0 )
@@ -1324,6 +1356,9 @@ static void livepatch_do_action(void)
                 rc = apply_payload(data);
 +            if ( !was_action_consistent(data, rc ? LIVEPATCH_FUNC_NOT_APPLIED : LIVEPATCH_FUNC_APPLIED) )
+                panic("livepatch: partially applied payload '%s'!\n", data->name);
             if ( rc == 0 )
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 1b2b165a6d..b55ad6d050 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -818,7 +818,7 @@ struct xen_sysctl_cpu_featureset {
  *     If zero exit with success.
  * .livepatch.funcs structure layout defined in the `Payload format`
  * section in the Live Patch design document.
@@ -826,6 +826,11 @@ struct xen_sysctl_cpu_featureset {
  * We guard this with __XEN__ as toolstacks SHOULD not use it.
 #ifdef __XEN__
+typedef enum livepatch_func_state {

AFAIK, enum will always start counting from 0, so is it really necessary to specify the exact values?

It’s probably my (or some earlier reviewer's) OCD. I am happy to remove these.

+} livepatch_func_state_t;
 struct livepatch_func {
     const char *name;       /* Name of function to be patched. */
     void *new_addr;
@@ -834,6 +839,10 @@ struct livepatch_func {
     uint32_t old_size;
     uint8_t version;        /* MUST be LIVEPATCH_PAYLOAD_VERSION. */
     uint8_t opaque[31];
+#if defined CONFIG_X86
+    uint8_t applied;
+    uint8_t _pad[7];

Above, you increase the version of the payload here even for Arm when there are no modification in Arm. This raises the question on whether we would need to increase the version when Arm is going to be supported?

I think the best way forward is to add Arm support.
This feature is simple enough and it does not make much sense to complicate it by adding support spaghetti.

 typedef struct livepatch_func livepatch_func_t;
diff --git a/xen/include/xen/livepatch.h b/xen/include/xen/livepatch.h
index 2aec532ee2..a93126f631 100644
--- a/xen/include/xen/livepatch.h
+++ b/xen/include/xen/livepatch.h
@@ -117,7 +117,7 @@ int arch_livepatch_quiesce(void);
 void arch_livepatch_revive(void);
   void arch_livepatch_apply(struct livepatch_func *func);
-void arch_livepatch_revert(const struct livepatch_func *func);
+void arch_livepatch_revert(struct livepatch_func *func);
 void arch_livepatch_post_action(void);
   void arch_livepatch_mask(void);


Julien Grall

Best Regards,
Pawel Wieczorkiewicz

Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.