[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] docs: update xenstore migration design document



On 15.04.20 10:09, Paul Durrant wrote:
-----Original Message-----
From: Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of Juergen 
Gross
Sent: 14 April 2020 17:00
To: xen-devel@xxxxxxxxxxxxxxxxxxxx
Cc: Juergen Gross <jgross@xxxxxxxx>; Stefano Stabellini 
<sstabellini@xxxxxxxxxx>; Julien Grall
<julien@xxxxxxx>; Wei Liu <wl@xxxxxxx>; Andrew Cooper 
<andrew.cooper3@xxxxxxxxxx>; Ian Jackson
<ian.jackson@xxxxxxxxxxxxx>; George Dunlap <george.dunlap@xxxxxxxxxx>; Jan Beulich 
<jbeulich@xxxxxxxx>
Subject: [PATCH] docs: update xenstore migration design document

In the past there have been several attempts to make Xenstore
restartable. This requires to transfer the internal Xenstore state to
the new instance. With the Xenstore migration protocol added recently
to Xen's documentation a first base has been defined to represent the
state of Xenstore. This can be expanded a little bit in order to have
a full state representation which is needed as a first step for live
updating Xenstore.

Add some definitions to designs/xenstore-migration.md which are needed
for live update of xenstored.

Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
---
  docs/designs/xenstore-migration.md | 90 ++++++++++++++++++++++++++++++++++++--
  1 file changed, 87 insertions(+), 3 deletions(-)

diff --git a/docs/designs/xenstore-migration.md 
b/docs/designs/xenstore-migration.md
index 6ab351e8fe..09bb4700b4 100644
--- a/docs/designs/xenstore-migration.md
+++ b/docs/designs/xenstore-migration.md
@@ -9,6 +9,10 @@ records must include details of registered xenstore watches as 
well as
  content; information that cannot currently be recovered from `xenstored`,
  and hence some extension to the xenstore protocol[2] will also be required.

+As a similar set of data is needed for transferring xenstore data from
+one instance to another when live updating xenstored the same definitions
+are being used.
+

That makes sense, but using an external entity to extract the state makes less 
sense in the context of live update so, going
forward, I suggest dropping the section on extra commands.

Right, this (if still needed) should rather go to docs/misc/xenstore.txt

Also, we should define a separate 'xenstore domain image format' which can be 
included as an opaque entity in the migration stream,
in the same way as the qemu save image.

Fine with me.


  The *libxenlight Domain Image Format* specification[3] already defines a
  record type `EMULATOR_XENSTORE_DATA` but this is not suitable for
  transferring xenstore data pertaining to the domain directly as it is
@@ -48,7 +52,10 @@ where type is one of the following values
  |        | 0x00000001: NODE_DATA                            |
  |        | 0x00000002: WATCH_DATA                           |
  |        | 0x00000003: TRANSACTION_DATA                     |
-|        | 0x00000004 - 0xFFFFFFFF: reserved for future use |
+|        | 0x00000004: TRANSACTION_NODE_DATA                |
+|        | 0x00000005: GUEST_RING_DATA                      |
+|        | 0x00000006: DOMAIN_START (live update only)      |
+|        | 0x00000007 - 0xFFFFFFFF: reserved for future use |


  and data is one of the record data formats described in the following
@@ -79,7 +86,7 @@ as follows:
  +-------------------------------+
  | perm count (N)                |
  +-------------------------------+
-| perm0                         |
+| perm1                         |
  +-------------------------------+
  ...
  +-------------------------------+
@@ -93,7 +100,7 @@ as follows:
  +-------------------------------+
  ```

-where perm0..N are formatted as follows:
+where perm1..N are formatted as follows:


  ```
@@ -164,6 +171,83 @@ as follows:
  where tx_id is the non-zero identifier values of an open transaction.


+**TRANSACTION_NODE_DATA**
+
+
+Each TRANSACTION_NODE_DATA record specifies a transaction local xenstore
+node. Its is similar to the NODE_DATA record with the addition of a
+transaction id:
+
+```
+    0       1       2       3     octet
++-------+-------+-------+-------+
+| TRANSACTION_NODE_DATA         |
++-------------------------------+
+| tx_id                         |
++-------------------------------+
+| path length                   |
++-------------------------------+
+| path data                     |
+...
+| pad (0 to 3 octets)           |
++-------------------------------+
+| perm count (N)                |
++-------------------------------+
+| perm1                         |
++-------------------------------+
+...
++-------------------------------+
+| permN                         |
++-------------------------------+
+| value length                  |
++-------------------------------+
+| value data                    |
+...
+| pad (0 to 3 octets)           |
++-------------------------------+
+```
+
+where perm1..N are formatted as specified in the NODE_DATA record. A perm
+count of 0 denotes a node having been deleted in the transaction.
+

Transferring this state means we can presumably drop the TRANSACTION_DATA, 
since we can infer open transactions from the presence of
these records?

No, I don't think so. Imagine a case where just a transaction has
been started without any further activity in this transaction having
happened yet.


+
+**GUEST_RING_DATA**
+
+
+The GUEST_RING_DATA record is used to transfer data which is pending to be
+written to the guest's xenstore ring buffer. It si formatted as follows:
+
+
+```
++-------+-------+-------+-------+
+| GUEST_RING_DATA               |
++-------------------------------+
+| value length                  |
++-------------------------------+
+| value data                    |
+...
+| pad (0 to 3 octets)           |
++-------------------------------+
+```
+
+**DOMAIN_START**
+
+
+For live updating xenstored data of multiple domains needs to be transferred.
+For this purpose the DOMAIN_START record is being used. All records of types
+other than NODE_DATA always relate to the last DOMAIN_START record in the
+stream. A DOMAIN_START record just contains a domain-id:

If we define a separate stream format as I suggest above, I'd expect the 
domain-id to be part of the header.

Yes.


+
+
+```
++-------+-------+-------+-------+
+| DOMAIN_START                  |
++-------------------------------+
+| domid         | pad           |
++-------------------------------+
+```
+
+
  ### Protocol Extension

As mentioned above, this section ought to be dropped for the moment. We should 
define the ways in which xenstored should dump state
in various scenarios: e.g. for LU it will likely be serialize to memory 
(possibly dev/shmem?), for migration/save-restore it will
probably be serialize to fd (socket/file). We should also define how dumping 
state will be triggered.

I was planning to use XS_CONTROL for this purpose. But I'm open for
other solutions, too. And even using XS_CONTROL might want us to
have a library routine for that purpose encapsulating the way how the
extracted state is being transferred.

In the LU case the state dumping will need to be handled Xenstore
internally (especially in stubdom this will just be a memory blob inside
the stubdom, while in the daemon case it could easily be a file).


Juergen



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.