
Re: [Xen-devel] libxl - API call to return sxpr of a domain?

On Tue, Jun 7, 2011 at 12:16 PM, Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx> wrote:
On Tue, 2011-06-07 at 16:30 +0100, Shriram Rajagopalan wrote:
> On Tue, Jun 7, 2011 at 5:02 AM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
>         On Tue, 2011-06-07 at 04:30 +0100, Shriram Rajagopalan wrote:
>         > I am looking into adding Remus support for libxl. The easiest way
>         > is to obtain the domain's sxpr, so that the rest of the Remus
>         > python code stays as is.
>         >
>         > Is there an API call in libxl to return a domain's sxpr? A grep
>         > on the libxl code base returned nothing. Or am I missing
>         > something pretty obvious?
>         xl has some code to do this but libxl doesn't. An sxpr
>         representation of a domain is rather a xend-specific concept,
>         which is the only reason xl has it.
>         There are some plans to allow libxl to generate JSON for any of
>         the IDL-defined data structures, mostly as a convenient
>         pretty-printer, but being machine-parsable is a handy side
>         effect. Currently this would just be for individual data
>         structures though.
>         Where/how does Remus use sxp?
>         tools/python/xen/remus/vm.py:domtosxpr() seems to consume a xend
>         data structure and make a Remus sxp out of it -- can an xl
>         equivalent not be written using the python bindings? (NB bindings
>         may be incomplete, we can fix up as you discover stuff.) Are all
>         usages of sxp in Remus of that particular sxp format, or are
>         there others?
> The only reason Remus uses sxpr is that xend conveys info in that form.
> Basically, it only needs the vif device name (vif1.0, etc.), the disk
> device name and the access format (tap/drbd) for proper operation.

ok, this stuff should be available to xl/libxl (as appropriate) pretty
easily.

> The reason for bypassing the usual xend live migration code path is
> because of the callbacks, the checkpoint-interval-based suspend/resume,
> etc. Now that I know that xl/libxl doesn't use sxpr in its wire
> protocol (dunce! :( ), the plan would have to be different.
> (a) Follow the same implementation style as with xend (bypass xl's
> live migration mechanism) - involves some code duplication, probably
> for communicating with the remote machine, in xl's wire protocol. The
> advantage is that most of Remus' python code (save.py, device.py,
> qdisc.py, code to install/parse IFB devices, tc rules, etc.) stays as
> is.
> (b) Integrate the Remus control flow into the xl/libxl stack - I don't
> know how much work that would be yet.

I don't know enough about the needs etc. of Remus to make much in the way
of concrete proposals, but in general plan (b) is the sort of thing we
would prefer, since all toolstacks can then benefit (at least to some
extent).

Certainly I would prefer to see libxl functions which provide the
necessary interfaces (likely sharing common code within the library)
rather than duplication of the code.

Perhaps you could quickly explain the Remus architecture within the xend
world, which might help us to advise. e.g. How are things different on
the tx and rx sides with and without Remus? What additional callbacks
and control flow are there, etc.?

Do I gather correctly that the thing on the receiving end is not xend
but rather a Remus process?

On the receiving end, there is "no" Remus receiver process.
Well, there are some Remus-related patches that have long been integrated
into xc_domain_restore, but apart from that, everything else is as-is.

The only Remus-specific part on the rx side is the blktap2 userspace driver
(block-remus), which again gets activated by the usual Xend control flow (as
it tries to create a tap device). But I don't think this needs special
treatment as long as xl can parse/accept a spec like
 tap:remus:backupHost:port|aio:/dev/foo (or tap2:remus:.. )
and launch the appropriate blktap2 backend driver (this system is already in
place, afaik).
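For illustration, splitting such a spec could look like the following minimal sketch. The field layout is inferred purely from the example string above; the function name and returned dict are hypothetical and do not reflect xl's actual parser.

```python
def parse_remus_disk_spec(spec):
    """Split a spec of the form 'tap:remus:backupHost:port|aio:/dev/foo'
    (layout assumed from the example above) into its parts."""
    # Everything before '|' addresses the remus driver; after it is the
    # underlying backend driver and device path.
    remus_part, _, backend_part = spec.partition('|')
    fields = remus_part.split(':')
    if len(fields) != 4 or fields[0] not in ('tap', 'tap2') or fields[1] != 'remus':
        raise ValueError('not a remus tap spec: %s' % spec)
    driver, _, path = backend_part.partition(':')
    return {'host': fields[2], 'port': int(fields[3]),
            'backend': driver, 'path': path}
```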

The bulk of the Remus transmission code is in libxc and hence is agnostic
to both xend and xl. It basically prolongs the last iteration for eternity.
It supplies a callback handler for checkpoint, which adds the "wait" time
before the next suspend (e.g., suspend every 50ms). In the case of Xend,
the checkpoint handler is not supplied and hence the domain is suspended as
soon as the previous iteration finishes.
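The role of the optional checkpoint handler can be sketched in Python. This is a toy model, not libxc code: the names and the demo cutoff of three rounds are invented; the real decision happens inside xc_domain_save.

```python
import time

CHECKPOINT_INTERVAL = 0.050  # 50 ms, as in the example above

def checkpoint_cb():
    # Hypothetical stand-in for the Remus checkpoint handler: it just
    # waits out the checkpoint interval before the next suspend.
    time.sleep(CHECKPOINT_INTERVAL)

def save_last_iteration(checkpoint=None):
    # Simplified model of the tail of xc_domain_save: with no checkpoint
    # handler (plain live migration) the final iteration runs once; with
    # one (Remus), the "last" iteration repeats indefinitely, pausing for
    # the checkpoint interval between rounds. Returns the round count.
    rounds = 0
    while True:
        rounds += 1          # suspend, send dirty pages, resume ...
        if checkpoint is None:
            break            # plain migration: done after one pass
        if rounds >= 3:      # demo cutoff instead of looping forever
            break
        checkpoint()
    return rounds
```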

(a) On the sending side, without Remus, the Xend control flow is as follows:
   xm migrate --live <domain> <host>
     (i) XendCheckpoint:save [which writes the signature record and sxp to the socket]
          and issues "xc_save <params>"
     (ii) xc_save calls xc_domain_save with appropriate callback handlers for suspend
            & switch_qemu_logdirty only. These handlers are in libxc/xcutils/xc_save.c.
     (iii) xc_domain_save:
              send dirty pages for max_iters
              if (last_iter) suspend_callback()
              send final set of dirty pages
              send tailbuf data

The callback structure has two other handlers (postcopy, aka postresume,
and checkpoint) that are used by Remus.
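A rough Python mirror of that callback set, just to make the two configurations concrete. The field names follow the handlers mentioned above, but the class itself is hypothetical; the authoritative layout is libxc's save callback structure.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class SaveCallbacks:
    # Handlers supplied by every caller (cf. xc_save.c):
    suspend: Callable[[], int]
    switch_qemu_logdirty: Callable[[int], int]
    # Handlers only Remus supplies:
    postcopy: Optional[Callable[[], int]] = None   # aka postresume
    checkpoint: Optional[Callable[[], int]] = None

def is_remus(cb: SaveCallbacks) -> bool:
    # Absence of a checkpoint handler means plain live migration;
    # Remus supplies all four handlers.
    return cb.checkpoint is not None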
(b) On the sending side, with Remus:
      remus <domain> <host>
         (i) tools/remus/remus:
            - calls tools/python/xen/remus/vm.py:VM(domid)
            - vm.py:VM issues an xmlrpc call to Xend to obtain the domid's sxpr and extracts the disk/vif info.
         (ii) create the "buffers" for disk & vif.
         (iii) Connect to the remote host's Xend socket and send the sxp info. [same as (i) in the non-Remus case]
         (iv) tools/python/xen/remus/save.py:Saver uses libcheckpoint to initiate checkpointing.
               tools/python/xen/lowlevel/checkpoint: has suspend/resume handlers similar to xc_save.c, plus
               trampoline functions to bounce the callbacks for suspend, postcopy and checkpoint to their
               python equivalents.
               tools/python/xen/lowlevel/checkpoint/libcheckpoint.c:checkpoint_start calls xc_domain_save with
               all needed callback handlers.
                   ---> functionally equivalent to (ii) in the non-Remus case.
         (v) xc_domain_save: (after the initial iterations)
               send dirty pages & tailbuf data
               postcopy_callback() [resumes domain]
                   netbuffer_checkpoint() [python - communicates via netlink to sch_plug]
                   diskbuffer_checkpoint() [python - communicates via fifo to block-remus]
                   sleep(50ms) [or whatever the checkpoint interval is]
               goto copypages
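The per-round ordering in step (v) can be sketched as a loop over injected handlers. Again a toy model with invented names: the handlers stand in for the real send/postcopy/buffer-flush operations, so their ordering is the only thing the sketch asserts.

```python
def remus_checkpoint_round(send, postcopy, net_ckpt, disk_ckpt, wait, rounds=1):
    # Toy model of step (v) above: per checkpoint round, send the dirty
    # pages and tail buffer, resume the domain, flush the network and
    # disk buffers, then wait out the checkpoint interval before looping
    # back to "copypages".
    for _ in range(rounds):
        send()        # send dirty pages & tailbuf data
        postcopy()    # postcopy_callback(): resumes the domain
        net_ckpt()    # netbuffer_checkpoint(): release buffered packets
        disk_ckpt()   # diskbuffer_checkpoint(): commit buffered writes
        wait()        # sleep(checkpoint interval), e.g. 50ms
```

The key property is that the domain is already running again (postcopy) while the buffered network and disk output of the previous epoch is released, which is what the netlink/fifo buffer flushes protect.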

Hope that explains the control flow.

