Re: [Xen-devel] [PATCH 1 of 2 V3] libxl: Remus - suspend/postflush/commit callbacks



On 2012-02-09, at 4:38 AM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:

> On Fri, 2012-02-03 at 07:00 +0000, rshriram@xxxxxxxxx wrote:
>> # 
>> 
>> +/* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
>> +int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
>> +                             uint32_t domid, int send_fd, int recv_fd)
>> +{
>> +    GC_INIT(ctx);
>> +    libxl_domain_type type = libxl__domain_type(gc, domid);
>> +    int rc = 0;
>> +
>> +    if (info == NULL) {
>> +        LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
>> +                   "No remus_info structure supplied for domain %d", domid);
>> +        rc = ERROR_INVAL;
>> +        goto remus_fail;
>> +    }
>> +
>> +    /* TBD: Remus setup - i.e. attach qdisc, enable disk buffering, etc */
> 
> Is it worth checking that the domain has no disks or network (IOW is
> this dangerous if they do?)
> 

A domain with no disks or network wouldn't be of much use, though.
Is it dangerous if they are present but not checkpointed? Yes.
How dangerous depends on how critical the application is.
But this patch only intends to put the framework in place so that people can
at least play around with memory checkpointing.

> [...]
>> @@ -791,7 +837,27 @@ int libxl__domain_suspend_common(libxl__
>>     }
>> 
>>     memset(&callbacks, 0, sizeof(callbacks));
>> -    callbacks.suspend = libxl__domain_suspend_common_callback;
>> +    if (r_info != NULL) {
>> +        /* save_callbacks:
>> +         * suspend - called after expiration of checkpoint interval,
>> +         *           to *suspend* the domain.
>> +         *
>> +         * postcopy - called after the domain's dirty pages have been
>> +         *            copied into an output buffer. We *resume* the domain
>> +         *            & the device model, return to the caller. Caller then
>> +         *            flushes the output buffer, while the domain continues to run.
>> +         *
>> +         * checkpoint - called after the memory checkpoint has been flushed out
>> +         *              into the network. Send the saved device state, *wait*
>> +         *              for checkpoint ack and *release* the network buffer (TBD).
>> +         *              Then *sleep* for the checkpoint interval.
>> +         */
> 
> I think this comment would be more useful in xenguest.h next to the
> callback struct.
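
For reference, the per-round ordering that the comment describes can be sketched roughly as below. This is only an illustration, not the libxc code: `demo_callbacks`, `demo_trace`, `demo_make_callbacks` and `demo_run_rounds` are hypothetical names, and the "non-zero means continue, zero means stop" return convention is an assumption made for the sketch. The callbacks here just record the order in which they fire instead of touching a real domain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the save callbacks struct. The field names
 * follow the comment in the patch; the signatures are assumptions. */
struct demo_callbacks {
    int (*suspend)(void *data);    /* suspend the domain */
    int (*postcopy)(void *data);   /* resume domain after pages are buffered */
    int (*checkpoint)(void *data); /* send device state, wait for ack, sleep */
    void *data;
};

/* Records the order in which callbacks fire, one letter per call. */
struct demo_trace {
    char events[64];
    size_t len;
};

static int record(struct demo_trace *t, char c)
{
    if (t->len + 1 >= sizeof(t->events))
        return 0;                  /* trace full: report failure */
    t->events[t->len++] = c;
    t->events[t->len] = '\0';
    return 1;
}

static int demo_suspend(void *data)    { return record(data, 'S'); }
static int demo_postcopy(void *data)   { return record(data, 'P'); }
static int demo_checkpoint(void *data) { return record(data, 'C'); }

/* Wire the three recording callbacks to a trace buffer. */
struct demo_callbacks demo_make_callbacks(struct demo_trace *t)
{
    struct demo_callbacks cb = {
        demo_suspend, demo_postcopy, demo_checkpoint, t
    };
    return cb;
}

/* Run up to `rounds` checkpoint rounds and return how many completed.
 * Assumed convention: a callback returning 0 ends the loop. */
int demo_run_rounds(const struct demo_callbacks *cb, int rounds)
{
    int done = 0;

    assert(cb && cb->suspend && cb->postcopy && cb->checkpoint);
    while (done < rounds) {
        if (!cb->suspend(cb->data))
            break;
        /* ...dirty pages would be copied into the output buffer here... */
        if (!cb->postcopy(cb->data))
            break;
        /* ...output buffer would be flushed while the domain runs... */
        if (!cb->checkpoint(cb->data))
            break;
        done++;
    }
    return done;
}
```

Running two rounds against a zeroed `demo_trace` records suspend, postcopy, checkpoint in that order for each round. The point of the split is that the domain is only paused between suspend and postcopy; the flush and the ack wait happen while it is running again.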
> 
> Otherwise the patch looks good.
> 
> Ian.
> 
> 



 

