[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream



On 10/07/15 13:17, Ian Jackson wrote:
> Andrew Cooper writes ("[PATCH v2 16/27] tools/libxl: Infrastructure for 
> reading a libxl migration v2 stream"):
>> From: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
>>
>> This contains the event machinary and state machines to read an act on a
>> non-checkpointed migration v2 stream (with the exception of the
>> xc_domain_restore() handling which is spliced later in a bisectable way).
>>
>> It also contains some boilerplate to help support checkpointed streams, which
>> shall be introduced in a later patch.
> ...
>
>
>> +/* State for manipulating a libxl migration v2 stream */
>> +typedef struct libxl__stream_read_state libxl__stream_read_state;
> ...
>> +struct libxl__stream_read_state {
>> +    /* filled by the user */
>> +    libxl__ao *ao;
>> +    int fd;
>> +    void (*completion_callback)(libxl__egc *egc,
>> +                                libxl__stream_read_state *srs,
>> +                                int rc);
>> +    /* Private */
>> +    int rc;
>> +    bool running;
>> +
>> +    /* Main stream-reading data */
>> +    libxl__datacopier_state dc;
>> +    libxl__sr_hdr hdr;
>> +    libxl__sr_record_buf *incoming_record;
>> +    LIBXL_STAILQ_HEAD(, libxl__sr_record_buf) record_queue;
> This comes from malloc, not from the gc.  That needs a comment.

Ok

> (Is the same true of incoming_record ?  I think it is.)

Yes.  incoming_record is a partially-read record (almost all records
need splitting across two datacopier calls).

Once a record it complete, it is added to the tail of the record queue.

>
>> +    enum {
>> +        SRS_PHASE_NORMAL,
>> +    } phase;
>> +    bool recursion_guard;
>> +
>> +    /* Emulator blob handling */
>> +    libxl__datacopier_state emu_dc;
>> +    libxl__carefd *emu_carefd;
>> +};
> ...
>
> The comments in libxl_stream_read.c are good but I'm afraid I'm going
> to ask for some commentary about the legal states and/or the
> invariants in the data structure.
>
> That is, you've written a state machine but the states are not
> explicitly listed.
>
> As far as I can tell, from the outside, this machinery has the usual
> Undefined/Idle/Active states.  But does it not have more states on the
> inside ?  PHASE at least needs explaining.  See also comments about
> the record queue, below.
>
> For an example of the kind of thing I mean, see libxl_spawn.
> Particularly, libxl_internal.h:
>
>   * Each libxl__spawn_state is in one of these states
>   *    Undefined, Idle, Attached, Detaching
>
> and the much more detailed state and data structure table in
> libxl_exec.c
>
>  * Full set of possible states of a libxl__spawn_state and its _detachable:
>    
> I think you probably don't need to write down the external states for
> your srs machinery because the states are Undefined, Idle and Active
> like most things.  (Assuming you do something about the anomalous
> initialisation and checking of `running'.)
>
> What I'm missing is the internal description.
>

I will see what I can do.

>
>> +void libxl__stream_read_abort(libxl__egc *egc,
>> +                              libxl__stream_read_state *stream, int rc)
>> +{
>> +    stream_failed(egc, stream, rc);
> This has the same signature as stream_failed.  Perhaps stream_failed
> could be eliminated and replaced with this ?  Or maybe...
>
>> +static void stream_success(libxl__egc *egc, libxl__stream_read_state 
>> *stream)
>> +{
>> +    stream->rc = 0;
>> +    stream_done(egc, stream);
> This function has one call site.  As an alternative to abolishing
> stream_failed, maybe stream_failed should be renamed to
> `stream_complete' and then the call site for stream_success could
> simply say `stream_complete(egc, stream, 0)' ?

I have to admit that I find that paragdim specifically confusing to
read, which is why I deliberately split the _success() and _failed()
cases. 

>
>> +static void stream_failed(libxl__egc *egc,
>> +                          libxl__stream_read_state *stream, int rc)
>> +{
>> +    assert(rc);
>> +    stream->rc = rc;
>> +
>> +    if (stream->running) {
>> +        stream_done(egc, stream);
> Is the check for running necessary here ?

Yes.  An _abort() call from outside can happen at any arbitrary time. 
We must not deliver the callback twice.

(Although, from another email I see you want different semantics for
_abort(), so will try to follow that.)

>
>> +    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dcs->guest_domid);
>> +
>> +    libxl__carefd_begin();
>> +    writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
> Does this need to be a carefd ?  This is going to be a small amount of
> data, and the worst that happens is some other process inherits an fd
> onto a file which might otherwise be deleted sooner.

Who says the file will be deleted?  That will be down to the exact
semantics of the device model.

(I was asked to switch from a plain open() to a carefd as part of review
from v1).

>
> If it does, then please use libxl__carefd_opened.  (Best would be to
> libxl__carefd_opened to save and restore errno and then you can call
> it unconditionally after the open.)

I will go down this route.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.