[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 4/7] libxl: add infrastructure to track and query 'recent' domids



> -----Original Message-----
> From: Ian Jackson <ian.jackson@xxxxxxxxxx>
> Sent: 17 February 2020 17:43
> To: Durrant, Paul <pdurrant@xxxxxxxxxxxx>
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Wei Liu <wl@xxxxxxx>; Anthony Perard
> <anthony.perard@xxxxxxxxxx>
> Subject: Re: [PATCH v5 4/7] libxl: add infrastructure to track and query
> 'recent' domids
> 
> Paul Durrant writes ("[PATCH v5 4/7] libxl: add infrastructure to track
> and query 'recent' domids"):
> > A domid is considered recent if the domain it represents was destroyed
> > less than a specified number of seconds ago. For debugging and/or
> testing
> > purposes the number can be set using the environment variable
> > LIBXL_DOMID_REUSE_TIMEOUT. If the variable does not exist then a default
> > value of 60s is used.
> >
> > Whenever a domain is destroyed, a time-stamped record will be written
> into
> > a history file (/var/run/xen/domid-history). To avoid the history file
> > growing too large, any records with time-stamps that indicate that the
> > age of a domid has exceeded the re-use timeout will also be purged.
> >
> > A new utility function, libxl__is_recent_domid(), has been added. This
> > function reads the same history file checking whether a specified domid
> > has a record that does not exceed the re-use timeout. Since this utility
> > function does not write to the file, no records are actually purged by
> it.
> 
> Thanks for this.  Sorry for the delay in reviewing it.
> 
> I'm afraid I still have some comments about error handling etc.
> 
> > +int libxl_clear_domid_history(libxl_ctx *ctx);
> 
> I think this needs a clear doc comment saying it is for use in host
> initialisation only.  If it is run with any domains running, or
> concurrent libxl processes, things may malfunction.

Ok. Not sure precisely what you mean by 'doc comment'... Do mean a comment in 
the header just above this declaration or elsewhere?

> 
> > +static bool libxl__read_recent(FILE *f, unsigned long *sec,
> > +                               unsigned int *domid)
> > +{
> > +    int n;
> > +
> > +    assert(f);
> > +
> > +    n = fscanf(f, "%lu %u", sec, domid);
> > +    if (n == EOF)
> > +        return false;
> 
> Missing error handling in case of read error.

'man fscanf' tells me:

"The  value EOF is returned if the end of input is reached before either the 
first suc‐
 cessful conversion or a matching failure occurs.  EOF is also returned if a 
read error
 occurs,  in  which case the error indicator for the stream (see ferror(3)) is 
set, and
 errno is set to indicate the error."

So EOF is set in all error cases. What am I missing?

> 
> > +    else if (n != 2) /* malformed entry */
> > +        *domid = INVALID_DOMID;
> 
> Both call sites for this function have open-coded checks for this
> return case, where they just go round again.  I think
> libxl__read_recent should handle this itself, factoring the common
> code into this function and avoiding that special case.

Ok. I thought it was more intuitive to have the function only ever read a 
single entry from the file, but I can easily add the retry loop if you prefer.

> 
> > +    return true;
> 
> I think this function should return an rc.  It could signal EOF by
> setting *domid to INVALID_DOMID maybe, and errors by returning
> ERROR_FAIL.

Ok. I thought it was slightly pointless to do that.

> 
> > +static bool libxl__write_recent(FILE *f, unsigned long sec,
> > +                                unsigned int domid)
> > +{
> > +    assert(f);
> 
> This is rather pointless.  Please drop it.
> 

If you think so, ok.

> > +    assert(libxl_domid_valid_guest(domid));
> 
> I doubt this is really needed but I don't mind it if you must.
> 
> > +    return fprintf(f, "%lu %u\n", sec, domid) > 0;
> 
> Wrong error handling.  This function should return rc.  fprintf
> doesn't return a boolean.

And nor does this code expect it to (since it tests for '> 0').

>  Something should log errno (with LOGE
> probably) if fprintf fails.

I can see you dislike boolean functions; I'll return an error as you desire.

> 
> > +static int libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid)
> > +{
> > +    long timeout = libxl__get_domid_reuse_timeout();
> > +    libxl__flock *lock;
> 
> Please initialise lock = NULL so that it is easy to see that the out
> block is correct.
> 
> (See tools/libxl/CODING_STYLE where this is discussed.)
> 

Ok. Xen style generally avoids initializers where not strictly necessary.

> > +    char *old, *new;
> > +    FILE *of = NULL, *nf = NULL;
> > +    struct timespec ts;
> > +    int rc = ERROR_FAIL;
> 
> Please do not set rc to ERROR_FAIL like this.  Leave it undefined.
> Set it on each exit path.  (If you are calling a function that returns
> an rc, you can put it in rc, and then test rc and goto out without
> assignment.)
> 
> (Again, see tools/libxl/CODING_STYLE where this is discussed.)
> 

Ok.

> > +    lock = libxl__lock_domid_history(gc);
> > +    if (!lock) {
> > +        LOGED(ERROR, domid, "failed to acquire lock");
> > +        goto out;
> > +    }
> > +
> > +    old = libxl__domid_history_path(gc, NULL);
> > +    of = fopen(old, "r");
> > +    if (!of && errno != ENOENT)
> > +        LOGED(WARN, domid, "failed to open '%s'", old);
> 
> This fopen code and its error handling is still duplicated between
> libxl__mark_domid_recent and libxl__is_domid_recent. 

That's not quite true. The error semantics are different; the former does not 
tolerate a failure to open the file, the latter does.

> I meant for you
> to factor it out.  Likewise the other duplicated code in these two
> functions.  I want there to be nothing duplicated that can be written
> once.

I'm not sure how you want me to combine them, given the differing semantics.

> 
> Also failure to open the file should be an error, resulting failure of
> this function and the whole surrounding operation, not simply produce
> a warning in some logfile where it will be ignored.

But that will cause a failure when trying to create the first domain after 
boot, since the file won't exist.

> 
> > +        while (libxl__read_recent(of, &sec, &val)) {
> > +            if (!libxl_domid_valid_guest(val))
> > +                continue; /* Ignore invalid entries */
> > +
> > +            if (ts.tv_sec - sec > timeout)
> > +                continue; /* Ignore expired entries */
> > +
> > +            if (!libxl__write_recent(nf, sec, val)) {
> > +                LOGED(ERROR, domid, "failed to write to '%s'", new);
> > +                goto out;
> > +            }
> > +        }
> > +        if (ferror(of)) {
> > +            LOGED(ERROR, domid, "failed to read from '%s'", old);
> > +            goto out;
> > +        }
> 
> Oh, wait, here is one of the missing pieces of error handling ?
> Please put it where it belongs, next to the corresponding call.
> 
> > +    if (of && fclose(of) == EOF) {
> > +        LOGED(ERROR, domid, "failed to close '%s'", old);
> 
> I don't see how of would be NULL here.
> 

It will be NULL if the file did not exist, which will be the case until the 
first domain destruction occurs.

  Paul

> Thanks,
> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.