[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 3/6] libxl: add infrastructure to track and query 'retired' domids



> -----Original Message-----
> From: Ian Jackson <ian.jackson@xxxxxxxxxx>
> Sent: 16 January 2020 19:28
> To: Durrant, Paul <pdurrant@xxxxxxxxxxxx>
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Wei Liu <wl@xxxxxxx>; Anthony Perard
> <anthony.perard@xxxxxxxxxx>
> Subject: Re: [PATCH v3 3/6] libxl: add infrastructure to track and query
> 'retired' domids
> 
> Thanks.  I think this is the algorithm as we discussed, thanks.
> I have some comments about the implementation...
> 
> Paul Durrant writes ("[PATCH v3 3/6] libxl: add infrastructure to track
> and query 'retired' domids"):
> > A domid is considered retired if the domain it represents was destroyed
> > less than a specified number of seconds ago. The number can be set using
> > the environment variable LIBXL_DOMID_MAX_RETIREMENT. If the variable
> does
> > not exist then a default value of 60s is used.
> ...
> 
> I'm afraid I think your update protocol for this file is wrong.  In
> general it is a bad idea to try to write over a file in-place.  Doing
> so is full of gotchas.  (In this particular case for example I think
> an interrupted attempt at cleaning the file can produce a corrupted
> file containing nonsense.)
> 
> Can we please use the standard write-to-new-file-and-rename ?

Ok, fair enough. I'd not really considered interruption as being too much of a 
risk but I guess I should.

> Ie, to launder
>     flock(open("domid-history.lock"))
>     fopen("domid-history","r")
>     fopen("domid-history.new","w")
>     fgets/fputs
>     fclose
>     rename
>     close
> 
> (And no uses of ftell, fopen(,"r+"), etc.)
> 
> Reading can be done without taking the lock, if you so fancy.
> 
> I think there are a lot of missing error checks in this patch, but
> since I'm asking for a different approach I won't point them out
> individually.
> 

Ok.

> > +    fd = open(name, O_RDWR|O_CREAT, 0644);
> > +    if (fd < 0) {
> > +        LOGE(ERROR, "unexpected error while trying open %s, errno=%d",
> > +             name, errno);
> > +        goto fail;
> > +    }
> > +
> > +    for (;;) {
> > +        ret = flock(fd, LOCK_EX);
> 
> I looked for a utility function to do this but didn't find one.
> I think this is complicated because it needs to be a `carefd' in libxl
> terms because of concurrent forking by other threads in the
> application.
> 
> I suggest generalising libxl__lock_domain_userdata, which has all the
> necessary code (and which also would permit removing the file in the
> future).
> 
> I feel responsible for this inconvenience.  If this is too tiresome
> for you, I could do that part for you...
> 

That's ok; I'll insert a preceding generalization patch, unless it turns into a 
total can of worms... which I doubt it will.

> > +/* Write a domid retirement record */
> > +static void libxl__retire_domid(libxl__gc *gc, uint32_t domid)
> > +{
> ...
> > +    while (fgets(line, sizeof(line), f)) {
> > +        unsigned long sec;
> > +        unsigned int ignored;
> > +
> > +        roff = ftell(f);
> > +
> > +        if (sscanf(line, "%lu %u", &sec, &ignored) != 2)
> > +            continue; /* Purge malformed lines */
> 
> I'm not sure why you bother with fgets into a buffer, when you could
> just use fscanf rather than sscanf.  Your code doesn't need to take
> much care about weird syntax which might occur (and indeed your code
> here doesn't take such care).

Well, I need to pull the line into a buffer if I'm going to write it out again, 
but otherwise I could indeed use fscanf().

> 
> > @@ -1331,6 +1462,7 @@ static void devices_destroy_cb(libxl__egc *egc,
> >          if (!ctx->xch) goto badchild;
> >
> >          if (!dis->soft_reset) {
> > +            libxl__retire_domid(gc, domid);
> 
> I wonder if a better term than "retired" would be possible.  I
> initially found this patch a bit confusing because I thought a retired
> domid would be one which had *not* been used recently.  Maybe
> "recent", "mark recent", etc. ?  Ultimately this is a bikeshed issue
> which I will leave this up to you, though.
> 

Ok, 'recent' is probably clearer. I'll s/retired/recent/g.

> 
> I don't much like the environment variable to configure this.  I don't
> object to keeping it but can we have a comment saying this is not
> intended for use in production ?  Personally I would rather it was
> hardcoded, or failing that, written to some config file.
> 

The problem is that libxl has no config file. Env variables seem to be used for 
other things so I followed suit. I'd rather keep the override for debug 
purposes; I'll stick a comment in the header saying that's what it's for 
though, as you suggest.

> 
> Finally, I think this patch needs an addition to xen-init-dom0 to
> remove or empty the record file.  This is because while /run is
> usually a tmpfs, this is not *necessarily* true.
> 

Ok, if we cannot rely on it being tmpfs then I will do that.

  Paul



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.