[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device



On Wed, Jul 16, 2014 at 05:48:59PM +0100, Ian Campbell wrote:
> On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > We update JSON version first, then write to xenstore, so that we
> > maintain the following invariant: any device which is present in
> > xenstore has a corresponding entry in JSON.
> > 
> > The locking pattern is as followed:
> >  1. lock JSON domain config
> >  2. write JSON entry
> >  3. write xenstore entry for a device
> >  4. unlock JSON domain config
> > 
> > And we ensure in code that JSON entry is always written before the
> > xenstore entry is written. That is, if 2 fails, 3 will not be executed.
> > We also ensure that if 2 and 3 are invoked in a loop, the previous
> > added JSON entry is removed before writing new one.
> > 
> > The locking patten is designed like this, because when adding a disk,
> 
> "pattern". Apparently a patten is a type of clog or shoe l-)
> 
> > the caller might specify get_vdev which will get called in the middle
> > of a xenstore transaction to determine the identifier of a disk. Other
> > devices don't have this property, but I make them follow the same
> > pattern for consistency.
> 
> I don't know if it is helpful but AIUI the get_vdev there is purely to
> handle the case where the disk is being attached to the toolstack domain
> (e.g. for pygrub). I don't know if that simplifies anything for you.
> 

Probably not if get_vdev is set to be called in this loop and embedded
in a xenstore transaction.

For other device, determining devid only requires XBT_NULL, so that I
can safely move the call earlier. I'm not confident that I can do the
same for disk.

> > As we don't have a JSON config file for libxl toolstack domain
> > (currently Dom0) we need to skip JSON manipulation for it.
> 
> How hard would it be to create a stub/stunt JSON for dom0 when we notice
> it is missing? Or from e.g. xencommons perhaps?
> 

We need to determine:
1. when / where to generate such thing (xencommons?)
2. what to put in it

I have yet had answers for #2. The simplest version can be "{}" I think.
That is an empty configuration that every fields gets the default value.
But we probably need more than that.

> > Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>
> > @@ -1917,6 +1922,8 @@ void libxl__device_vtpm_add(libxl__egc *egc, uint32_t 
> > domid,
> >              goto out;
> >          }
> >      }
> > +    vtpm_saved.devid = vtpm->devid;
> 
> Why isn't this in the update_config fn?
> 

Sure, this can be done.

> > +    libxl__update_config_vtpm(gc, &vtpm_saved, vtpm);
> >  
> >      GCNEW(device);
> >      rc = libxl__device_from_vtpm(gc, domid, vtpm, device);
> > @@ -1943,17 +1950,30 @@ void libxl__device_vtpm_add(libxl__egc *egc, 
> > uint32_t domid,
> >      flexarray_append(front, "handle");
> >      flexarray_append(front, GCSPRINTF("%d", vtpm->devid));
> >  
> > +    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
> > +    if (rc) goto out;
> 
> I can't see anything in this macro (or the unlock one) which
> necessitates it being a macro rather than a helper function.
> 

Helper funtion it is.

> Couldn't you just do the LIBXL_TOOLSTACK_DOMID check in
> libxl__lock_domain_configuration?
> 

This can be done.

> > +
> > +    DEVICE_ADD_JSON(vtpm, vtpms, num_vtpms, domid, &vtpm_saved,
> 
> The second and third of these arguments could be derived from the first.
> 

Unfortunately not: PCI device doesn't follow this rule. Otherwise I
would have done it already.  :-(

> > @@ -2193,21 +2217,56 @@ static void device_disk_add(libxl__egc *egc, 
> > uint32_t domid,
> >          goto out;
> >      }
> >  
> > +    libxl_device_disk_copy(CTX, &disk_saved, disk);
> > +
> > +    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
> > +    if (rc) goto out;
> > +
> >      for (;;) {
> >          rc = libxl__xs_transaction_start(gc, &t);
> > -        if (rc) goto out;
> > +        if (rc) {
> > +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> > +            goto out;
> > +        }
> > +
> > +        /* This is a loop, if disk has been added to JSON before, we
> > +         * need to remove it, then add it later, because disk->vdev
> > +         * might change.
> 
> Has the old device been removed from xenstore at this point? Otherwise
> aren't you breaking your invariant that a device in xenstore is always
> described in the json.
> 

Not yet, we're here (again) because the transaction hasn't been
committed.

> > +         */
> > +        DEVICE_REMOVE_JSON(disk, disks, num_disks, domid, &disk_saved,
> > +                           COMPARE_DISK, rc);
> > +        if (rc) {
> > +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> > +            goto out;
> 
> I wonder if you can arrange the exit path such that this can happen
> there and save all of these?
> 

I will see what I can do.

> > +#define DEVICE_ADD_JSON(type, ptr, cnt, domid, dev, compare, rc)        \
> > +    do {                                                                \
> > +        int x;                                                          \
> > +        libxl_domain_config d_config;                                   \
> > +        libxl_device_##type *p;                                         \
> > +                                                                        \
> > +        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
> > +            break;                                                      \
> > +                                                                        \
> > +        libxl_domain_config_init(&d_config);                            \
> > +                                                                        \
> > +        rc = libxl__get_domain_configuration(gc, (domid), &d_config);   \
> > +        if (rc)                                                         \
> > +            goto add_done;                                              \
> > +                                                                        \
> > +        /* Check for duplicated device */                               \
> > +        for (x = 0; x < d_config.cnt; x++) {                            \
> > +            if (compare(&d_config.ptr[x], (dev))) {                     \
> 
> Compare just checks ->devid etc, right?
> 

It depends on specific device type. See those COMPARE_* macros.
Disk and PCI devices don't have devid.

> What if this device has the same devid but a different config?
> 
> That probably ought to be an error, have we already caught that or are
> we relying on something later?
> 

Then the caller should use DEVICE_UPDATE_JSON which is introduced in
later patch.

> I suppose either way we don't want to add the device again now.
> 

Yes.

> > +                rc = 0;                                                 \
> > +                goto add_done;                                          \
> > +            }                                                           \
> > +        }                                                               \
> > +                                                                        \
> > +        d_config.ptr =                                                  \
> > +            libxl__realloc(gc, d_config.ptr,                            \
> > +                           (d_config.cnt + 1) *                         \
> > +                           sizeof(libxl_device_##type));                \
> > +        p = &d_config.ptr[d_config.cnt];                                \
> > +        d_config.cnt++;                                                 \
> > +        libxl_device_##type##_copy(CTX, p, (dev));                      \
> > +                                                                        \
> > +        rc = libxl__set_domain_configuration(gc, (domid), &d_config);   \
> 
> How close to being possible is it to do this as a proper helper function
> which takes a size_t and a bunch of fn pointers with void where the type
> would be?
> 

We need to know the type of this structure otherwise we don't know what
*_copy function to call. Sadly there's no way to pass in type information
in C in runtime.

> > +                                                                        \
> > +    add_done:                                                           \
> 
> You are going to have multiple of these labels in this source file. Not
> sure quite what that means in C. The simple answer would be to involve
> ##type## in the name.
> 

That would break compilation I think. But I haven't seen it so far
because this macro only gets used once in those caller functions.

I will make this change as you suggest anyway.

> > +        libxl_domain_config_dispose(&d_config);                         \
> > +    } while (0)
> > +
> > +#define DEVICE_REMOVE_JSON(type, ptr, cnt, domid, dev, compare, rc)     \
> 
> Same comments as on add.
> 

Same reply as on add. :-)

> > +    do {                                                                \
> > +        int i, j;                                                       \
> > +        libxl_device_##type *p = dev;                                   \
> > +        libxl_domain_config d_config;                                   \
> > +        bool removed = false;                                           \
> > +                                                                        \
> > +        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
> > +            break;                                                      \
> > +                                                                        \
> > +        libxl_domain_config_init(&d_config);                            \
> > +                                                                        \
> > +        rc = libxl__get_domain_configuration(gc, (domid), &d_config);   \
> > +        if (rc)                                                         \
> > +            goto remove_done;                                           \
> > +                                                                        \
> > +        for (i = j = 0; i < d_config.cnt; i++) {                        \
> > +            if (!compare(&d_config.ptr[i], p)) {                        \
> > +                if (i != j) {                                           \
> > +                    libxl_device_##type##_dispose(&d_config.ptr[j]);    \
> > +                    d_config.ptr[j] = d_config.ptr[i];                  \
> > +                    removed = true;                                     \
> > +                }                                                       \
> > +                j++;                                                    \
> > +            }                                                           \
> > +        }                                                               \
> > +                                                                        \
> > +        if (!removed)  /* no matching entry found */                    \
> > +            goto remove_done;                                           \
> > +                                                                        \
> > +        d_config.ptr =                                                  \
> > +            libxl__realloc(NOGC, d_config.ptr,                          \
> > +                           j * sizeof(libxl_device_##type));            \
> > +        d_config.cnt = j;                                               \
> 
> The fact that the compare loop also shuffles everything down is worthy
> of noting in a comment I think.
> 

OK.

Wei.

> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.