[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 08/11] libxl: QEMU startup sync based on QMP



Anthony PERARD writes ("[PATCH v6 08/11] libxl: QEMU startup sync based on 
QMP"):
> This is only activated when dm_restrict=1, as explained in the previous
> patch "libxl_dm: Pre-open QMP socket for QEMU"
...
> Signed-off-by: Anthony PERARD <anthony.perard@xxxxxxxxxx>
> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Thanks.  I think I have spotted one DoS vulnerability (to qemu) and
one potential memory leak.

And some things which are anomalous but may or may not be bugs.

> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index b768d1b09f..de3862c839 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3898,6 +3898,7 @@ struct libxl__dm_spawn_state {
       /* filled in by user, must remain valid: */
       uint32_t guest_domid; /* domain being served */
>      libxl_domain_config *guest_config;
>      libxl__domain_build_state *build_state; /* relates to guest_domid */
>      libxl__dm_spawn_cb *callback;
> +    libxl__ev_qmp qmp;
>  };

I added a couple more lines of context.  Now we can see that you are
adding qmp in the wrong place.  The qmp is private to
libxl__spawn_*_dm, isn't it ?

This is the private field which can be handled in an idempotent way.
The other private field is `libxl__spawn_state spawn', which can't be
done that way because a spawn cannot be simply disposed.

I think you should introduce and call common functions dmss_init and
dmss_dispose for the use of libxl__spawn_local_dm and
libxl__spawn_stub_dm, and the ev_qmp_init should be done there.

As it is, you neither initialise nor dispose qmp in the case of
libxl__spawn_stub_dm.  That is perhaps correct now but it is a
latent bug if someone starts using qmp in the stub dm case.

> @@ -2343,6 +2346,8 @@ void libxl__spawn_local_dm(libxl__egc *egc, 
> libxl__dm_spawn_state *dmss)
>      const char *dm;
>      int dm_state_fd = -1;
>  
> +    libxl__ev_qmp_init(&dmss->qmp);
> +
>      if (libxl_defbool_val(b_info->device_model_stubdomain)) {
>          abort();
>      }
> @@ -2450,6 +2455,16 @@ retry_transaction:
>      spawn->failure_cb = device_model_startup_failed;
>      spawn->detached_cb = device_model_detached;
>  
> +    if (state->dm_monitor_fd >= 0) {
> +        /* There is a valid QMP socket available now,
> +         * use it to find out when QEMU is ready */
> +        dmss->qmp.callback = device_model_qmp_cb;
> +        dmss->qmp.domid = domid;
> +        dmss->qmp.fd = -1;
> +        rc = libxl__ev_qmp_send(gc, &dmss->qmp, "query-status", NULL);
> +        if (rc) goto out_close;
> +    }

The documentation does not make it clear whether libxl__ev_qmp_send
might make the callback synchronously.  I think if it does you are at
risk of calling libxl__spawn_initiate_failure when the spawn has not
been started yet.

>      rc = libxl__spawn_spawn(egc, spawn);
>      if (rc < 0)
>          goto out_close;
> @@ -2524,6 +2539,44 @@ static void device_model_detached(libxl__egc *egc,
>      device_model_spawn_outcome(egc, dmss, 0);
>  }
>  
> +static void device_model_qmp_cb(libxl__egc *egc, libxl__ev_qmp *ev,
> +                                const libxl__json_object *response,
> +                                int rc)
> +{
> +    EGC_GC;
> +    libxl__dm_spawn_state *dmss = CONTAINER_OF(ev, *dmss, qmp);
> +    const libxl__json_object *o;
> +    const char *status;
> +
> +    libxl__ev_qmp_dispose(gc, ev);

That surely doesn't want to be here.  It should be (and I think, is)
disposed in the general teardown.  If I am wrong about that then I
have misunderstood the control flow, and the control flow may be
wrong.

> +    o = libxl__json_map_get("status", response, JSON_STRING);
> +    if (!o) {
> +        LOGD(ERROR, ev->domid,
> +             "Missing \"status\" in response to \"query-status\"");

If you used ` ' or ' ' here you could do away with the \.

> +        LOGD(DEBUG, ev->domid, ".. instead, got: %s",
> +             libxl__json_object_to_json(gc, response));

The doc comments for libxl__json_object_to_json don't say whether it
can fail.  So I assume it can, in which case you will pass NULL to %s
which is (sadly) nowadays illegal (although in practice probably
safe).

> +    status = libxl__json_object_get_string(o);
> +    if (strcmp(status, "running")) {

I think status can be NULL if o is not a string, and this is therefore
a DoS vulnerability against libxl exploitable by a depriv qemu.

> @@ -2547,6 +2600,8 @@ static void device_model_spawn_outcome(libxl__egc *egc,
>          }
>      }
>  
> +    libxl__ev_qmp_dispose(gc, &dmss->qmp);
> +
>   out:
>      dmss->callback(egc, dmss, rc);

Why is the dispose before out ?  I think this may be a memory leak (or
worse), perhaps exploitable somehow by qemu.

>  _hidden void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state*);
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index fec42b260c..a0912718e0 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -75,6 +75,7 @@ libxl_error = Enumeration("error", [
>      (-29, "QMP_COMMAND_NOT_FOUND"), # the requested command has not been 
> found
>      (-30, "QMP_DEVICE_NOT_ACTIVE"), # a device has failed to be become active
>      (-31, "QMP_DEVICE_NOT_FOUND"), # the requested device has not been found
> +    (-32, "QEMU_API"),

Can we at least have a descriptive comment for this error code ?


Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.