[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 08/11] libxl: QEMU startup sync based on QMP



On Fri, Nov 16, 2018 at 12:14:43PM +0000, Ian Jackson wrote:
> Anthony PERARD writes ("[PATCH v6 08/11] libxl: QEMU startup sync based on 
> QMP"):
> > This is only activated when dm_restrict=1, as explained in the previous
> > patch "libxl_dm: Pre-open QMP socket for QEMU"
> ...
> > Signed-off-by: Anthony PERARD <anthony.perard@xxxxxxxxxx>
> > Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> Thanks.  I think I have spotted one DoS vulnerability (to qemu) and
> one potential memory leak.
> 
> And some things which are anomalous but may or may not be bugs.
> 
> > diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> > index b768d1b09f..de3862c839 100644
> > --- a/tools/libxl/libxl_internal.h
> > +++ b/tools/libxl/libxl_internal.h
> > @@ -3898,6 +3898,7 @@ struct libxl__dm_spawn_state {
>        /* filled in by user, must remain valid: */
>        uint32_t guest_domid; /* domain being served */
> >      libxl_domain_config *guest_config;
> >      libxl__domain_build_state *build_state; /* relates to guest_domid */
> >      libxl__dm_spawn_cb *callback;
> > +    libxl__ev_qmp qmp;
> >  };
> 
> I added a couple more lines of context.  Now we can see that you are
> adding qmp in the wrong place.  The qmp is private to
> libxl__spawn_*_dm, isn't it ?

Yes, I think it is.

> This is the private field which can be handled in an idempotent way.
> The other private field is `libxl__spawn_state spawn', which can't be
> done that way because a spawn cannot be simply disposed.
> 
> I think you should introduce and call common functions dmss_init and
> dmss_dispose for the use of libxl__spawn_local_dm and
> libxl__spawn_stub_dm, and the ev_qmp_init should be done there.

Will do. There seems to be libxl__spawn_qdisk_backend that would need
dmss_init as well.

> As it is, you neither initialise nor dispose qmp in the case of
> libxl__spawn_stub_dm.  That is perhaps correct now but it is a
> latent bug if someone starts using qmp in the stub dm case.
> 
> > @@ -2343,6 +2346,8 @@ void libxl__spawn_local_dm(libxl__egc *egc, 
> > libxl__dm_spawn_state *dmss)
> >      const char *dm;
> >      int dm_state_fd = -1;
> >  
> > +    libxl__ev_qmp_init(&dmss->qmp);
> > +
> >      if (libxl_defbool_val(b_info->device_model_stubdomain)) {
> >          abort();
> >      }
> > @@ -2450,6 +2455,16 @@ retry_transaction:
> >      spawn->failure_cb = device_model_startup_failed;
> >      spawn->detached_cb = device_model_detached;
> >  
> > +    if (state->dm_monitor_fd >= 0) {
> > +        /* There is a valid QMP socket available now,
> > +         * use it to find out when QEMU is ready */
> > +        dmss->qmp.callback = device_model_qmp_cb;
> > +        dmss->qmp.domid = domid;
> > +        dmss->qmp.fd = -1;
> > +        rc = libxl__ev_qmp_send(gc, &dmss->qmp, "query-status", NULL);
> > +        if (rc) goto out_close;
> > +    }
> 
> The documentation does not make it clear whether libxl__ev_qmp_send
> might make the callback synchronously.  I think if it does you are at
> risk of calling libxl__spawn_initiate_failure when the spawn has not
> been started yet.

I'll fix the documentation to tell that libxl__ev_qmp_send will not call
the callback synchronously.

> >      rc = libxl__spawn_spawn(egc, spawn);
> >      if (rc < 0)
> >          goto out_close;
> > @@ -2524,6 +2539,44 @@ static void device_model_detached(libxl__egc *egc,
> >      device_model_spawn_outcome(egc, dmss, 0);
> >  }
> >  
> > +static void device_model_qmp_cb(libxl__egc *egc, libxl__ev_qmp *ev,
> > +                                const libxl__json_object *response,
> > +                                int rc)
> > +{
> > +    EGC_GC;
> > +    libxl__dm_spawn_state *dmss = CONTAINER_OF(ev, *dmss, qmp);
> > +    const libxl__json_object *o;
> > +    const char *status;
> > +
> > +    libxl__ev_qmp_dispose(gc, ev);
> 
> That surely doesn't want to be here.  It should be (and I think, is)
> disposed in the general teardown.  If I am wrong about that then I
> have misunderstood the control flow, and the control flow may be
> wrong.

That is documented in libxl__ev_qmp as to why _dispose is called here:

    Only one connection at a time can be made to one QEMU, so avoid
    keeping a libxl__ev_qmp Connected for to long and call
    libxl__ev_qmp_dispose as soon as it is not needed anymore.


> > +        LOGD(DEBUG, ev->domid, ".. instead, got: %s",
> > +             libxl__json_object_to_json(gc, response));
> 
> The doc comments for libxl__json_object_to_json don't say whether it
> can fail.  So I assume it can, in which case you will pass NULL to %s
> which is (sadly) nowadays illegal (although in practice probably
> safe).

I wounder what to do for this.
Maybe invent a JSON macro which would be:
    JSON(o) (libxl__json_object_to_json(gc, (o)) : ? "\"null\"")
    ("null" would actually be valid json)
Or do it without the macro, but there are plenty of other caller's of
libxl__json_object_to_json in libxl__ev_qmp implementation.

> > +    status = libxl__json_object_get_string(o);
> > +    if (strcmp(status, "running")) {
> 
> I think status can be NULL if o is not a string, and this is therefore
> a DoS vulnerability against libxl exploitable by a depriv qemu.

`o` is a string, libxl__json_map_get(,,JSON_STRING) calls makes sure of
that. Then `status` can't be NULL.

> > @@ -2547,6 +2600,8 @@ static void device_model_spawn_outcome(libxl__egc 
> > *egc,
> >          }
> >      }
> >  
> > +    libxl__ev_qmp_dispose(gc, &dmss->qmp);
> > +
> >   out:
> >      dmss->callback(egc, dmss, rc);
> 
> Why is the dispose before out ?  I think this may be a memory leak (or
> worse), perhaps exploitable somehow by qemu.

It's probably a mistake.

> >  _hidden void libxl__spawn_local_dm(libxl__egc *egc, 
> > libxl__dm_spawn_state*);
> > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> > index fec42b260c..a0912718e0 100644
> > --- a/tools/libxl/libxl_types.idl
> > +++ b/tools/libxl/libxl_types.idl
> > @@ -75,6 +75,7 @@ libxl_error = Enumeration("error", [
> >      (-29, "QMP_COMMAND_NOT_FOUND"), # the requested command has not been 
> > found
> >      (-30, "QMP_DEVICE_NOT_ACTIVE"), # a device has failed to be become 
> > active
> >      (-31, "QMP_DEVICE_NOT_FOUND"), # the requested device has not been 
> > found
> > +    (-32, "QEMU_API"),
> 
> Can we at least have a descriptive comment for this error code ?

What about:
  QEMU's replies don't contains expected members

Thanks,

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.