| 
    
 [Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 8/9] libxl: Kill QEMU by uid when possible
 On 11/23/18 5:15 PM, George Dunlap wrote:
> The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
> one specific domain ID.  This domain ID will probably eventually be
> used again.  It is therefore necessary to make absolutely sure that a
> rogue QEMU process cannot hang around after its domain has exited.
> 
> Killing QEMU by pid is insufficient in this situation, because QEMU
> may be able to fork() to escape killing.  It is surprisingly tricky to
> kill a process which can call fork() without races; the only reliable
> way is to use kill(-1) to kill all processes with a given uid.
> 
> We can use this method only when we're sure that there's only one QEMU
> instance per uid.  Add a dm_uid into the domain_build_state struct,
> and set it in libxl__domain_get_device_model_uid() when it's safe to
> kill by UID.  Store this in xenstore next to device-model-pid.
> 
> On domain destroy, check to see if device-model-uid is present in
> xenstore.  If so, fork off a reaper process, setuid to that uid, and
> do kill(-9) to kill all uids of that type.  Otherwise, carry on
> destroying by pid.
> 
> NOTE that this is not yet completely safe: with ruid == dm_uid, the
> device model may be able to kill(-9) the 'reaper' process before the
> reaper process can kill it.  Further patches will address this.
> 
> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxx>
Also...
> +    if (ret || !dm_uid_str) {
> +        /* No uid in xenstore; just kill the pid we have */
> +        LOGD(DEBUG, domid, "Didn't find dm UID; destroying by pid");
> +        
> +        rc = kill_device_model(gc,
> +                               
> GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
> +    
> +        libxl__qmp_cleanup(gc, domid);
> +
> +        ddms->callback(egc, ddms, rc);
> +        return;
[snip]
> +static void kill_device_model_uid_cb(libxl__egc *egc,
> +                                    libxl__ev_child *destroyer,
> +                                    pid_t pid, int status)
> +{
> +    libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, *ddms, 
> destroyer);
> +    STATE_AO_GC(ddms->ao);
> +    int rc;
> +
> +    if (status) {
> +        if (WIFEXITED(status) && WEXITSTATUS(status)<126) {
> +            LOGEVD(ERROR, WEXITSTATUS(status), ddms->domid,
> +                   "uid-kill failed");
> +        } else {
> +            libxl_report_child_exitstatus(CTX, XTL_ERROR,
> +                                          "async domain destroy", pid, 
> status);
> +        }
> +        rc = ERROR_FAIL;
> +        goto out;
> +    }
> +    rc = 0;
> +
> +out:
> +    libxl__qmp_cleanup(gc, ddms->domid);
Does libxl__qmp_cleanup() need to be called after the kill() happens?
If not, we could put this before the kill() and avoid having two call sites.
 -George
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
 
 
  | 
  
![]()  | 
            
         Lists.xenproject.org is hosted with RackSpace, monitoring our  |