Re: [BUG] Linux pvh vm not getting destroyed on shutdown

On Sat, Feb 13, 2021 at 04:36:24PM +0100, Maximilian Engelhardt wrote:
> after a recent upgrade of one of our test systems to Debian Bullseye we
> noticed an issue where on shutdown of a pvh vm the vm was not destroyed by
> xen automatically. It could still be destroyed by manually issuing a
> 'xl destroy $vm' command.

Usually I would expect such an issue to show on the Debian bug database
before xen-devel.  In particular as this is a behavior change with
security updates, there is a good chance this isn't attributable to the
Xen Project.  Additionally the Xen Project's support window is rather
narrow.  I've been observing the same (or similar) issue for a bit too.

> Here are some things I noticed while trying to debug this issue:
> * It happens on a Debian buster dom0 as well as on a bullseye dom0

I stick with stable on non-development machines, so I can't comment on
this.

> * It seems to only affect pvh vms.

I've observed it with pv and hvm VMs as well.

> * shutdown from the pvgrub menu ("c" -> "halt") does work

Woah!  That is quite the observation.  Since I had a handy opportunity
I tried this and this reproduces for me.

> * the vm seems to shut down normal, the last lines in the console are:

I agree with this.  Everything appears typical until the last moment.

> * issuing a reboot instead of a shutdown does work fine.

I disagree with this.  I'm seeing the issue occur with restart attempts.

> * The issue started with Debian kernel 5.8.3+1~exp1 running in the vm, Debian 
> kernel 5.7.17-1 does not show the issue.

I think the first kernel update during which I saw the issue was around
linux-image-4.19.0-12-amd64 or linux-image-4.19.0-13-amd64.  I think
the last security update to the Xen packages was in a similar timeframe,
though, so rate this portion as unreliable.  I can definitely state
this occurs with Debian's linux-image-4.19.0-13-amd64 and kernels built
from corresponding source; it may have shown up earlier.

> * setting vcpus equal to maxvcpus does *not* show the hang.

I haven't tried things related to this, so I can't comment on this.

Fresh observation.  During a similar timeframe I started noticing VM
creation leaving a `xl create` process behind.  I had discovered this
process could be freely killed without appearing to affect the VM and had
thus been doing so (memory in a lean Dom0 is precious).

While typing this I realized there was another scenario I needed to try.
Turns out that if I boot PV GRUB and get to its command-line (press 'c'),
then get away from the VM console, kill the `xl create` process, return
to the console and type "halt", the result is a hung VM.

Are you perhaps either killing the `xl create` process for affected VMs,
or migrating the VM and thus splitting the `xl create` process from the
affected VMs?
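
For checking this on an affected Dom0, a minimal sketch follows.  It
filters `ps -eo pid,args` output for processes whose command line is
`xl create ...`.  The helper names (`find_xl_create`, `list_xl_create`)
are made up for illustration, and how the leftover process appears in
the process list on a given system is an assumption; adjust the match if
yours looks different.

```python
import subprocess


def find_xl_create(ps_lines):
    """Return (pid, args) pairs for lines that look like `xl create ...`.

    ps_lines: lines of `ps -eo pid,args` output (header line allowed).
    """
    found = []
    for line in ps_lines:
        parts = line.split(None, 1)
        # Skip the header and anything without a numeric PID column.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, args = int(parts[0]), parts[1].strip()
        argv = args.split()
        # Accept both "xl create" and a full path such as "/usr/sbin/xl create".
        if (len(argv) >= 2
                and argv[0].rsplit("/", 1)[-1] == "xl"
                and argv[1] == "create"):
            found.append((pid, args))
    return found


def list_xl_create():
    """Run ps in Dom0 and return the surviving `xl create` processes."""
    out = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_xl_create(out.splitlines())
```

Calling `list_xl_create()` in Dom0 before shutting a VM down shows
whether an `xl create` process is still present.  If none is listed for
a running VM, that would match the scenario described above.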

This seems more a Debian issue than a Xen Project issue right now.

(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sigmsg@xxxxxxx  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445


