VM stuck on "Starting" - cannot remove, cannot put VM on another node

Binksy

Member
Feb 18, 2023
Hi there

Running a cluster.

I have a VM that I tried to start, and the GUI just shows it as "Starting" with the spinner, without making any progress. It appears that I cannot restart any VM on this node at all without it getting stuck on "Starting".

The task output shows no content.

I have tried killing the PID, the QEMU slice, a QEMU unlock, etc., all to no avail. I am unable to take this server offline at the moment, so any tips and tricks would be great.

pveversion -v

Code:
proxmox-ve: 7.3-1 (running kernel: 5.15.102-1-pve)
pve-manager: 7.3-6 (running version: 7.3-6/723bb6ec)
pve-kernel-helper: 7.3-8
pve-kernel-5.15: 7.3-3
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph: 17.2.5-pve1
ceph-fuse: 17.2.5-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-2
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-6
libpve-storage-perl: 7.3-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.3.3-1
proxmox-backup-file-restore: 2.3.3-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.5
pve-cluster: 7.3-2
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20221111-1
pve-firewall: 4.2-7
pve-firmware: 3.6-4
pve-ha-manager: 3.5.1
pve-i18n: 2.8-3
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1

pvestatd service status

Code:
 pvestatd.service - PVE Status Daemon
     Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2023-08-07 17:04:25 EDT; 2 months 17 days ago
    Process: 2190 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
   Main PID: 2644 (pvestatd)
      Tasks: 1 (limit: 154453)
     Memory: 126.9M
        CPU: 2d 1h 4min 59.828s
     CGroup: /system.slice/pvestatd.service
             └─2644 pvestatd

Oct 24 11:34:11 pve pvestatd[2644]: status update time (6.174 seconds)
Oct 24 11:34:21 pve pvestatd[2644]: status update time (5.545 seconds)
Oct 24 11:34:30 pve pvestatd[2644]: status update time (5.107 seconds)
Oct 24 16:19:55 pve pvestatd[2644]: status update time (9.876 seconds)
Oct 24 17:34:10 pve pvestatd[2644]: status update time (5.536 seconds)
Oct 24 17:34:21 pve pvestatd[2644]: status update time (5.634 seconds)
Oct 24 17:34:31 pve pvestatd[2644]: status update time (5.716 seconds)
Oct 24 17:34:43 pve pvestatd[2644]: status update time (7.254 seconds)
Oct 24 17:34:50 pve pvestatd[2644]: status update time (5.346 seconds)
Oct 24 20:36:32 pve pvestatd[2644]: status update time (6.225 seconds)

As I can't take this node offline at the moment, I tried to back up the VM so I could move it to a different node, but I can't get a recent backup because the backup job will not run either.

Any tips or tricks would be appreciated.

Cheers
 
Hi,
please check the system logs/journal for any additional info/warnings/errors. I'd also check if there is a hanging network storage mount or other hanging processes.
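
The checks above could look roughly like the sketch below. It assumes a standard systemd-based PVE host with network storage mounted as NFS or CIFS; the helper names `filter_d_state` and `probe_mounts` are made up for illustration, and the 5-second timeout is an arbitrary choice. Nothing here is taken from the affected node.

```shell
# Keep the header line plus any process whose STAT column starts with D
# (uninterruptible sleep, typical for a task blocked on hung storage I/O).
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Probe each mount point read from stdin and report those that fail or do
# not answer within the timeout in seconds (first argument, default 5).
probe_mounts() {
    while IFS= read -r mnt; do
        timeout -k 1 "${1:-5}" stat -t -- "$mnt" >/dev/null 2>&1 \
            || echo "no response or error: $mnt"
    done
}

# Usage on the affected node (as root):
#   journalctl -b -p warning --since "1 hour ago"
#   ps -eo pid,stat,wchan:32,args | filter_d_state
#   findmnt -rn -t nfs,nfs4,cifs -o TARGET | probe_mounts 5
```

Processes listed in state D, or a mount that does not answer, would point to hanging storage rather than a problem with the VM itself. Adjust the filesystem types passed to `findmnt` to match the storage actually in use.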

Code:
proxmox-ve: 7.3-1 (running kernel: 5.15.102-1-pve)
In any case, upgrading to the latest 7.4 release is recommended, or even to the current 8.0 release: https://pve.proxmox.com/wiki/Upgrade_from_7_to_8
 
Hi there,

A reboot fixed it. This server is being decommissioned, and I am in the process of doing that now, so there is no need to upgrade.

Cheers