Cannot resume hibernated VM

carsten2

Well-Known Member
Mar 25, 2017
249
21
58
55
I suspended a VM and tried to resume it. This does not work and issues the following error:

Code:
Resuming suspended VM
activating and using 'vmdata1zfs:vm-113-state-suspend-2023-01-18' as vmstate
kvm: Missing section footer for 0000:00:01.3/piix4_pm
kvm: Error while loading VM state: Invalid argument
Resumed VM, removing state
TASK ERROR: unable to open file '/etc/pve/nodes/xxxxx/qemu-server/113.conf.tmp.375382' - Device or resource busy

A second attempt to resume just gives the error "already running".
Trying to stop the VM gives "VM is locked (suspended)".
Using "qm stop --skiplock" stopped the VM, but "qm resume" just hangs.
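For anyone hitting the same locked state: instead of "--skiplock", the lock can also be cleared explicitly with "qm unlock" before stopping. A sketch, assuming VMID 113 (run it on the node hosting the VM; the guard is only there so the snippet does nothing on a machine without Proxmox tools):

```shell
# Clear the 'suspended' lock from the VM config, then stop the VM.
# Guarded so the sketch is a no-op on machines without Proxmox installed.
if command -v qm >/dev/null 2>&1; then
    qm unlock 113
    qm stop 113
else
    echo "qm not available; run this on the Proxmox node"
fi
```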
 
Hi,
please post the output of pveversion -v and qm config 113. How long ago was the snapshot taken? (well the state file includes the date ;)) Can you check what QEMU version was installed at that time (or guess with what version the VM was running at that time)?
This does sound very much like https://gitlab.com/qemu-project/qemu/-/issues/932 but that bug should've been only present in some 7.0 release candidates, not in any released QEMU versions.

EDIT: please post cat /etc/pve/qemu-server/113.conf, to get the full snapshot config too.
 
pveversion -v
Code:
proxmox-ve: 7.3-1 (running kernel: 5.15.74-1-pve)
pve-manager: 7.3-4 (running version: 7.3-4/d69b70d4)
pve-kernel-5.15: 7.3-1
pve-kernel-helper: 7.3-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph-fuse: 14.2.21-1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.3
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.3.2-1
proxmox-backup-file-restore: 2.3.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.0-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-2
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
pve-zsync: 2.2.3
qemu-server: 7.3-2
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.7-pve2

qm config 113
Code:
audio0: device=ich9-intel-hda,driver=none
boot: cdn
bootdisk: scsi0
cores: 8
ide2: none,media=cdrom
memory: 6144
name: hal-surf
net0: virtio=92:0B:E1:32:8A:49,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
runningcpu: kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep
runningmachine: pc-i440fx-7.1+pve0
scsi0: vmdata1zfs:vm-113-disk-0,discard=on,size=100G
scsi1: vmdata1zfs:vm-113-disk-1,discard=on,size=4100G
scsihw: virtio-scsi-pci
smbios1: uuid=b11176f8-c056-4011-8cb7-0e4d591b4d62
sockets: 1

I discarded the suspended state and restarted the VM.
I tried to suspend and resume two more times. Always the same error.
 
I'm not able to reproduce the issue here. To get more information, you can add
Code:
args: -trace vmstate*,file=/tmp/hibernate-113.trace -trace savevm*,file=/tmp/hibernate-113.trace
to your VM config. Then start the VM and hibernate. If you can't upload the full log in /tmp/hibernate-113.trace somewhere, please provide the part starting from the line (section_id might differ):
Code:
savevm_section_start 0000:00:01.3/piix4_pm, section_id 54
to the line
Code:
savevm_section_end 0000:00:01.3/piix4_pm, section_id 54 -> 0

Then resume the VM and again, provide the log in /tmp/hibernate-113.trace, or at least the part from the line
Code:
vmstate_load 0000:00:01.3/piix4_pm, piix4_pm
to the line
Code:
vmstate_load_state_end piix4_pm end/0
 
I stopped the VM and restarted it, but two things remained:

1) An extra ZFS volume named vm-113-state-part
2) An extra section named "[part]" in the config file which references the ZFS volume in 1)

How to remove this?

Can I just delete the section "part" and remove the zfs volume?
 
That sounds like you have a snapshot named part? If you want to remove it, you can do it in the Snapshots tab of the VM in the UI or on the CLI with qm delsnapshot 113 part.
 
