Snapshot causes VM to become unresponsive.

nivek1612 · Aug 27, 2024

I am not exactly sure when this problem first started, but it used to be fine in the past.
I'm running Proxmox 8.2.4 and I'm seeing strange behaviour from a VM after a snapshot.

I have a Home Assistant instance in a VM that becomes very slow or unresponsive after a snapshot is taken of it.
Stopping and starting the VM returns it to normal working mode.

Are there any logs I can look at to find out what the problem is?

I asked on the Home Assistant forum and some other people have similar issues.

LnxBil · Aug 27, 2024

Did you create the snapshot with memory? If so, try without.
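
For anyone who prefers the command line: qm snapshot takes a --vmstate option that controls whether the guest's memory is saved. A minimal sketch, assuming VM ID 105 and a made-up snapshot name; the helper function is hypothetical and only prints the command so it can be reviewed before being run as root on the Proxmox host.

```shell
#!/bin/sh
# Hypothetical helper: prints the qm command for a snapshot without RAM.
# --vmstate 0 asks qm not to save the guest's memory state.
snapshot_without_ram_cmd() {
    vmid="$1"   # numeric VM ID (105 is only an example)
    name="$2"   # snapshot name (placeholder, pick your own)
    printf 'qm snapshot %s %s --vmstate 0\n' "$vmid" "$name"
}

# Print the command for review instead of executing it.
snapshot_without_ram_cmd 105 test_noram
```

Passing --vmstate 1 instead should give the with-RAM variant, which is handy for comparing the two cases on the same VM.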

nivek1612 · Aug 27, 2024

I did and will try without

nivek1612 · Aug 27, 2024

Taking the snapshot without RAM avoids the issue. The question is why it used to work OK.

lassiko · Sep 3, 2024

Same issue here. I'm running 8.2.4, but this was already happening before I updated to the current version.
I haven't tried a snapshot without memory yet, but even if that works, as @nivek1612 asks: what could be the root cause, given that it used to work fine? How can this be debugged? Thanks.

mdavis06 · Sep 11, 2024

Experiencing the exact same issue. Used to work perfectly (with RAM), but now it doesn't. Looking for a solution or reason.

fiona · Sep 11, 2024

Hi,
please share the VM configuration (qm config <ID>) and the output of pveversion -v. What physical CPU do you have? Is there anything in the system logs?

EDIT: What guest (and kernel) is running inside the VMs? What does the CPU usage look like?
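
The details requested above can be gathered in one go on the host. A rough sketch, not an official procedure: the helper name and the one-hour log window are assumptions, and lscpu and journalctl are used here only as common ways to read the CPU model and the system journal. The function prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: lists commands that collect the requested details.
diag_cmds() {
    vmid="$1"                  # VM ID to inspect
    since="${2:-1 hour ago}"   # journal window (assumed default)
    printf 'qm config %s\n' "$vmid"               # VM configuration
    printf 'pveversion -v\n'                      # installed package versions
    printf 'lscpu\n'                              # physical CPU model
    printf 'journalctl --since "%s"\n' "$since"   # recent system log entries
}

# Show what would be run for VM 105.
diag_cmds 105
```

On the Proxmox host, something like `diag_cmds 105 | sh > vm105-diag.txt 2>&1` (run as root) would execute the printed commands and save their output for posting.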

nivek1612 · Sep 11, 2024

root@pve2:~# qm config 105
agent: 1
bios: ovmf
boot: order=scsi0
cores: 2
description: <div align='center'><img src='https%3A//avatars.githubusercontent.com/u/13844975?s=200&v=4'/></a>%0A%0A%0A # Home Assistant VM
efidisk0: local-lvm:vm-105-disk-0,efitype=4m,size=4M
localtime: 1
memory: 4096
meta: creation-qemu=7.1.0,ctime=1675622700
name: haos9.5
net0: virtio=02:79:26:B8:85:35,bridge=vmbr2
onboot: 1
ostype: l26
parent: precloud2
scsi0: local-lvm:vm-105-disk-1,discard=on,size=32G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=9f8174c5-5c21-4b70-a5c8-d9877470963d
tablet: 0
tags:
usb0: host=10c4:ea60
vmgenid: d153564f-5434-4446-a861-6d880e44dcdc

proxmox-ve: 8.2.0 (running kernel: 6.8.12-1-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.12-1
proxmox-kernel-6.8.12-1-pve-signed: 6.8.12-1
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx9
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.2
libpve-guest-common-perl: 5.1.4
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.7-1
proxmox-backup-file-restore: 3.2.7-1
proxmox-firewall: 0.5.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.3
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.13-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.2-2
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.4
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1

Home Assistant is running in the VM; CPU usage is at 3%.

fiona · Sep 11, 2024

What physical CPU do you have? Does using CPU type host for the VM work around the issue?

nivek1612 · Sep 11, 2024

Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz

I will give the CPU type host a try later

EDIT: Sorry, where would I change the CPU type to host? I can't see it.

fiona · Sep 11, 2024

nivek1612 said:
EDIT: Sorry, where would I change the CPU type to host? I can't see it.

In the VM's Hardware panel, select Processors, use the Edit button and select Type.
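
The same change can also be made from the host shell with qm set and its --cpu option. A small sketch, assuming VM ID 105; the helper function is hypothetical and only prints the command. As far as I know the new CPU type only takes effect after the VM has been fully stopped and started again.

```shell
#!/bin/sh
# Hypothetical helper: prints the qm command that sets a VM's CPU type.
set_cpu_type_cmd() {
    vmid="$1"              # VM ID to modify
    cputype="${2:-host}"   # CPU type, defaulting to "host"
    printf 'qm set %s --cpu %s\n' "$vmid" "$cputype"
}

# Print the command for VM 105 with CPU type host.
set_cpu_type_cmd 105
```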

nivek1612 · Sep 11, 2024

fiona said:
In the VM's Hardware panel, select Processors, use the Edit button and select Type.

Got it. Will need to do it later as the VM is active and busy and I will need to stop it to change the processor.

nivek1612 · Sep 11, 2024

Changed the CPU type to host.
Performed a snapshot with RAM.
The VM was not responding and had to be restarted, so the problem remains even with the host CPU type.

fiona · Sep 11, 2024

Okay, thank you for testing! What does qm status <ID> --verbose show while the VM is unresponsive? Can you still ping it? See the display?

esjay90 · Sep 28, 2024

I am having the same issue. I took a snapshot with RAM, as I had already done several times before, and after the snapshot the VM was unresponsive. This is the output of qm status <ID> --verbose:

balloon: 4294967296
ballooninfo:
    actual: 4294967296
    free_mem: 363380736
    last_update: 1727510890
    major_page_faults: 11611
    max_mem: 4294967296
    mem_swapped_in: 0
    mem_swapped_out: 8192
    minor_page_faults: 1266619200
    total_mem: 4105109504
blockstat:
    efidisk0:
        account_failed: 1
        account_invalid: 1
        failed_flush_operations: 0
        failed_rd_operations: 0
        failed_unmap_operations: 0
        failed_wr_operations: 0
        failed_zone_append_operations: 0
        flush_operations: 0
        flush_total_time_ns: 0
        invalid_flush_operations: 0
        invalid_rd_operations: 0
        invalid_unmap_operations: 0
        invalid_wr_operations: 0
        invalid_zone_append_operations: 0
        rd_bytes: 0
        rd_merged: 0
        rd_operations: 0
        rd_total_time_ns: 0
        timed_stats:
        unmap_bytes: 0
        unmap_merged: 0
        unmap_operations: 0
        unmap_total_time_ns: 0
        wr_bytes: 0
        wr_highest_offset: 0
        wr_merged: 0
        wr_operations: 0
        wr_total_time_ns: 0
        zone_append_bytes: 0
        zone_append_merged: 0
        zone_append_operations: 0
        zone_append_total_time_ns: 0
    pflash0:
        account_failed: 1
        account_invalid: 1
        failed_flush_operations: 0
        failed_rd_operations: 0
        failed_unmap_operations: 0
        failed_wr_operations: 0
        failed_zone_append_operations: 0
        flush_operations: 0
        flush_total_time_ns: 0
        invalid_flush_operations: 0
        invalid_rd_operations: 0
        invalid_unmap_operations: 0
        invalid_wr_operations: 0
        invalid_zone_append_operations: 0
        rd_bytes: 0
        rd_merged: 0
        rd_operations: 0
        rd_total_time_ns: 0
        timed_stats:
        unmap_bytes: 0
        unmap_merged: 0
        unmap_operations: 0
        unmap_total_time_ns: 0
        wr_bytes: 0
        wr_highest_offset: 0
        wr_merged: 0
        wr_operations: 0
        wr_total_time_ns: 0
        zone_append_bytes: 0
        zone_append_merged: 0
        zone_append_operations: 0
        zone_append_total_time_ns: 0
    scsi0:
        account_failed: 1
        account_invalid: 1
        failed_flush_operations: 0
        failed_rd_operations: 0
        failed_unmap_operations: 0
        failed_wr_operations: 0
        failed_zone_append_operations: 0
        flush_operations: 2
        flush_total_time_ns: 58841
        idle_time_ns: 672845775
        invalid_flush_operations: 0
        invalid_rd_operations: 0
        invalid_unmap_operations: 0
        invalid_wr_operations: 0
        invalid_zone_append_operations: 0
        rd_bytes: 29419152384
        rd_merged: 0
        rd_operations: 1711100
        rd_total_time_ns: 805491294901
        timed_stats:
        unmap_bytes: 63356908544
        unmap_merged: 0
        unmap_operations: 79133
        unmap_total_time_ns: 35979561726
        wr_bytes: 187989206528
        wr_highest_offset: 33345769472
        wr_merged: 0
        wr_operations: 13719531
        wr_total_time_ns: 66137617974586
        zone_append_bytes: 0
        zone_append_merged: 0
        zone_append_operations: 0
        zone_append_total_time_ns: 0
cpus: 2
disk: 0
diskread: 29419152384
diskwrite: 187989206528
freemem: 363380736
maxdisk: 34359738368
maxmem: 4294967296
mem: 3741728768
name: homeassistantos
netin: 60019776376
netout: 3225792620
nics:
    tap103i0:
        netin: 60019776376
        netout: 3225792620
pid: 2132265
proxmox-support:
    backup-fleecing: 1
    backup-max-workers: 1
    pbs-dirty-bitmap: 1
    pbs-dirty-bitmap-migration: 1
    pbs-dirty-bitmap-savevm: 1
    pbs-library-version: 1.4.1 (UNKNOWN)
    pbs-masterkey: 1
    query-bitmap-info: 1
qmpstatus: running
running-machine: pc-i440fx-9.0+pve0
running-qemu: 9.0.0
status: running
tags: proxmox-helper-scripts
uptime: 4590524
vmid: 103
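
Most of this output is per-disk counters. A small filter can pull out the fields that describe the VM's overall state; note that both status and qmpstatus report running here even though the guest was unresponsive. This is only a sketch: the function name and the choice of fields are my own assumptions, not something the Proxmox tools provide.

```shell
#!/bin/sh
# Hypothetical filter: keep only the status fields of interest.
# Reads "key: value" lines on stdin; leading whitespace is ignored.
status_summary() {
    awk '
        { sub(/^[ \t]+/, "") }   # strip indentation, if any
        /^(status|qmpstatus|running-qemu|running-machine|uptime|pid):/ { print }
    '
}

# Example using a few lines copied from the output above.
status_summary <<'EOF'
pid: 2132265
qmpstatus: running
running-machine: pc-i440fx-9.0+pve0
running-qemu: 9.0.0
status: running
tags: proxmox-helper-scripts
uptime: 4590524
EOF
```

On the host it could be used as `qm status 103 --verbose | status_summary` while the VM is in the unresponsive state.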

Valombre · Sep 30, 2024

Hi,
I was searching the forum about this issue. I have the same problem with my latest snapshots of 2 VMs; on PVE 7.4 I never had these issues.
My host: Linux 6.8.12-2-pve, pve-manager/8.2.5
My guests: Debian 12 (bookworm) and Debian 11 (bullseye), both with the latest updated kernel
A snapshot (with memory, the default) makes the VM sluggish until it freezes, and it then needs a forced shutdown and reboot.
If you need logs or details, just ask and I will post them.

ManfredU · Sep 30, 2024

I'm having the same issue.

listhor · Sep 30, 2024

It was fine (snapshots with RAM) until 8.2.4. I see the same issue: https://forum.proxmox.com/threads/manual-snapshot-corrupts-vm.151082/

florism · Oct 7, 2024

Same problem here with a Home Assistant VM.

1wire · Oct 7, 2024

Hello everyone,
I have the same problem with Home Assistant in a VM.
After every second snapshot the VM stops responding.
Under normal conditions the VM answers pings in <10ms. After the first snapshot, the ping time goes from 30 to 28 to 26 and so on down to under 10, then restarts at 30 and counts down to 10 again, and so on. The VM still works normally but responds a little slowly.
After the second snapshot, pinging once per second, the ping time goes from 998 to 988, 978, 968, 956... down to under 10, then restarts at 998, 988, 978 and so on. In this state the VM does not respond, i.e. Home Assistant is no longer working.
Restarting the VM leaves it in the stopped state; I have to start the VM manually.
I have two notebooks, each running Proxmox with a Home Assistant VM, and both have this issue.

Notebook 1:
CPU(s): 4 x Intel(R) Core(TM) i5-3437U CPU @ 1.90GHz (1 Socket)
Kernel version: Linux 6.8.12-2-pve (2024-09-05T10:03Z)

Notebook 2:
CPU(s): 8 x Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz (1 Socket)
Kernel version: Linux 6.8.12-2-pve (2024-09-05T10:03Z)
