Snapshot causes VM to become unresponsive.

nivek1612

Aug 25, 2024
I'm not sure exactly when this problem started, but snapshots used to work fine in the past.
I'm running Proxmox 8.2.4 and I'm seeing strange behaviour from a VM after taking a snapshot.

I have an instance of Home Assistant in a VM that becomes very slow or unresponsive after a snapshot of it is taken.
Stopping and starting the VM returns it to normal.

Are there any logs I can look at to find out what the problem is?

I asked on the Home Assistant forum, and some other people have similar issues.
 
Same issue here. I'm running 8.2.4, but this was already happening before I updated to the current version.
I haven't tried a snapshot without memory yet, but even if that works, as @nivek1612 is asking: what could be the root cause, given that it used to work fine? How can this issue be debugged? Thanks.
 
Experiencing the exact same issue. Used to work perfectly (with RAM), but now it doesn't. Looking for a solution or reason.
 
Hi,
please share the VM configuration (qm config <ID>) and the output of pveversion -v. What physical CPU do you have? Is there anything in the system logs?

EDIT: What guest (and kernel) is running inside the VMs? What does the CPU usage look like?
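
The information requested above could be gathered in one go with a small shell sketch like the following. `collect_diag` and `is_vmid` are hypothetical helper names, not Proxmox tools, and the one-hour journal window is an arbitrary assumption. The commands themselves (`qm config`, `pveversion -v`, `journalctl`) need to be run as root on the Proxmox host.

```shell
#!/bin/sh
# Return success only if the argument looks like a VMID (digits only).
is_vmid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Print the VM config, package versions, CPU model and recent host journal.
collect_diag() {
    vmid="$1"
    is_vmid "$vmid" || { echo "usage: collect_diag <VMID>" >&2; return 2; }
    echo "== qm config $vmid =="
    qm config "$vmid"
    echo "== pveversion -v =="
    pveversion -v
    echo "== CPU model =="
    grep -m1 'model name' /proc/cpuinfo
    echo "== host journal, last hour (assumed window) =="
    journalctl --since "1 hour ago" --no-pager
}
```

For example, `collect_diag 105 > diag-105.txt` would write everything to one file that can be attached to a post.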
 
root@pve2:~# qm config 105
agent: 1
bios: ovmf
boot: order=scsi0
cores: 2
description: <div align='center'><img src='https%3A//avatars.githubusercontent.com/u/13844975?s=200&v=4'/></a>%0A%0A%0A # Home Assistant VM
efidisk0: local-lvm:vm-105-disk-0,efitype=4m,size=4M
localtime: 1
memory: 4096
meta: creation-qemu=7.1.0,ctime=1675622700
name: haos9.5
net0: virtio=02:79:26:B8:85:35,bridge=vmbr2
onboot: 1
ostype: l26
parent: precloud2
scsi0: local-lvm:vm-105-disk-1,discard=on,size=32G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=9f8174c5-5c21-4b70-a5c8-d9877470963d
tablet: 0
tags:
usb0: host=10c4:ea60
vmgenid: d153564f-5434-4446-a861-6d880e44dcdc


proxmox-ve: 8.2.0 (running kernel: 6.8.12-1-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.12-1
proxmox-kernel-6.8.12-1-pve-signed: 6.8.12-1
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx9
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.2
libpve-guest-common-perl: 5.1.4
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.7-1
proxmox-backup-file-restore: 3.2.7-1
proxmox-firewall: 0.5.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.3
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.13-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.2-2
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.4
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1

The VM is running Home Assistant. CPU usage is at 3%.
 
What physical CPU do you have? Does using CPU type host for the VM work around the issue?
 
Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz

I will give CPU type host a try later.

EDIT: Sorry, where would I change the CPU type to host? I can't see it.
 
> EDIT: Sorry, where would I change the CPU type to host? I can't see it.
In the VM's Hardware panel, select Processors, use the Edit button and select Type.
 
Changed the CPU type to host.
Performed a snapshot with RAM.
The VM stopped responding and had to be restarted, so the problem remains even with CPU type host.
 
Okay, thank you for testing! What does qm status <ID> --verbose show while the VM is unresponsive? Can you still ping it? See the display?
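
A rough way to capture these details while the VM is hung, as a sketch only: the VMID (105) and guest address (192.168.1.50) are placeholders, the output file names are arbitrary, and `extract_rtt` is a hypothetical helper that pulls the round-trip times out of `ping` output so that any pattern in the latency is easier to spot.

```shell
#!/bin/sh
# Print only the round-trip times (in ms) from ping output read on stdin.
extract_rtt() {
    sed -n 's/.*time=\([0-9.]*\) *ms.*/\1/p'
}

# Placeholders: replace with the real VMID and guest IP address.
VMID="${VMID:-105}"
GUEST_IP="${GUEST_IP:-192.168.1.50}"

# Run on the Proxmox host while the VM is unresponsive.
# The guard skips the capture on machines without the qm tool.
if command -v qm >/dev/null 2>&1; then
    qm status "$VMID" --verbose > "qm-status-$VMID.txt"
    ping -c 20 "$GUEST_IP" | extract_rtt > "ping-rtt-$VMID.txt"
fi
```

Lines without a reply (timeouts) produce no output, so a short or empty RTT file is itself a hint that the guest has stopped answering.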
 
I am having the same issue. I took a snapshot with RAM, as I had already done several times before, and after the snapshot the VM was unresponsive. This is the output of qm status <ID> --verbose:

balloon: 4294967296
ballooninfo:
    actual: 4294967296
    free_mem: 363380736
    last_update: 1727510890
    major_page_faults: 11611
    max_mem: 4294967296
    mem_swapped_in: 0
    mem_swapped_out: 8192
    minor_page_faults: 1266619200
    total_mem: 4105109504
blockstat:
    efidisk0:
        account_failed: 1
        account_invalid: 1
        failed_flush_operations: 0
        failed_rd_operations: 0
        failed_unmap_operations: 0
        failed_wr_operations: 0
        failed_zone_append_operations: 0
        flush_operations: 0
        flush_total_time_ns: 0
        invalid_flush_operations: 0
        invalid_rd_operations: 0
        invalid_unmap_operations: 0
        invalid_wr_operations: 0
        invalid_zone_append_operations: 0
        rd_bytes: 0
        rd_merged: 0
        rd_operations: 0
        rd_total_time_ns: 0
        timed_stats:
        unmap_bytes: 0
        unmap_merged: 0
        unmap_operations: 0
        unmap_total_time_ns: 0
        wr_bytes: 0
        wr_highest_offset: 0
        wr_merged: 0
        wr_operations: 0
        wr_total_time_ns: 0
        zone_append_bytes: 0
        zone_append_merged: 0
        zone_append_operations: 0
        zone_append_total_time_ns: 0
    pflash0:
        account_failed: 1
        account_invalid: 1
        failed_flush_operations: 0
        failed_rd_operations: 0
        failed_unmap_operations: 0
        failed_wr_operations: 0
        failed_zone_append_operations: 0
        flush_operations: 0
        flush_total_time_ns: 0
        invalid_flush_operations: 0
        invalid_rd_operations: 0
        invalid_unmap_operations: 0
        invalid_wr_operations: 0
        invalid_zone_append_operations: 0
        rd_bytes: 0
        rd_merged: 0
        rd_operations: 0
        rd_total_time_ns: 0
        timed_stats:
        unmap_bytes: 0
        unmap_merged: 0
        unmap_operations: 0
        unmap_total_time_ns: 0
        wr_bytes: 0
        wr_highest_offset: 0
        wr_merged: 0
        wr_operations: 0
        wr_total_time_ns: 0
        zone_append_bytes: 0
        zone_append_merged: 0
        zone_append_operations: 0
        zone_append_total_time_ns: 0
    scsi0:
        account_failed: 1
        account_invalid: 1
        failed_flush_operations: 0
        failed_rd_operations: 0
        failed_unmap_operations: 0
        failed_wr_operations: 0
        failed_zone_append_operations: 0
        flush_operations: 2
        flush_total_time_ns: 58841
        idle_time_ns: 672845775
        invalid_flush_operations: 0
        invalid_rd_operations: 0
        invalid_unmap_operations: 0
        invalid_wr_operations: 0
        invalid_zone_append_operations: 0
        rd_bytes: 29419152384
        rd_merged: 0
        rd_operations: 1711100
        rd_total_time_ns: 805491294901
        timed_stats:
        unmap_bytes: 63356908544
        unmap_merged: 0
        unmap_operations: 79133
        unmap_total_time_ns: 35979561726
        wr_bytes: 187989206528
        wr_highest_offset: 33345769472
        wr_merged: 0
        wr_operations: 13719531
        wr_total_time_ns: 66137617974586
        zone_append_bytes: 0
        zone_append_merged: 0
        zone_append_operations: 0
        zone_append_total_time_ns: 0
cpus: 2
disk: 0
diskread: 29419152384
diskwrite: 187989206528
freemem: 363380736
maxdisk: 34359738368
maxmem: 4294967296
mem: 3741728768
name: homeassistantos
netin: 60019776376
netout: 3225792620
nics:
    tap103i0:
        netin: 60019776376
        netout: 3225792620
pid: 2132265
proxmox-support:
    backup-fleecing: 1
    backup-max-workers: 1
    pbs-dirty-bitmap: 1
    pbs-dirty-bitmap-migration: 1
    pbs-dirty-bitmap-savevm: 1
    pbs-library-version: 1.4.1 (UNKNOWN)
    pbs-masterkey: 1
    query-bitmap-info: 1
qmpstatus: running
running-machine: pc-i440fx-9.0+pve0
running-qemu: 9.0.0
status: running
tags: proxmox-helper-scripts
uptime: 4590524
vmid: 103
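
As a rough aid for reading the blockstat counters above: dividing a device's total time by its operation count gives an average latency per operation. The sketch below does that for the scsi0 read and write counters from this output; `avg_ms` is a hypothetical helper. The counters appear to be cumulative over the VM's whole uptime, so this average does not isolate the period after the snapshot. Comparing two samples, one taken before and one after, would be needed for that.

```shell
#!/bin/sh
# Average time per operation in milliseconds: total_ns / operations / 1e6.
avg_ms() {
    awk -v total_ns="$1" -v ops="$2" 'BEGIN {
        if (ops > 0) {
            printf "%.2f\n", total_ns / ops / 1000000
        } else {
            print "n/a"
        }
    }'
}

# scsi0 values copied from the qm status output above.
echo "scsi0 read:  $(avg_ms 805491294901 1711100) ms"
echo "scsi0 write: $(avg_ms 66137617974586 13719531) ms"
```

With the values above this works out to roughly 0.47 ms per read and 4.82 ms per write, averaged over the entire uptime.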
 
Hi,
I was searching the forum about this issue. I have the same problem with my latest snapshots of 2 VMs; on PVE 7.4 I never had these issues.
My host: Linux 6.8.12-2-pve, pve-manager/8.2.5
My guests: Debian 12 (bookworm) and Debian 11 (bullseye), both with the latest kernel updates
Snapshot (with memory, the default): the VM becomes sluggish until it freezes, and needs a forced shutdown and reboot.
If you need logs or details, just ask and I will post them.
 
Hello everyone,
I have the same problem with Home Assistant in a VM.
After every second snapshot, the VM stops responding.
Under normal conditions the VM answers pings in <10ms. After the first snapshot, the ping time steps down from 30 to 28 to 26 and so on until it is under 10, then jumps back to 30 and steps down to 10 again, and so on. The VM still works normally, but responds a little slowly.
After the second snapshot, the ping time steps down every second from 998 to 988, 978, 968, 956 ... until it is under 10, then restarts at 998, 988, 978 and so on. In this state the VM does not respond, i.e. Home Assistant is no longer working.
Restarting the VM leaves it in a stopped state; I have to start the VM manually.
I have two notebooks, each running Proxmox with a Home Assistant VM, and both have this issue.

Notebook 1:
CPU(s): 4 x Intel(R) Core(TM) i5-3437U CPU @ 1.90GHz (1 Socket)
Kernel version: Linux 6.8.12-2-pve (2024-09-05T10:03Z)

Notebook 2:
CPU(s): 8 x Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz (1 Socket)
Kernel version: Linux 6.8.12-2-pve (2024-09-05T10:03Z)
 
