Corrupt VM disk after server restart

Staniesk
Sep 20, 2023
Hello, I would like to share my latest experience with ZFS on a 3-server cluster.

After creating the ZFS pool we started migrating the VMs' disks to the new mirror, and all the VMs kept working normally, but the next day we ran into a problem.
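
For reference, the general shape of the setup was along the following lines. This is only a sketch assuming a zfspool-type storage; the pool name "tank", the device paths and the VM ID are placeholders, not the exact values used:

# create a mirrored ZFS pool from two disks (example device paths)
zpool create tank mirror /dev/sdb /dev/sdc
# register it as a storage in Proxmox VE
pvesm add zfspool tank --pool tank
# move a VM disk onto the new storage (--delete removes the source copy after a successful move)
qm move-disk <vmid> virtio0 tank --delete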

One of the cluster nodes had to be restarted, and afterwards one of the VMs did not start; we found that its disk was corrupted, which prevented the VM from booting.
The same thing happened again on another server, corrupting that VM's disk as well.
Has anyone experienced this, and what could cause these problems?
 
Hi,
please share the output of pveversion -v and the configuration of the affected VMs (qm config <ID>). Are there any errors in the system logs/journal?
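
For example, something like this (replace 531 with the actual VM ID and adjust the time range to when the VM failed to start):

# installed package versions
pveversion -v
# configuration of the affected VM
qm config 531
# journal of the current boot, or a specific time window
journalctl -b
journalctl --since "2023-10-25 16:00" --until "2023-10-25 17:00"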
 
Hi Fiona, thanks for the reply

please see the requested output below:

Linux pve2 6.2.16-14-pve #1 SMP PREEMPT_DYNAMIC PMX 6.2.16-14 (2023-09-19T08:17Z) x86_64

root@pve2:~# qm config 531
agent: 1
balloon: 0
boot: order=ide2;virtio0
cores: 2
cpu: host
description: Template Linux AWS
ide2: none,media=cdrom
memory: 8192
name: dockerWazuh
net0: virtio=F6:73:A8:0A:2D:2A,bridge=vmbr0,tag=232
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-pci
smbios1: uuid=380a020a-1e40-4951-93f9-6a93d7b415e7
sockets: 2
virtio0: local-lvm:vm-531-disk-0,cache=none,size=30G
virtio1: local-lvm:vm-531-disk-1,size=60G
root@pve2:~# pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-14-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
proxmox-kernel-helper: 8.0.3
pve-kernel-5.15: 7.4-6
proxmox-kernel-6.2.16-14-pve: 6.2.16-14
proxmox-kernel-6.2: 6.2.16-14
proxmox-kernel-6.2.16-12-pve: 6.2.16-12
pve-kernel-5.15.116-1-pve: 5.15.116-1
pve-kernel-5.15.102-1-pve: 5.15.102-1
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.8
libpve-guest-common-perl: 5.0.4
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.2-1
proxmox-backup-file-restore: 3.0.2-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.3
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-2
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.0.2-6
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1
root@pve2:~#
 
The boot disk is virtio0, 30 GB (see the qm config 531 output above).

 
Oct 25 16:16:32 pve2 pveproxy[3266]: starting 1 worker(s)
Oct 25 16:16:32 pve2 pveproxy[3266]: worker 3102888 started
Oct 25 16:17:01 pve2 CRON[3104401]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 25 16:17:01 pve2 CRON[3104402]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 25 16:17:01 pve2 CRON[3104401]: pam_unix(cron:session): session closed for user root
Oct 25 16:23:25 pve2 pvedaemon[3050709]: <root@pam> starting task UPID:pve2:002FAD7D:04BFC1C4:65396B2D:qmstart:531:root@pam:
Oct 25 16:23:25 pve2 pvedaemon[3124605]: start VM 531: UPID:pve2:002FAD7D:04BFC1C4:65396B2D:qmstart:531:root@pam:
Oct 25 16:23:26 pve2 systemd[1]: Started 531.scope.
Oct 25 16:23:27 pve2 kernel: device tap531i0 entered promiscuous mode
Oct 25 16:23:27 pve2 kernel: vmbr0: port 7(tap531i0) entered blocking state
Oct 25 16:23:27 pve2 kernel: vmbr0: port 7(tap531i0) entered disabled state
Oct 25 16:23:27 pve2 kernel: vmbr0: port 7(tap531i0) entered blocking state
Oct 25 16:23:27 pve2 kernel: vmbr0: port 7(tap531i0) entered forwarding state
Oct 25 16:23:27 pve2 pvedaemon[3050709]: <root@pam> end task UPID:pve2:002FAD7D:04BFC1C4:65396B2D:qmstart:531:root@pam: OK
Oct 25 16:23:27 pve2 pmxcfs[3047]: [status] notice: received log
Oct 25 16:23:27 pve2 pmxcfs[3047]: [status] notice: received log
Oct 25 16:23:28 pve2 sshd[3124849]: Accepted publickey for root from 192.168.1.141 port 51072 ssh2: RSA SHA256:2RcsNth8+Rf45PBDCqPy0fFjfm7VqHTvPYua6+KPHWw
Oct 25 16:23:28 pve2 sshd[3124849]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Oct 25 16:23:28 pve2 systemd-logind[2644]: New session 329 of user root.
Oct 25 16:23:28 pve2 systemd[1]: Started session-329.scope - Session 329 of User root.
Oct 25 16:23:28 pve2 sshd[3124849]: pam_env(sshd:session): deprecated reading of user environment enabled
Oct 25 16:23:37 pve2 pmxcfs[3047]: [status] notice: received log
Oct 25 16:23:51 pve2 pvedaemon[3126020]: stop VM 531: UPID:pve2:002FB304:04BFCC10:65396B47:qmstop:531:root@pam:
Oct 25 16:23:51 pve2 pvedaemon[2692707]: <root@pam> starting task UPID:pve2:002FB304:04BFCC10:65396B47:qmstop:531:root@pam:
Oct 25 16:23:52 pve2 kernel: vmbr0: port 7(tap531i0) entered disabled state
Oct 25 16:23:52 pve2 qmeventd[2645]: read: Connection reset by peer
Oct 25 16:23:52 pve2 pvedaemon[2691516]: VM 531 qmp command failed - VM 531 qmp command 'query-proxmox-support' failed - client closed connection
Oct 25 16:23:52 pve2 sshd[3124849]: Received disconnect from 192.168.1.141 port 51072:11: disconnected by user
Oct 25 16:23:52 pve2 sshd[3124849]: Disconnected from user root 192.168.1.141 port 51072
Oct 25 16:23:52 pve2 systemd-logind[2644]: Session 329 logged out. Waiting for processes to exit.
Oct 25 16:23:52 pve2 sshd[3124849]: pam_unix(sshd:session): session closed for user root
Oct 25 16:23:52 pve2 systemd[1]: session-329.scope: Deactivated successfully.
Oct 25 16:23:52 pve2 pmxcfs[3047]: [status] notice: received log
Oct 25 16:23:52 pve2 systemd[1]: session-329.scope: Consumed 1.341s CPU time.
Oct 25 16:23:52 pve2 systemd-logind[2644]: Removed session 329.
Oct 25 16:23:52 pve2 systemd[1]: 531.scope: Deactivated successfully.
Oct 25 16:23:52 pve2 systemd[1]: 531.scope: Consumed 5.084s CPU time.
Oct 25 16:23:52 pve2 pvedaemon[2692707]: <root@pam> end task UPID:pve2:002FB304:04BFCC10:65396B47:qmstop:531:root@pam: OK
Oct 25 16:23:53 pve2 qmeventd[3126038]: Starting cleanup for 531
Oct 25 16:23:53 pve2 qmeventd[3126038]: Finished cleanup for 531
 
Important information: the two VMs whose disks were corrupted had been shut down with a hard stop.

Could this cause problems with ZFS?
 
A hard stop can always corrupt your guests' filesystems, no matter what storage you use beneath them, since all cached but not yet flushed writes are lost.
Make sure you don't set ZFS's "sync=disabled" and that you don't use QEMU's "unsafe" cache mode, so that at least sync writes won't be lost on a hard stop.
It also helps to use a journaling filesystem inside your VMs.
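
For example, you could check both settings like this (assuming the pool is called rpool and using VM 531 from above; adjust the names to your setup):

# ZFS: "sync" should be "standard" (the default) or "always", never "disabled"
zfs get sync rpool
# reset it if it was changed
zfs set sync=standard rpool
# Proxmox/QEMU: the VM's disk lines should not contain "cache=unsafe"
qm config 531 | grep -E 'virtio|scsi|ide|sata'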
 