Scenario:
Two clustered servers, with ZFS storage replicating between them.
Problem:
While taking a snapshot of VM 108, several VMs crashed, and I had to reset to recover.
Error log:
Jun 19 09:24:25 pve02 kernel: [10160562.899184] debugfs: Directory 'zd112' with parent 'block' already present!
Logs from that time:
Jun 19 09:18:42 pve02 pvedaemon[8291]: starting 1 worker(s)
Jun 19 09:18:42 pve02 pvedaemon[8291]: worker 1409810 started
Jun 19 09:21:26 pve02 pvedaemon[387952]: <root@pam> successful auth for user 'root@pam'
Jun 19 09:22:18 pve02 pvedaemon[1269805]: <root@pam> starting task UPID:pve02:00169815:3C8FA0B7:6490487A:vncproxy:108:root@pam:
Jun 19 09:22:18 pve02 pvedaemon[1480725]: starting vnc proxy UPID:pve02:00169815:3C8FA0B7:6490487A:vncproxy:108:root@pam:
Jun 19 09:24:12 pve02 pvedaemon[1269805]: <root@pam> end task UPID:pve02:00169815:3C8FA0B7:6490487A:vncproxy:108:root@pam: OK
Jun 19 09:24:14 pve02 pvedaemon[1515422]: starting vnc proxy UPID:pve02:00171F9E:3C8FCDE0:649048EE:vncproxy:108:root@pam:
Jun 19 09:24:14 pve02 pvedaemon[1269805]: <root@pam> starting task UPID:pve02:00171F9E:3C8FCDE0:649048EE:vncproxy:108:root@pam:
Jun 19 09:24:16 pve02 pveproxy[3771496]: worker exit
Jun 19 09:24:16 pve02 pveproxy[8300]: worker 3771496 finished
Jun 19 09:24:16 pve02 pveproxy[8300]: starting 1 worker(s)
Jun 19 09:24:16 pve02 pveproxy[8300]: worker 1515672 started
Jun 19 09:24:21 pve02 pvedaemon[1517070]: <root@pam> snapshot VM 108: snapshotmade4it
Jun 19 09:24:21 pve02 pvedaemon[387952]: <root@pam> starting task UPID:pve02:0017260E:3C8FD0B9:649048F5:qmsnapshot:108:root@pam:
Jun 19 09:24:25 pve02 kernel: [10160562.899184] debugfs: Directory 'zd112' with parent 'block' already present!
Jun 19 09:25:22 pve02 pveproxy[3847103]: worker exit
Jun 19 09:25:22 pve02 pveproxy[8300]: worker 3847103 finished
Jun 19 09:25:22 pve02 pveproxy[8300]: starting 1 worker(s)
Jun 19 09:25:22 pve02 pveproxy[8300]: worker 1621163 started
Jun 19 09:27:07 pve02 pvestatd[8238]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - got timeout
Jun 19 09:27:12 pve02 pvestatd[8238]: VM 115 qmp command failed - VM 115 qmp command 'query-proxmox-support' failed - got timeout
Jun 19 09:27:17 pve02 pvestatd[8238]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - got timeout
Jun 19 09:27:17 pve02 pvestatd[8238]: status update time (18.683 seconds)
Jun 19 09:27:28 pve02 pvestatd[8238]: VM 115 qmp command failed - VM 115 qmp command 'query-proxmox-support' failed - unable to connect to VM 115 qmp socket - timeout after 51 retries
Jun 19 09:27:34 pve02 pvestatd[8238]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - unable to connect to VM 104 qmp socket - timeout after 51 retries
Jun 19 09:27:39 pve02 pvestatd[8238]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Jun 19 09:27:39 pve02 pvestatd[8238]: status update time (21.753 seconds)
Jun 19 09:27:48 pve02 pvedaemon[1269805]: VM 108 qmp command failed - VM 108 qmp command 'query-proxmox-support' failed - unable to connect to VM 108 qmp socket - timeout after 51 retries
Jun 19 09:27:53 pve02 pvestatd[8238]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Jun 19 09:27:58 pve02 pvestatd[8238]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - unable to connect to VM 104 qmp socket - timeout after 51 retries
Jun 19 09:28:03 pve02 pvestatd[8238]: VM 115 qmp command failed - VM 115 qmp command 'query-proxmox-support' failed - unable to connect to VM 115 qmp socket - timeout after 51 retries
Jun 19 09:28:04 pve02 pvestatd[8238]: status update time (24.777 seconds)
Could taking a snapshot of a production VM have affected all the VMs on the same ZFS storage?