Backing up - unable to log in

Nesin

New Member
Nov 1, 2022
Poland
Hi,
Every now and then I have a problem with the VM backup.
The backup works fine most of the time, but every so often it stalls at a random stage. Recently I couldn't log in to Proxmox at all, neither via the GUI nor the CLI. When I try to log in via SSH, I get the message: client_loop: send disconnect: Broken pipe
Could this be an HDD problem?

Code:
INFO: starting new backup job: vzdump --mailto example@domain.com --all 1 --compress zstd --quiet 1 --storage backup --prune-backups 'keep-daily=1,keep-last=4' --notes-template '{{vmid}} ({{guestname}})' --mode snapshot --mailnotification failure
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2022-11-01 02:00:04
INFO: status = running
INFO: VM Name: Machine
INFO: include disk 'scsi0' 'local-lvm:vm-100-disk-0' 16G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating vzdump archive '/mnt/pve/backup/dump/vzdump-qemu-100-2022_11_01-02_00_04.vma.zst'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '107dd980-8b17-4289-87b7-4c239db55a30'
INFO: resuming VM again
INFO:  13% (2.1 GiB of 16.0 GiB) in 3s, read: 715.3 MiB/s, write: 142.8 MiB/s
INFO:  15% (2.5 GiB of 16.0 GiB) in 6s, read: 124.5 MiB/s, write: 113.6 MiB/s
INFO:  17% (2.8 GiB of 16.0 GiB) in 9s, read: 126.9 MiB/s, write: 117.1 MiB/s
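To see where a stalled job stopped, the progress lines of the task log can be parsed. The following is only a rough sketch: the function names are made up for illustration, the regular expression is modelled on the three progress lines above, and the handling of durations longer than a few seconds (e.g. "1m 3s") is an assumption about how vzdump prints them.

```python
import re

# Modelled on lines such as:
# INFO:  13% (2.1 GiB of 16.0 GiB) in 3s, read: 715.3 MiB/s, write: 142.8 MiB/s
PROGRESS_RE = re.compile(
    r"INFO:\s+(\d+)% \(.*?\) in ([^,]+), "
    r"read: ([\d.]+) ([^\s,]+), write: ([\d.]+) ([^\s,]+)"
)


def elapsed_seconds(text):
    """Convert a duration such as '3s' or '1m 3s' to seconds (format assumed)."""
    factors = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * factors[u] for n, u in re.findall(r"(\d+)\s*([hms])", text))


def parse_progress(line):
    """Return a dict for a progress line, or None for any other log line."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    percent, elapsed, read, read_unit, write, write_unit = match.groups()
    return {
        "percent": int(percent),
        "elapsed_s": elapsed_seconds(elapsed),
        "read": float(read),
        "read_unit": read_unit,
        "write": float(write),
        "write_unit": write_unit,
    }


def last_progress(log_text):
    """Return the last progress entry found in a task log, or None."""
    entries = [p for p in map(parse_progress, log_text.splitlines()) if p]
    return entries[-1] if entries else None
```

Calling `last_progress()` on the saved task log shows the percentage and elapsed time of the last line written before the job stopped making progress.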
 
Hi,
how does the node's load look during backup? If there is massive IO wait, please have a look here. Or is a network storage involved and do you use the same network for SSH? How exactly does the backup break, what is the error message? Please also post the output of pveversion -v.
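To put a number on the IO wait while the backup runs, something like the following could be used. It is a minimal sketch, assuming a Linux host where `/proc/stat` is readable. The function names are invented for this example, and the result is the plain kernel iowait share, which should be close to, but is not guaranteed to be identical with, the "IO delay" value shown in the GUI.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two /proc/stat snapshots."""
    old = parse_cpu_line(before)
    new = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(old, new)]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # Guest time is already included in user/nice, so only the first 8 are summed.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return the iowait share."""
    with open(path) as handle:
        before = handle.read()
    time.sleep(interval)
    with open(path) as handle:
        after = handle.read()
    return iowait_percent(before, after)
```

Running `print(sample_iowait())` in a loop during the backup window gives a rough picture of how long the node sits waiting on the disk.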
 
IO delay peaked at 35% for a while.
Yes, I connect via SSH over the same network.
The backup simply stops at an arbitrary percentage of progress, with no further message. If I then open the node (pve) in the GUI, the Disks view will not load.

After the node has been in this state for a long time, I can no longer log in via the GUI or SSH at all. The only thing I can do is power the machine off and on again. Then everything returns to normal.

In the event history, the only status description is: "stopped: unexpected status"
The task ran for 13h 20m 36s. A backup normally takes 2m 8s.
Code:
proxmox-ve: 7.2-1 (running kernel: 5.15.64-1-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-5.15: 7.2-13
pve-kernel-helper: 7.2-13
pve-kernel-5.15.64-1-pve: 5.15.64-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-3
libpve-guest-common-perl: 4.1-4
libpve-http-server-perl: 4.1-4
libpve-storage-perl: 7.2-10
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.7-1
proxmox-backup-file-restore: 2.2.7-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-3
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-6
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.6-pve1
 
The server runs from an M.2 SSD and has a second HDD for backups.
I shut down the server and took out the backup HDD. I ran some tests on it and unfortunately it turned out that the disk is damaged.
For now, I have installed a second disk in its place.
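For anyone who wants to check a backup disk before it fails like this: the smartmontools package is already installed on the node (see the pveversion output), so the overall SMART verdict can be read with `smartctl -H`. Below is a minimal sketch of wrapping that call. The function names and the 60-second timeout are my own choices for illustration, and a passing verdict does not prove that a disk is healthy.

```python
import re
import subprocess

# Verdict lines printed by `smartctl -H`, depending on the device type.
HEALTH_PATTERNS = (
    r"SMART overall-health self-assessment test result:\s*(\S+)",  # ATA/SATA, NVMe
    r"SMART Health Status:\s*(\S+)",  # SCSI/SAS
)


def parse_smart_health(output):
    """Return the health verdict (e.g. 'PASSED') or None if no verdict line exists."""
    for pattern in HEALTH_PATTERNS:
        match = re.search(pattern, output)
        if match:
            return match.group(1)
    return None


def check_disk(device):
    """Run `smartctl -H` on `device` and return the parsed verdict.

    Needs root. The timeout is a precaution, since a failing disk may answer
    very slowly; subprocess.TimeoutExpired is raised if it is exceeded.
    """
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
        timeout=60,
    )
    return parse_smart_health(result.stdout)
```

Usage would be something like `check_disk("/dev/sdb")`, where `/dev/sdb` is only a placeholder for the backup disk. Anything other than `PASSED` or `OK`, including `None`, is a reason to look at the full `smartctl -a` output.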

So why didn't Proxmox abort the backup job once it hung because of the disk? And why did my entire server become unresponsive after a long time?