Proxmox Backups over NFS/SMB Hang VM/Server

Ryushin

Dec 5, 2023
New installation here. We are planning to move several of our Devuan-based libvirt/virt-manager hosts over to Proxmox 8.1.3. I've deployed the first machine and migrated a Windows Server 2022 guest to it. Snapshot backups hang the VM, and there is no way to abort or release the hang until I reboot the Proxmox server. Backing up to the local ZFS pool works fine; backing up over NFSv3, NFSv4, or SMB causes the hang.

I see it count all the way to 100%, but no data is being written over the share. I've tried with and without compression.

I've also tried setting up autofs to mount the NFSv4 export and backing up to that mountpoint.

I'm not sure what is needed at this point to diagnose this problem. It looks similar to this thread:
https://forum.proxmox.com/threads/backup-vzdump-fails-and-hangs-forever.75372/
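A quick way to rule out the share itself is a plain sequential write from the PVE host, outside of vzdump entirely (the path below is just this host's backup mountpoint):
Code:
# Write 4 GiB straight to the NFS/SMB mount, bypassing the page cache
dd if=/dev/zero of=/mnt/proxmox_backups/testfile bs=1M count=4096 oflag=direct status=progress
rm /mnt/proxmox_backups/testfile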

Code:
cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content vztmpl
        shared 0

zfspool: virtual_machines
        pool vmpool/virtual_machines
        content images,rootdir
        mountpoint /virtual_machines
        sparse 1

dir: virtual_images
        path /var/virtual_images
        content images
        prune-backups keep-all=1
        shared 0

dir: ISOs
        path /var/lib/vz/isos
        content iso
        prune-backups keep-all=1
        shared 0

dir: proxmox_backups
        path /mnt/proxmox_backups/gibb-golf-01
        content backup
        prune-backups keep-last=30
        shared 0
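One thing worth noting about the proxmox_backups entry above: it is a plain dir storage pointing at a manually managed mount, so if the share is ever not mounted, vzdump will quietly write to the local root filesystem instead. PVE's is_mountpoint option (used on the local2 storage later in this thread) makes the storage go offline in that case; a sketch, assuming the NFS export is mounted at /mnt/proxmox_backups:
Code:
dir: proxmox_backups
        path /mnt/proxmox_backups/gibb-golf-01
        content backup
        prune-backups keep-last=30
        is_mountpoint /mnt/proxmox_backups
        shared 0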


Code:
cat /etc/pve/qemu-server/1001.conf 
agent: 1,freeze-fs-on-backup=0,fstrim_cloned_disks=1
bios: ovmf
boot: order=scsi0;sata0
cores: 12
cpu: x86-64-v3
efidisk0: virtual_images:1001/vm-1001-disk-1.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
machine: pc-q35-5.1
memory: 65536
meta: creation-qemu=8.1.2,ctime=1700170601
name: Legion
net0: virtio=52:54:00:f3:76:34,bridge=vmbr2,mtu=1500,queues=6
numa: 0
onboot: 1
ostype: win10
parent: Before_Installing_RDS
sata0: none,media=cdrom
scsi0: virtual_machines:vm-1001-disk-0,cache=writeback,discard=on,iothread=1,size=200G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=b7d77d14-1d77-4952-8887-15af78ea715b
sockets: 1
vga: virtio,memory=512
vmgenid: 79ea8502-9fe2-43f8-9e04-2c5fbfc5a7ca

Code:
tail -s.1 -f /var/log/syslog | grep pvedaemon
2023-12-05T12:53:14.155825-08:00 gibb-golf-01 pvedaemon[4192]: <root@pam> starting task UPID:gibb-golf-01:00001EDE:00007C34:656F8DBA:vzdump:1001:root@pam:
2023-12-05T12:53:14.175628-08:00 gibb-golf-01 pvedaemon[7902]: INFO: starting new backup job: vzdump 1001 --storage proxmox_backups --notes-template '{{guestname}}' --compress 0 --remove 0 --notification-mode auto --mode snapshot --node gibb-golf-01
2023-12-05T12:53:14.190753-08:00 gibb-golf-01 pvedaemon[7902]: INFO: Starting Backup of VM 1001 (qemu)
2023-12-05T12:54:20.939425-08:00 gibb-golf-01 pvedaemon[4190]: VM 1001 qmp command failed - VM 1001 qmp command 'query-proxmox-support' failed - got timeout
2023-12-05T12:54:46.435754-08:00 gibb-golf-01 pvedaemon[4192]: VM 1001 qmp command failed - VM 1001 qmp command 'query-proxmox-support' failed - unable to connect to VM 1001 qmp socket - timeout after 51 retries
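When it wedges like this, a useful first check is whether anything is stuck in uninterruptible sleep on NFS I/O; none of these commands are Proxmox-specific:
Code:
# List processes in uninterruptible sleep (state D), typically blocked on I/O
ps -eo pid,stat,wchan:30,comm | awk '$2 ~ /^D/'

# Look for kernel complaints about hung tasks or an unresponsive NFS server
dmesg | grep -iE 'hung task|not responding'

# See where a suspect process is blocked in the kernel (<pid> is a placeholder)
cat /proc/<pid>/stack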

Code:
INFO: starting new backup job: vzdump 1001 --storage proxmox_backups --notes-template '{{guestname}}' --compress 0 --remove 0 --notification-mode auto --mode snapshot --node gibb-golf-01
INFO: Starting Backup of VM 1001 (qemu)
INFO: Backup started at 2023-12-05 12:53:14
INFO: status = running
INFO: VM Name: Legion
INFO: include disk 'scsi0' 'virtual_machines:vm-1001-disk-0' 200G
INFO: include disk 'efidisk0' 'virtual_images:1001/vm-1001-disk-1.qcow2' 528K
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating vzdump archive '/mnt/proxmox_backups/gibb-golf-01/dump/vzdump-qemu-1001-2023_12_05-12_53_14.vma'
INFO: skipping guest-agent 'fs-freeze', disabled in VM options
INFO: started backup task '3d25fd9b-1774-4305-b097-9432679995e1'
INFO: resuming VM again
INFO:   3% (6.5 GiB of 200.0 GiB) in 3s, read: 2.2 GiB/s, write: 2.0 GiB/s
INFO:   5% (10.7 GiB of 200.0 GiB) in 6s, read: 1.4 GiB/s, write: 1.3 GiB/s
INFO:   8% (17.2 GiB of 200.0 GiB) in 9s, read: 2.2 GiB/s, write: 2.0 GiB/s
INFO:  10% (21.3 GiB of 200.0 GiB) in 12s, read: 1.4 GiB/s, write: 1.3 GiB/s
INFO:  13% (27.5 GiB of 200.0 GiB) in 15s, read: 2.1 GiB/s, write: 2.0 GiB/s
INFO:  18% (37.5 GiB of 200.0 GiB) in 18s, read: 3.3 GiB/s, write: 792.1 MiB/s
INFO:  24% (48.3 GiB of 200.0 GiB) in 21s, read: 3.6 GiB/s, write: 679.6 MiB/s
INFO:  26% (54.0 GiB of 200.0 GiB) in 24s, read: 1.9 GiB/s, write: 973.9 MiB/s
INFO:  30% (61.0 GiB of 200.0 GiB) in 27s, read: 2.4 GiB/s, write: 1.6 GiB/s
INFO:  37% (74.3 GiB of 200.0 GiB) in 30s, read: 4.4 GiB/s, write: 0 B/s
INFO:  43% (87.8 GiB of 200.0 GiB) in 33s, read: 4.5 GiB/s, write: 2.7 KiB/s
INFO:  50% (101.3 GiB of 200.0 GiB) in 36s, read: 4.5 GiB/s, write: 0 B/s
INFO:  57% (114.7 GiB of 200.0 GiB) in 39s, read: 4.5 GiB/s, write: 0 B/s
INFO:  64% (128.3 GiB of 200.0 GiB) in 42s, read: 4.5 GiB/s, write: 0 B/s
INFO:  70% (141.8 GiB of 200.0 GiB) in 45s, read: 4.5 GiB/s, write: 0 B/s
INFO:  77% (155.2 GiB of 200.0 GiB) in 48s, read: 4.5 GiB/s, write: 0 B/s
INFO:  84% (168.6 GiB of 200.0 GiB) in 51s, read: 4.5 GiB/s, write: 0 B/s
INFO:  91% (182.2 GiB of 200.0 GiB) in 54s, read: 4.5 GiB/s, write: 0 B/s
INFO:  98% (196.1 GiB of 200.0 GiB) in 57s, read: 4.7 GiB/s, write: 0 B/s
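Side note: rather than mounting the share by hand and pointing a dir storage at it, PVE can also manage the mount itself via an nfs storage entry, which makes it easy to pin mount options when comparing NFSv3 and NFSv4. An illustrative stanza (name, server, and export are hypothetical):
Code:
nfs: backup_nfs_test
        export /tank/proxmox_backups
        server 192.168.1.50
        path /mnt/pve/backup_nfs_test
        content backup
        options vers=4.2,noatime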
 
Is a change to PBS not an option for you? In my opinion, it is a better option than creating full backups every day. The only things to note are that the PBS should be equipped with SSDs and that its datastore should definitely not be on a network mount.

Is the guest agent up to date and running?
 
Yep, the guest agent is installed and running. It works fine backing up to local disk, just not to network file systems. We run BareOS for file-level backups and use ZFS send/receive via sanoid/syncoid for the rest. The vzdump backups are going to a VERY large ZFS store, and I don't think they are even needed, since sanoid/syncoid already runs on the Proxmox server, sending snapshots to another server on site and to a second server across the country. But for a neophyte, restoring through the web UI will be easier. It would be a nice feature for Proxmox to take ZFS snapshots at set intervals and send them to another server, instead of having to use vzdump.
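For anyone curious, the syncoid side of that is just a couple of cron jobs along these lines (dataset and target names are made up):
Code:
# Incremental zfs send/receive of the VM dataset to on-site and off-site targets
syncoid --recursive vmpool/virtual_machines backup@onsite-host:tank/gibb-golf-01
syncoid --recursive vmpool/virtual_machines backup@offsite-host:tank/gibb-golf-01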

Not really an option to add PBS.
 
I was able to get PBS installed on another Debian server that also runs BareOS for our backups. I kicked off a backup from PVE, and it hung the VM again; the only way to get the system back up was to reboot the hypervisor.

Again, backups work fine to local disk, just not to any kind of remote target: SMB, NFS, or now PBS. Is there anything else I should be looking at?
 
Here is the error using the PBS:

INFO: starting new backup job: vzdump 1001 --remove 0 --mode snapshot --storage proxmox_backup_server --mailto chris.dos@giantww.com --node gibb-golf-01 --notification-mode auto --notes-template '{{guestname}}'
INFO: Starting Backup of VM 1001 (qemu)
INFO: Backup started at 2024-01-11 10:13:45
INFO: status = running
INFO: VM Name: Legion
INFO: include disk 'scsi0' 'virtual_machines:vm-1001-disk-0' 200G
INFO: include disk 'efidisk0' 'virtual_images:1001/vm-1001-disk-1.qcow2' 528K
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/1001/2024-01-11T18:13:45Z'
INFO: skipping guest-agent 'fs-freeze', disabled in VM options
INFO: started backup task '1496038e-6599-4e4e-ab62-9a1bf54a029b'
INFO: resuming VM again
INFO: efidisk0: dirty-bitmap status: created new
INFO: scsi0: dirty-bitmap status: created new
INFO: 0% (44.0 MiB of 200.0 GiB) in 3s, read: 14.7 MiB/s, write: 14.7 MiB/s
INFO: 0% (44.0 MiB of 200.0 GiB) in 15m 47s, read: 0 B/s, write: 0 B/s
ERROR: backup write data failed: command error: protocol canceled
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 1001 failed - backup write data failed: command error: protocol canceled
INFO: Failed at 2024-01-11 10:29:33
INFO: Backup job finished with errors
INFO: notified via target `<redacted@redacted.com>`
TASK ERROR: job errors
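One way to isolate whether the PVE-to-PBS link itself is the problem is the built-in client benchmark, run from the PVE node (the repository string below is an example):
Code:
# Measure upload, TLS, and compression throughput from this node to a PBS datastore
proxmox-backup-client benchmark --repository root@pam@pbs-host:datastore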

I have community support with my license. Is this forum the community support?
 
Hi all

I am also experiencing this behavior while backing up my VM to my NFS storage. I shut down the VM and then attempted to start the backup. The backup job is still hung 10 minutes after this:

INFO: starting new backup job: vzdump 103 --mode snapshot --notification-mode auto --remove 0 --storage my-nfs --compress zstd --notes-template '{{guestname}}' --node pve-1
INFO: Starting Backup of VM 103 (qemu)
INFO: Backup started at 2025-07-15 11:04:21
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: MY-VM-NAME
INFO: include disk 'scsi0' 'local2:103/vm-103-disk-0.qcow2' 16G
INFO: creating vzdump archive '/mnt/pve/my-nfs/dump/vzdump-qemu-103-2025_07_15-11_04_21.vma.zst'
INFO: starting kvm to execute backup task
INFO: started backup task 'aa828b1e-cd8a-49f3-bbc3-279a1a00d6d3'
INFO: 20% (3.3 GiB of 16.0 GiB) in 3s, read: 1.1 GiB/s, write: 208.6 MiB/s
INFO: 32% (5.2 GiB of 16.0 GiB) in 6s, read: 664.2 MiB/s, write: 163.4 MiB/s
INFO: 49% (7.9 GiB of 16.0 GiB) in 9s, read: 904.6 MiB/s, write: 155.5 MiB/s
INFO: 65% (10.4 GiB of 16.0 GiB) in 12s, read: 868.4 MiB/s, write: 139.9 MiB/s
INFO: 79% (12.7 GiB of 16.0 GiB) in 15s, read: 769.6 MiB/s, write: 160.9 MiB/s
INFO: 99% (15.9 GiB of 16.0 GiB) in 18s, read: 1.1 GiB/s, write: 123.5 MiB/s
INFO: 100% (16.0 GiB of 16.0 GiB) in 19s, read: 122.2 MiB/s, write: 0 B/s
INFO: backup is sparse: 13.21 GiB (82%) total zero data
INFO: transferred 16.00 GiB in 19 seconds (862.3 MiB/s)
<things are hung, nothing is happening past this point>

Now there is a "disk" icon next to the VM. Hovering over it says "Status: prelaunch, Config locked (backup)".

When I click Stop in the Task Viewer for this backup job, it just says "Please Wait" for about 5 seconds, and then nothing happens.
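For reference, the usual escape hatch when a backup task wedges like this, assuming the vzdump worker really is dead or unkillable, looks something like:
Code:
# Find the vzdump worker and check its state (D = blocked in uninterruptible I/O)
ps -eo pid,stat,cmd | grep [v]zdump

# If the worker is gone but the VM stays locked, release the backup lock
qm unlock 103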

root@pve-1:/etc/pve# cat storage.cfg
dir: local
        path /var/lib/vz
        content backup,iso,vztmpl

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

dir: local2
        path /mnt/pve/local2
        content snippets,vztmpl,images,rootdir,iso,backup
        is_mountpoint 1
        nodes pvemf

nfs: my-nfs
        export /mnt/pool/data/files/pve
        path /mnt/pve/my-nfs
        server nfs-server.my.lan
        content images,backup,iso
        prune-backups keep-all=1

Any advice? TIA
 
I am also experiencing this behavior while backing up my VM to my NFS storage. I shut down the VM and then attempted to start the backup. The backup job is still hung 10 minutes after this:

INFO: starting new backup job: vzdump 103 --mode snapshot --notification-mode auto --remove 0 --storage nfs-sol --compress zstd --notes-template '{{guestname}}' --node pve-1
[...]

Any advice? TIA

Afraid I don't have any help for you, other than that this just started working again for me a couple of months ago when I tried it. Nothing was changed other than applying updates.
 
--storage nfs-sol
But in your storage.cfg I see no storage named nfs-sol:
Code:
root@pve-1:/etc/pve# cat storage.cfg
dir: local
        path /var/lib/vz
        content backup,iso,vztmpl

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

dir: local2
        path /mnt/pve/local2
        content snippets,vztmpl,images,rootdir,iso,backup
        is_mountpoint 1
        nodes pvemf

nfs: my-nfs
        export /mnt/pool/data/files/pve
        path /mnt/pve/my-nfs
        server nfs-server.my.lan
        content images,backup,iso
        prune-backups keep-all=1
The only NFS one appears to be my-nfs, so something is odd here.
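A quick way to confirm which storage names the node actually knows about is pvesm:
Code:
# List all configured storages with their type, status, and usage
pvesm status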
 
But in your storage.cfg I see no storage named nfs-sol. The only NFS one appears to be my-nfs, so something is odd here.
I changed it in the forum post to protect the names of the guilty, but I missed that one, lol. All of the names do truly match in the actual config.
 
I changed it in the forum post to protect the names of the guilty
Not sure why the name of the NFS storage should need to be redacted at all, but to each his own, I guess!

Anyway, back to your issue: have you tried configuring a local tmpdir in the /etc/vzdump.conf file? I refer you to a previous post of mine, which will also link you to the docs on the subject. Note that you will probably need enough space in that local tmpdir to temporarily hold the backup until it is transferred to the NFS target.
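For reference, that is a one-line change (the path is just an example; it must be on local disk with room for the largest backup):
Code:
# /etc/vzdump.conf
tmpdir: /var/tmp/vzdump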
 