[SOLVED] Backups are running slowly on the internal vmbr0 bridge

bluepr0

Well-Known Member
Hello everyone!

I'm running Proxmox 8.1.4 on my server (an i7-6700T with 48 GB of RAM and an Intel X520 dual SFP+ PCIe card). I've set up a daily backup task for all my VMs and LXCs. The task used to finish in just a few minutes when I was using the onboard LAN port. However, since switching to the SFP+ card (I think that's when it started, but I'm not sure), the backups run much more slowly, to the point where the numbers don't make much sense to me.

Here's the setup:
- A few LXCs for different services.
- A VM with Home Assistant.
- A VM with TrueNAS, with an NFS share that I mount directly on Proxmox to store the backups.

Transfers should be very fast since everything is happening on the same machine. I recall speeds of around 200 MiB/s when I checked the logs before; now it's around 19 MiB/s, and I'm not sure what's causing it. I have verified that both VMs are on the bridge, so the traffic should stay internal.
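
(For anyone who wants to double-check the same thing: bridge membership can be confirmed on the host with standard iproute2 and Proxmox CLI tooling. A rough sketch, with VMID 100 just as an example:)

Bash:
# list every interface enslaved to vmbr0 (physical port plus VM taps / CT veths)
ip -br link show master vmbr0
# confirm that a guest's NIC is actually configured on vmbr0
qm config 100 | grep ^net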

Here's my /etc/network/interfaces file:
Code:
auto lo
iface lo inet loopback

iface enp3s0f0 inet manual
        mtu 9000
#SFP+ 0

iface enp5s0 inet manual
#onboard LAN

iface enp3s0f1 inet manual
#SFP+ 1

auto vmbr0
iface vmbr0 inet static
        address 10.0.1.37/24
        gateway 10.0.1.1
        bridge-ports enp3s0f0
        bridge-stp off
        bridge-fd 0
        mtu 9000
#10GbE

Here is a log of the backup task at the current speed.
Code:
INFO:  47% (34.8 GiB of 74.0 GiB) in 32m 30s, read: 18.2 MiB/s, write: 17.9 MiB/s
INFO:  48% (35.5 GiB of 74.0 GiB) in 33m 10s, read: 18.3 MiB/s, write: 18.2 MiB/s
INFO:  49% (36.3 GiB of 74.0 GiB) in 33m 49s, read: 19.2 MiB/s, write: 18.8 MiB/s
INFO:  50% (37.0 GiB of 74.0 GiB) in 34m 27s, read: 20.1 MiB/s, write: 19.4 MiB/s
INFO:  51% (37.8 GiB of 74.0 GiB) in 35m 5s, read: 20.0 MiB/s, write: 19.6 MiB/s
INFO:  52% (38.5 GiB of 74.0 GiB) in 35m 46s, read: 18.7 MiB/s, write: 18.0 MiB/s
INFO:  53% (39.2 GiB of 74.0 GiB) in 36m 20s, read: 21.9 MiB/s, write: 20.3 MiB/s
INFO:  54% (40.0 GiB of 74.0 GiB) in 37m 5s, read: 16.8 MiB/s, write: 16.5 MiB/s
INFO:  55% (40.7 GiB of 74.0 GiB) in 37m 44s, read: 19.8 MiB/s, write: 18.8 MiB/s
 
Did you test it with MTU 1500, in case of packet fragmentation?
iperf can help with finding network bottlenecks, and fio with storage bottlenecks.
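
Something along these lines would be a starting point (a rough sketch; iperf3 has to be installed on both ends, and 10.0.1.x stands for whatever address the other endpoint has):

Bash:
# network: run a server on one end ...
iperf3 -s
# ... and push traffic to it from the other end for 30 seconds
iperf3 -c 10.0.1.x -t 30
# jumbo frames: check that a 9000-byte MTU actually survives the whole path
# (8972 = 9000 minus 28 bytes of IP/ICMP headers; -M do forbids fragmentation)
ping -M do -s 8972 -c 4 10.0.1.x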
 
Yes! That doesn't seem to be the problem, but I'm planning to reconnect the onboard LAN to see whether it's related.

With iperf I'm getting 9.6 Gbit/s, so the 10GbE link is working correctly, although it shouldn't matter anyway since this traffic is supposed to stay internal. The disks are all NVMe, so they shouldn't be a bottleneck either!
 
NVMe SSDs aren't necessarily fast. It's not uncommon for consumer NVMe SSDs with QLC NAND to drop to something like 40 MB/s once the SLC write cache fills up.
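
If you want to rule that out, a long sequential write well past the cache should show the drop over time. A rough sketch (adjust --size and the target path to your setup and make sure there is enough free space):

Bash:
# write well past any plausible SLC cache and log bandwidth once per second
fio --name=slc-test --filename=/var/lib/vz/slc-test.bin --ioengine=libaio --direct=1 \
    --rw=write --bs=1M --size=100G --write_bw_log=slc-test --log_avg_msec=1000
# clean up the test file afterwards
rm /var/lib/vz/slc-test.bin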
 
NVMe SSDs aren't necessarily fast. It's not uncommon for consumer NVMe SSDs with QLC NAND to drop to something like 40 MB/s once the SLC write cache fills up.
The thing is that it was working okay before with the same disks, so I'm assuming it's not a problem with the SSDs.

Will try with the onboard LAN and report back!
 
Alright! It's definitely not the network or some weird configuration. I tried the onboard LAN and got the same result.

I changed the backup job to write to the local Proxmox storage ("local" in the sidebar), and it still writes very slowly, at around 15-16 MiB/s.
Code:
INFO: starting new backup job: vzdump 100 101 104 105 --prune-backups 'keep-last=2' --all 0 --mode snapshot --storage local --notes-template '{{guestname}}' --node pve --mailto xxx@gmail.com --mailnotification always --compress gzip
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2024-03-21 20:39:22
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: ha
INFO: include disk 'scsi0' 'local-lvm:vm-100-disk-1' 74G
INFO: include disk 'efidisk0' 'local-lvm:vm-100-disk-0' 4M
INFO: creating vzdump archive '/var/lib/vz/dump/vzdump-qemu-100-2024_03_21-20_39_22.vma.gz'
INFO: starting kvm to execute backup task
INFO: started backup task '5ccf667f-dfe4-4879-bef4-875b14d772ef'
INFO:   0% (83.4 MiB of 74.0 GiB) in 3s, read: 27.8 MiB/s, write: 14.9 MiB/s
ERROR: interrupted by signal
INFO: aborting backup job
INFO: stopping kvm after backup task
trying to acquire lock...
 OK


When I run an fio test, I get around 2856 MiB/s. See below:

Bash:
root@pve:~# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4M --size=4G --readwrite=write --ramp_time=4
test: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=libaio, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=0)
test: (groupid=0, jobs=1): err= 0: pid=233118: Thu Mar 21 20:51:57 2024
  write: IOPS=714, BW=2856MiB/s (2995MB/s)(4096MiB/1434msec); 0 zone resets
  cpu          : usr=7.40%, sys=24.84%, ctx=1026, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1024,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=2856MiB/s (2995MB/s), 2856MiB/s-2856MiB/s (2995MB/s-2995MB/s), io=4096MiB (4295MB), run=1434-1434msec

Disk stats (read/write):
    dm-1: ios=0/2979, merge=0/0, ticks=0/2288, in_queue=2288, util=92.66%, aggrios=1/33793, aggrmerge=0/0, aggrticks=0/17169, aggrin_queue=17170, aggrutil=89.22%
  nvme0n1: ios=1/33793, merge=0/0, ticks=0/17169, in_queue=17170, util=89.22%
root@pve:~#

This doesn't even seem like a network issue anymore, so the thread should probably be moved to the main forum. @martin or another admin.
 
Alright, so this is weird. I deleted the backup task, recreated it, and now it's working at the speed it used to. The issue is fixed, but I'm not sure what happened or why it glitched.

Bash:
INFO: starting new backup job: vzdump 100 101 104 105 --notes-template '{{guestname}}' --storage DataSSD --node pve --all 0 --mode snapshot --compress zstd --mailnotification always
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2024-03-21 21:08:28
INFO: status = running
INFO: VM Name: ha
INFO: include disk 'scsi0' 'local-lvm:vm-100-disk-1' 74G
INFO: include disk 'efidisk0' 'local-lvm:vm-100-disk-0' 4M
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating vzdump archive '/mnt/pve/DataSSD/dump/vzdump-qemu-100-2024_03_21-21_08_28.vma.zst'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '676e5de6-4b85-46e0-bbe6-d387052eb27b'
INFO: resuming VM again
INFO:   1% (833.8 MiB of 74.0 GiB) in 3s, read: 277.9 MiB/s, write: 216.2 MiB/s
INFO:   2% (1.6 GiB of 74.0 GiB) in 7s, read: 190.3 MiB/s, write: 188.7 MiB/s
INFO:   3% (2.3 GiB of 74.0 GiB) in 11s, read: 193.9 MiB/s, write: 191.3 MiB/s
INFO:   4% (3.0 GiB of 74.0 GiB) in 15s, read: 186.7 MiB/s, write: 182.5 MiB/s
 
Hi,

your backup task command is now different:
Code:
vzdump 100 101 104 105 --prune-backups 'keep-last=2' --all 0 --mode snapshot --storage local --notes-template '{{guestname}}' --node pve --mailto .... --mailnotification always --compress gzip
vs
Code:
vzdump 100 101 104 105 --notes-template '{{guestname}}' --storage DataSSD --node pve --all 0 --mode snapshot --compress zstd --mailnotification always

Maybe gzip vs zstd compression makes the difference?
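
If compression is the difference, the default can also be set node-wide in /etc/vzdump.conf instead of per job. A minimal sketch (the zstd thread line is optional; 0 should mean half of the available cores, if I remember the docs correctly):

Code:
# /etc/vzdump.conf -- node-wide defaults for vzdump/backup jobs
compress: zstd
# optional: number of zstd threads (0 = half of the available cores)
zstd: 0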
Also, you should edit your post #6 and remove your mail address.

Greetz
 
Different backup storage used as well.
Yep! I was just playing around with different media, but I guess at some point I inadvertently changed some of the settings of the backup task itself. That's why there was such a huge difference in speed.
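
For anyone who runs into the same thing: the stored job definitions (including storage, mode and compression) can be inspected on the node. A sketch, assuming a recent PVE that keeps jobs in /etc/pve/jobs.cfg (older setups may still use the legacy cron file):

Bash:
# show the configured backup jobs and their settings
cat /etc/pve/jobs.cfg
# legacy location on older installations
cat /etc/pve/vzdump.cron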
 
