Hello,
I'm running Proxmox Backup Server 3 with three Proxmox VE 7.4 nodes.
Backup and restore performance doesn't seem great for the hardware I'm using.
The main issue is with restores: I'm only getting about 350MB/sec.
Locally on the PBS box, fio against the ZFS pool (SSD drives) shows a random read speed of about 4500MB/sec.
And locally on the Proxmox nodes, rados bench against the NVMe-backed Ceph pool shows about 1250MB/sec.
The network is 100Gbit.
So why am I only getting 350MB/sec when restoring? I understand there will be overhead, but something doesn't add up.
Any pointers to increase the performance are welcome.
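If it helps narrow things down, I can also run the built-in client benchmark from one of the nodes; as I understand it, it reports the per-connection TLS, hashing and decompression rates that can cap a single restore stream (the repository string below is just a placeholder for mine):

Code:
root@pve:~# proxmox-backup-client benchmark --repository root@pam@pbs.example:stornado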
Below you will find the hardware details, local benchmarks, and the Ceph and ZFS configuration.
Code:
PBS box hardware:
Proxmox Backup Server 3.0-2
Kernel Version Linux 6.2.16-15-pve #1 SMP PREEMPT_DYNAMIC PMX 6.2.16-15 (2023-09-28T13:53Z)
Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz
128GB DDR4 RAM
2 x LSI HBA 9400-16i
8 x Micron PRO 5400 SSD 7.6TB
100Gb network - NVIDIA Mellanox MT27800
10Gb network - Intel® Ethernet Controller X540-AT2
Code:
3 x Proxmox nodes hardware:
Lenovo sr645
2 x ThinkSystem AMD EPYC 7502 32C 180W 2.5GHz Processor
1TB RAM TruDDR4 3200MHz
5 x ThinkSystem U.2 PM983 3.84TB Entry NVMe PCIe 3.0 x4 Hot Swap SSD
2 x ThinkSystem 7mm 5300 240GB Entry SATA 6Gb SSD
1 x ThinkSystem Broadcom 57454 10GBASE-T 4-port OCP Ethernet Adapter
1 x 100Gb Mellanox Ethernet Adapter
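Both the PBS box and the nodes have 100Gb Mellanox NICs, so to rule out the network path itself I would first run a quick iperf3 test between them (hostname is a placeholder):

Code:
# on the PBS box
iperf3 -s
# on one of the nodes
iperf3 -c pbs.example -P 8 -t 30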
Code:
Proxmox node
root@pve:~# rados bench -p vm_storage 10 write -b 4M -t 16 --run-name 'pvexxx' --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_pvexxx_332153
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 312 296 1183.93 1184 0.013909 0.0481707
2 16 663 647 1293.87 1404 0.0465274 0.048925
3 16 953 937 1249.2 1160 0.0144783 0.0505117
4 16 1280 1264 1263.86 1308 0.131314 0.0496239
5 16 1630 1614 1291.05 1400 0.0178662 0.0493974
6 16 1938 1922 1281.18 1232 0.0152664 0.0495438
7 16 2220 2204 1259.29 1128 0.0341177 0.0504078
8 16 2555 2539 1269.35 1340 0.0113804 0.0500438
9 16 2830 2814 1250.52 1100 0.0165271 0.0509074
10 15 3131 3116 1246.25 1208 0.0200455 0.0512501
Total time run: 10.0212
Total writes made: 3131
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 1249.75
Stddev Bandwidth: 110.292
Max bandwidth (MB/sec): 1404
Min bandwidth (MB/sec): 1100
Average IOPS: 312
Stddev IOPS: 27.5729
Max IOPS: 351
Min IOPS: 275
Average Latency(s): 0.0511535
Stddev Latency(s): 0.0434358
Max latency(s): 0.207021
Min latency(s): 0.00991231
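Since a restore writes to the pool through RBD rather than raw RADOS, I could also benchmark the same pool via rbd bench with a throwaway test image (the image name is made up):

Code:
root@pve:~# rbd create vm_storage/restore-bench --size 20G
root@pve:~# rbd bench --io-type write --io-size 4M --io-threads 16 --io-total 10G vm_storage/restore-bench
root@pve:~# rbd rm vm_storage/restore-bench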
Code:
CEPH config:
5 OSDs per node (15 OSDs total)
root@pve:~# ceph config show osd.1
NAME VALUE SOURCE OVERRIDES IGNORES
auth_client_required cephx file
auth_cluster_required cephx file
auth_service_required cephx file
cluster_network 172.xx.xx.xx/22 file
daemonize false override
keyring $osd_data/keyring default
leveldb_log default
mon_allow_pool_delete true file
mon_host 172.xx.xx.11 172.xx.xx.12 172.xx.xx.13 file
ms_bind_ipv4 true file
ms_bind_ipv6 false file
no_config_file false override
osd_delete_sleep 0.000000 override
osd_delete_sleep_hdd 0.000000 override
osd_delete_sleep_hybrid 0.000000 override
osd_delete_sleep_ssd 0.000000 override
osd_max_backfills 10 default
osd_mclock_max_capacity_iops_hdd 0.000000 override
osd_mclock_max_capacity_iops_ssd 18970.410801 mon
osd_mclock_scheduler_background_best_effort_lim 999999 default
osd_mclock_scheduler_background_best_effort_res 593 default
osd_mclock_scheduler_background_best_effort_wgt 2 default
osd_mclock_scheduler_background_recovery_lim 2371 default
osd_mclock_scheduler_background_recovery_res 593 default
osd_mclock_scheduler_background_recovery_wgt 1 default
osd_mclock_scheduler_client_lim 999999 default
osd_mclock_scheduler_client_res 1186 default
osd_mclock_scheduler_client_wgt 2 default
osd_pool_default_min_size 2 file
osd_pool_default_size 3 file
osd_recovery_max_active 0 default
osd_recovery_max_active_hdd 10 default
osd_recovery_max_active_ssd 20 default
osd_recovery_sleep 0.000000 override
osd_recovery_sleep_hdd 0.000000 override
osd_recovery_sleep_hybrid 0.000000 override
osd_recovery_sleep_ssd 0.000000 override
osd_scrub_sleep 0.000000 override
osd_snap_trim_sleep 0.000000 override
osd_snap_trim_sleep_hdd 0.000000 override
osd_snap_trim_sleep_hybrid 0.000000 override
osd_snap_trim_sleep_ssd 0.000000 override
public_network 172.xx.xx.xx/22 file
rbd_default_features 61 default
rbd_qos_exclude_ops 0 default
setgroup ceph cmdline
setuser ceph cmdline
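During an actual restore I can also watch what the pool sees on the Ceph side, to tell whether the 350MB/sec is limited by Ceph writes or by the PBS read/decode path:

Code:
root@pve:~# watch -n 1 'ceph osd pool stats vm_storage'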
PBS box FIO tests
Code:
FIO tests:
root@pbs:/storage# fio --name=rand-read --ioengine=posixaio --rw=randread --bs=4M --size=4g --numjobs=1 --iodepth=32 --runtime=60 --time_based --end_fsync=1
READ: bw=4617MiB/s (4841MB/s), 4617MiB/s-4617MiB/s (4841MB/s-4841MB/s), io=271GiB (291GB), run=60024-60024msec
root@pbs:/storage# fio --name=rand-write --ioengine=posixaio --rw=randwrite --bs=4M --size=4g --numjobs=1 --iodepth=32 --runtime=60 --time_based --end_fsync=1
WRITE: bw=253MiB/s (266MB/s), 253MiB/s-253MiB/s (266MB/s-266MB/s), io=19.9GiB (21.3GB), run=80372-80372msec
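I realize the 4g working set above probably fits in ARC on a 128GB box, so the ~4600MiB/s random read figure may be partly cache; a run with a larger working set would be closer to what a restore actually reads from the chunk store (the sizes below are just a guess on my part):

Code:
root@pbs:/storage# fio --name=chunk-read --ioengine=posixaio --rw=randread --bs=4M --size=64g --numjobs=4 --iodepth=16 --runtime=60 --time_based --group_reporting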
--
Code:
$ zpool get all stornado
stornado size 55.9T -
stornado capacity 0% -
stornado altroot - default
stornado health ONLINE -
stornado guid 9671548887958957898 -
stornado version - default
stornado bootfs - default
stornado delegation on default
stornado autoreplace off default
stornado cachefile - default
stornado failmode wait default
stornado listsnapshots off default
stornado autoexpand on local
stornado dedupratio 1.00x -
stornado free 55.5T -
stornado allocated 363G -
stornado readonly off -
stornado ashift 12 local
stornado comment - default
stornado expandsize - -
stornado freeing 0 -
stornado fragmentation 0% -
stornado leaked 0 -
stornado multihost off default
stornado checkpoint - -
stornado load_guid 14358635609680744971 -
stornado autotrim off default
stornado compatibility off default
stornado feature@async_destroy enabled local
stornado feature@empty_bpobj active local
stornado feature@lz4_compress active local
stornado feature@multi_vdev_crash_dump enabled local
stornado feature@spacemap_histogram active local
stornado feature@enabled_txg active local
stornado feature@hole_birth active local
stornado feature@extensible_dataset active local
stornado feature@embedded_data active local
stornado feature@bookmarks enabled local
stornado feature@filesystem_limits enabled local
stornado feature@large_blocks enabled local
stornado feature@large_dnode enabled local
stornado feature@sha512 enabled local
stornado feature@skein enabled local
stornado feature@edonr enabled local
stornado feature@userobj_accounting active local
stornado feature@encryption enabled local
stornado feature@project_quota active local
stornado feature@device_removal enabled local
stornado feature@obsolete_counts enabled local
stornado feature@zpool_checkpoint enabled local
stornado feature@spacemap_v2 active local
stornado feature@allocation_classes enabled local
stornado feature@resilver_defer enabled local
stornado feature@bookmark_v2 enabled local
stornado feature@redaction_bookmarks enabled local
stornado feature@redacted_datasets enabled local
stornado feature@bookmark_written enabled local
stornado feature@log_spacemap active local
stornado feature@livelist enabled local
stornado feature@device_rebuild enabled local
stornado feature@zstd_compress enabled local
stornado feature@draid enabled local
--
Code:
root@pbs:/stornado# zpool status
pool: rpool
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-HDSTOR_-_HSAV25ST250AX_HS230811158DB1F12-part3 ONLINE 0 0 0
ata-HDSTOR_-_HSAV25ST250AX_HS230811158DB1F10-part3 ONLINE 0 0 0
errors: No known data errors
pool: stornado
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
stornado ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
wwn-0x500a0751405fd5d6 ONLINE 0 0 0
wwn-0x500a075141cdda7e ONLINE 0 0 0
wwn-0x500a075141cdf608 ONLINE 0 0 0
wwn-0x500a075141cddbd9 ONLINE 0 0 0
wwn-0x500a075141cdf6e8 ONLINE 0 0 0
wwn-0x500a075141cddd3f ONLINE 0 0 0
wwn-0x500a075141cddc8d ONLINE 0 0 0
wwn-0x500a075141cdd9b4 ONLINE 0 0 0
errors: No known data errors
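And in case the dataset layout matters for restore reads, these are the datastore dataset properties I would check and can post next (assuming the datastore sits directly on the pool root):

Code:
root@pbs:~# zfs get recordsize,compression,atime,xattr,primarycache stornado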