Backup from CEPH storage to NFS storage very slow

drz-support

Feb 14, 2014
Hi

After setting up CEPH storage running directly on the 3 Proxmox nodes and moving all VMs to the CEPH storage, we noticed very slow backup speeds to our NFS storage.

Backing up the same VM from local storage to the exact same NFS storage is much faster.
How can we improve the backup speed?


Backup from CEPH storage to NFS
Code:
INFO: starting new backup job: vzdump 103 --remove 0 --mode snapshot --compress lzo --storage pvebackup-archive --node drz-pve01-02
INFO: Starting Backup of VM 103 (qemu)
INFO: status = running
INFO: update VM 103: -lock backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/pvebackup-archive/dump/vzdump-qemu-103-2015_04_28-09_03_27.vma.lzo'
INFO: started backup task 'a4359115-6214-4154-9947-bbe4094d514f'
INFO: status: 1% (203620352/12884901888), sparse 1% (137728000), duration 3, 67/21 MB/s
INFO: status: 2% (336265216/12884901888), sparse 1% (149213184), duration 6, 44/40 MB/s
...
INFO: status: 99% (12794789888/12884901888), sparse 86% (11095097344), duration 213, 63/0 MB/s
INFO: status: 100% (12884901888/12884901888), sparse 86% (11185209344), duration 215, 45/0 MB/s
INFO: transferred 12884 MB in 215 seconds (59 MB/s)
INFO: archive file size: 755MB
INFO: Finished Backup of VM 103 (00:03:37)
INFO: Backup job finished successfully
TASK OK


Backup from local storage to NFS
Code:
INFO: starting new backup job: vzdump 103 --remove 0 --mode snapshot --compress lzo --storage pvebackup-archive --node drz-pve01-02
INFO: Starting Backup of VM 103 (qemu)
INFO: status = running
INFO: update VM 103: -lock backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/pvebackup-archive/dump/vzdump-qemu-103-2015_04_28-09_10_17.vma.lzo'
INFO: started backup task '66f1667d-436e-4dab-a70b-866e354e3d9e'
INFO: status: 3% (421920768/12884901888), sparse 1% (161206272), duration 3, 140/86 MB/s
INFO: status: 6% (832438272/12884901888), sparse 1% (170385408), duration 6, 136/133 MB/s
...
INFO: status: 92% (11920015360/12884901888), sparse 79% (10220331008), duration 24, 863/0 MB/s
INFO: status: 100% (12884901888/12884901888), sparse 86% (11185209344), duration 25, 964/0 MB/s
INFO: transferred 12884 MB in 25 seconds (515 MB/s)
INFO: archive file size: 755MB
INFO: Finished Backup of VM 103 (00:00:26)
INFO: Backup job finished successfully
TASK OK


CEPH write performance
Code:
root@drz-pve01-02:~# rados -p drz-pveceph01 bench -b 4194304 60 write -t 32 --no-cleanup
Maintaining 32 concurrent writes of 4194304 bytes for up to 60 seconds or 0 objects
Object prefix: benchmark_data_drz-pve01-02_995542
Total time run:         60.835211
Total writes made:      4196
Write size:             4194304
Bandwidth (MB/sec):     275.893


Stddev Bandwidth:       80.7099
Max bandwidth (MB/sec): 444
Min bandwidth (MB/sec): 0
Average Latency:        0.463263
Stddev Latency:         0.214722
Max latency:            1.69095
Min latency:            0.115577
root@drz-pve01-02:~#


CEPH read performance
Code:
root@drz-pve01-02:~# rados -p drz-pveceph01 bench -b 4194304 60 seq  -t 32 --no-cleanup
 Total time run:        8.652504
Total reads made:     4196
Read size:            4194304
Bandwidth (MB/sec):    1939.785


Average Latency:       0.0657323
Max latency:           0.128604
Min latency:           0.0228346
root@drz-pve01-02:~#


PVE Version
Code:
root@drz-pve01-02:~# pveversion -v
proxmox-ve-2.6.32: 3.4-150 (running kernel: 2.6.32-37-pve)
pve-manager: 3.4-3 (running version: 3.4-3/2fc72fee)
pve-kernel-2.6.32-37-pve: 2.6.32-150
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-2
pve-cluster: 3.0-16
qemu-server: 3.4-3
pve-firmware: 1.1-4
libpve-common-perl: 3.0-24
libpve-access-control: 3.0-16
libpve-storage-perl: 3.0-32
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.2-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1


Many thanks in advance for your help

Best
Andreas
 

Hi Andreas,
not a helpful answer to the backup issue, but I guess your read measurement shows cached data?!
What does your read performance look like if you do an
Code:
echo 3 > /proc/sys/vm/drop_caches
on all OSD nodes and the Proxmox host before the "rados -p drz-pveceph01 bench -b 4194304 60 seq -t 32 --no-cleanup"?

Udo
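A minimal sketch of how that cache drop could be scripted across all nodes before re-running the benchmark; the host names below are assumptions based on the node name visible in the logs and need to be replaced with the real OSD/Proxmox node names:
Code:
#!/bin/bash
# Drop page cache, dentries and inodes on every node so that the next
# sequential read benchmark hits the OSDs instead of RAM.
# NOTE: the host list is a placeholder (assumption); adjust to your cluster.
NODES="drz-pve01-01 drz-pve01-02 drz-pve01-03"

for node in $NODES; do
    echo "Dropping caches on $node"
    ssh root@"$node" 'sync; echo 3 > /proc/sys/vm/drop_caches'
done

# Then re-run the benchmark from the thread:
# rados -p drz-pveceph01 bench -b 4194304 60 seq -t 32 --no-cleanup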
 
Hi Udo

After
echo 3 > /proc/sys/vm/drop_caches

the result is

Code:
root@drz-pve01-02:~# echo 3 > /proc/sys/vm/drop_caches
root@drz-pve01-02:~# rados -p drz-pveceph01 bench -b 4194304 60 seq -t 32 --no-cleanup
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
     0       0         0         0         0         0         -         0
     1      31       155       124   495.839       496  0.108193  0.203497
     2      31       297       266   531.876       568  0.319064  0.216676
     3      31       442       411    547.89       580  0.167212  0.222498
     4      32       597       565   564.898       616  0.211615  0.215222
     5      32       744       712   569.503       588  0.085213  0.217213
     6      31       876       845   563.241       532  0.104911  0.218804
     7      31      1040      1009   576.479       656  0.116166  0.216135
     8      32      1194      1162   580.911       612  0.112321  0.215055
     9      31      1356      1325     588.8       652  0.109085  0.213249
    10      31      1509      1478   591.113       612  0.415087   0.21279
    11      32      1674      1642   597.003       656   0.14649  0.212036
    12      31      1838      1807   602.245       660  0.215636  0.210032
    13      31      2002      1971   606.374       656  0.105266   0.20798
    14      31      2160      2129   608.198       632  0.161033  0.208029
    15      31      2311      2280   607.913       604  0.190595  0.208173
    16      31      2481      2450   612.413       680  0.152376  0.207212
    17      32      2645      2613   614.738       652  0.282513  0.206305
    18      32      2817      2785   618.804       688  0.283582  0.205482
    19      31      2956      2925   615.705       560  0.121248  0.204969
2015-04-28 13:54:16.518962min lat: 0.0473393 max lat: 2.06221 avg lat: 0.206426
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    20      32      3101      3069   613.716       576  0.304109  0.206426
    21      31      3251      3220   613.249       604  0.541936  0.206203
    22      32      3430      3398   617.733       712 0.0893052  0.205572
    23      32      3604      3572   621.133       696  0.208861  0.204919
    24      31      3779      3748   624.582       704 0.0725407  0.203498
    25      32      3953      3921   627.275       692  0.383536  0.203062
    26      32      4121      4089   628.991       672  0.210484  0.202236
 Total time run:        26.816973
Total reads made:     4196
Read size:            4194304
Bandwidth (MB/sec):    625.872


Average Latency:       0.204166
Max latency:           2.06221
Min latency:           0.0473393
root@drz-pve01-02:~#

Best
Andreas
 
This is a known issue; currently I have no quick solution for this slowness.
 
Hi Tom

Thanks for your reply. Is there a roadmap for a fix to solve this problem?

Best
Andreas
 
Another question along with Andreas' about the roadmap - is there another recommended method for backing up VMs stored on Ceph until NFS is squared away?
 

The problem is not NFS, it's the live backup from Ceph VM images. But the backup works reliably, it's just slow.
 

So a live backup from a ceph vm, even to another storage mechanism, would still experience slow operation?
 
How do you do that with a running VM - and automate the process? I thought that to clone a VM you had to do it offline...
 
As odd as it may seem, clone the VM to a different VMID and see if that makes a difference. In my testing, I've doubled my backup speeds in some cases.
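For anyone who wants to reproduce that comparison, a minimal sketch; the new VMID 903 and the clone name are made up, the vzdump options are copied from the logs above, and depending on the PVE version a full clone of a running VM may first require shutting it down:
Code:
# Full clone of VM 103 to a hypothetical new VMID 903 (adjust IDs as needed).
qm clone 103 903 --full --name vm103-clone-test

# Back up the clone with the same options as in the original log and compare
# the reported MB/s against the backup of the original VM.
vzdump 903 --remove 0 --mode snapshot --compress lzo --storage pvebackup-archive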

Hi,
I guess that happens due to caching on the OSD nodes.

If you clone a VM, all blocks are written again, so the OSD host has the content (partly, depending on the amount of RAM) cached in RAM.

Udo
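A quick way to sanity-check that theory, assuming shell access to the OSD nodes, is to watch the page cache on the OSD hosts and see whether the clone's speed advantage survives a cache drop:
Code:
# On each OSD node: the "cached" value shows how much of the freshly written
# clone data is still sitting in RAM.
free -m

# If the clone only backs up fast because of this cache, the speed should drop
# back down after 'echo 3 > /proc/sys/vm/drop_caches' on the OSD nodes
# (see the earlier post) followed by a fresh vzdump run.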
 

I've had backup speeds change just by putting the same operating system (both clean installs) on different VMIDs (102 / 201, thanks to typos) while trying to replicate other issues.
 

Hi Tom. I've faced this issue recently. Do you have any updates on the fix? Can you advise whether the same issue exists with a GlusterFS backend?
 
Is there an update on this?

We experience the same issue with VM storage on a Ceph cluster and the vzdump target on GlusterFS.
The network is definitely not the bottleneck; I can scp at over 50 MB/s from and to a VM on the host that is backing up another VM at ~20 MB/s.

(If it matters: we use a 2-node Ceph cluster, and iftop on the pve4 host shows a roughly equal distribution across both Ceph nodes, both at around 10 MB/s RX, which adds up exactly to the ~20 MB/s we see as backup bandwidth.)
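If someone wants to rule out the network more directly than with scp (which adds encryption overhead), a plain iperf test between the PVE host and each Ceph node is a quick check; the hostname below is a placeholder:
Code:
# On a Ceph node (placeholder name), start an iperf server:
#   iperf -s
# On the PVE host that runs the backup, measure raw TCP throughput to it:
iperf -c ceph-node-01 -t 30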
 
As best I can tell, the problem is in the kvm-qemu live backup code.
The live backup reads very small amounts of data at a time, something like 64k I think.
Each object in CEPH, on the other hand, is 4M if I remember correctly.

So what happens is that the backup code reads the same CEPH object 64 times, with each request paying CEPH's latency.
The result is really poor performance because of the accumulated latency of those 64 requests.

If the live backup code read 4M at a time, latency would be reduced significantly and throughput would increase.

I recall reading on some mailing lists that changing how much data the live backup reads at once is non-trivial.

I'm using CEPH for some VMs that are huge and will only get larger.
Backing them up using Proxmox takes days when it should only take hours.
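A rough back-of-the-envelope calculation of that effect; the 64 KiB request size and the per-request latencies are assumptions, but the result lines up reasonably well with the 215 s Ceph backup versus the 25 s local-storage backup shown earlier:
Code:
#!/bin/bash
# Latency cost of reading a 12 GiB image in small vs. large requests,
# assuming the requests are issued strictly one after another.
SIZE=12884901888   # image size taken from the backup log (bytes)

awk -v size="$SIZE" 'BEGIN {
    small = size / 65536;     # number of 64 KiB requests
    large = size / 4194304;   # number of 4 MiB requests (one per Ceph object)
    printf "64 KiB requests: %d -> ~%.0f s at an assumed 1 ms each\n", small, small * 0.001;
    printf "4 MiB requests:  %d -> ~%.0f s at an assumed 5 ms each\n", large, large * 0.005;
}'
# Prints roughly 197 s vs 15 s - the small-request case is dominated almost
# entirely by per-request latency rather than by bandwidth.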
 
