Super slow backup with PVE 2.x

On a new FreeNAS NFS server that we set up more carefully this time, we are running into bad transfer rates using rsync from the CLI.

rsync goes from a node directly to the FreeNAS box (not to the NFS mount).

Transfer rates start at almost 80 MB/s and grind down to 3 MB/s.

I'll test the same rsync from one Proxmox node to another to check whether the transfer-rate issue shows up there as well.
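
Something along these lines, with the second node's hostname as a placeholder and the standard local dump directory:
Code:
rsync -aP --stats /var/lib/vz/dump/ root@pve-node2:/var/lib/vz/dump-test/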
 
I'm using FreeNAS for backups, among other things. I experienced several network performance issues (NFS, Samba). After tuning several options the performance is okay. Check their forum for tuning tips. FreeNAS is brilliant, but not perfect for use with Proxmox.
 

Could you post the settings that made a difference?


I think a wiki page on making FreeNAS work with Proxmox / Debian Squeeze would be good, so that others have a starting point.
 

Yes, that would be great.
But there is still the question of why there are problems with Buffalo and QNAP NAS devices, and why the problem only shows up on PVE 2.x and not on 1.9 (and earlier).
To be honest, I'm a little disappointed that the PVE team shifts the blame so easily.
 
As mentioned in my earlier post, I just use FreeNAS as my backup server, not as a Proxmox storage pod. Tuning and optimizing the connectivity between FreeNAS and other operating systems like Windows (CIFS) or Proxmox (NFS) is a time-consuming activity that has to be done individually (trial & error). The parameters you can tune depend on your hardware spec (CPU, RAM, network card) and your ZFS version. In the FreeNAS forum you will find a lot of information on this topic. Nevertheless, here are my details:

My FreeNAS box is the same as documented here.
I've mounted my Proxmox box to my FreeNAS tank over NFS, using a Gigabit connection with lagg (link aggregation).
Code:
/mnt/bigtank/dmd2  /mnt/bigtank  nfs  nolock,wsize=32768,rsize=32768,soft,noatime,noauto  0  0
Write: 600 Mbit/s
Read: 750 Mbit/s
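
(For testing option changes without editing fstab, the equivalent one-off mount looks roughly like this; note that the first fstab field would normally carry the server, and "freenas" here is a placeholder for the actual hostname:)
Code:
mount -t nfs -o nolock,wsize=32768,rsize=32768,soft,noatime freenas:/mnt/bigtank/dmd2 /mnt/bigtank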

The speed over CIFS is far better.

Sysctl parameters on the FreeNAS tank:
Code:
net.inet.tcp.sendbuf_max 16777216
net.inet.tcp.recvbuf_max 16777216
net.inet.tcp.sendspace 65536
net.inet.tcp.recvspace 131072
vfs.ufs.dirhash_maxmem 16777216
vfs.zfs.txg.write_limit_override 1073741824
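
For reference, these can be applied on a live system from the FreeNAS console like this (to make them persistent they belong in the Sysctls section of the FreeNAS GUI, if I remember right):
Code:
sysctl net.inet.tcp.sendbuf_max=16777216
sysctl net.inet.tcp.recvbuf_max=16777216
sysctl net.inet.tcp.sendspace=65536
sysctl net.inet.tcp.recvspace=131072
sysctl vfs.ufs.dirhash_maxmem=16777216
sysctl vfs.zfs.txg.write_limit_override=1073741824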

There are maybe other parameters I changed, but I can't remember them at the moment.

Further information in the FreeNAS forum:
ZFS Samba (CIFS) = abysmal performance in FREENAS 8.2
ZFS memory tuning
Network slowdown


In general I would say that my low-cost FreeNAS tank is not a reference setup for you. I hope this gives you some first input for your further investigation.

cheers
tom
 
Hi all,

sorry to jump into this thread, but I have had this backup problem too for the last 3-4 days.
I only use NFS for backups of KVM machines, but they do have large disks.
I use a Windows Home Server with MS SFU 3.5 as an NFS server, and FreeNAS 8.0.4p3.

My VMs stop responding and cannot be accessed over the network.

I have attached a screenshot from this morning. The backup started at midnight and I had to kill it.

Here is what I found in my syslog:
Code:
Jun 29 08:03:13 proxmox rrdcached[1631]: flushing old values
Jun 29 08:03:13 proxmox rrdcached[1631]: rotating journals
Jun 29 08:03:13 proxmox rrdcached[1631]: started new journal /var/lib/rrdcached/journal//rrd.journal.1340949793.084381
Jun 29 08:03:13 proxmox rrdcached[1631]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1340942593.084292
Jun 29 08:08:32 proxmox pvestatd[1879]: status update time (463.729 seconds)
Jun 29 08:08:32 proxmox pmxcfs[1658]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-storage/proxmox/Freenas: -1
Jun 29 08:08:32 proxmox pmxcfs[1658]: [status] notice: RRD update error /var/lib/rrdcached/db/pve2-storage/proxmox/Freenas: /var/lib/rrdcached/db/pve2-storage/proxmox/Freenas: illegal attempt to update using time 1340950112 when last update time is 1340950112 (minimum one second step)
Jun 29 08:08:32 proxmox pmxcfs[1658]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-storage/proxmox/local: -1
Jun 29 08:08:32 proxmox pmxcfs[1658]: [status] notice: RRD update error /var/lib/rrdcached/db/pve2-storage/proxmox/local: /var/lib/rrdcached/db/pve2-storage/proxmox/local: illegal attempt to update using time 1340950112 when last update time is 1340950112 (minimum one second step)
Jun 29 08:13:24 proxmox pvestatd[1879]: status update time (132.056 seconds)
Jun 29 08:17:01 proxmox /USR/SBIN/CRON[645292]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jun 29 08:18:26 proxmox pvestatd[1879]: WARNING: command 'df -P -B 1 /mnt/pve/proxmox2nfs' failed: got timeout
Jun 29 08:18:28 proxmox pvestatd[1879]: WARNING: command 'df -P -B 1 /mnt/pve/WHS-NFS' failed: got timeout
Jun 29 08:19:16 proxmox pvestatd[1879]: status update time (21.987 seconds)
Jun 29 08:31:45 proxmox kernel: __ratelimit: 188 callbacks suppressed
Jun 29 08:31:45 proxmox kernel: kvm: 20257: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffffffffff1b1e41
Jun 29 08:31:45 proxmox kernel: kvm: 20257: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x5100c0
Jun 29 08:31:45 proxmox kernel: kvm: 20257: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0x0
Jun 29 08:31:45 proxmox kernel: kvm: 20257: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0x0
Jun 29 08:31:45 proxmox kernel: kvm: 20257: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x5100c0
Jun 29 08:31:45 proxmox kernel: kvm: 20257: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0x0
Jun 29 08:31:45 proxmox kernel: kvm: 20257: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0x0
Jun 29 08:31:45 proxmox kernel: kvm: 20257: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x5100c0
Jun 29 08:31:45 proxmox kernel: kvm: 20257: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0x0
Jun 29 08:31:45 proxmox kernel: kvm: 20257: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffffffffff1b1e41
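
For what it's worth, the status check that times out above can be run by hand to see whether the NFS mounts themselves stall (paths copied from the log):
Code:
time df -P -B 1 /mnt/pve/proxmox2nfs
time df -P -B 1 /mnt/pve/WHS-NFS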

After reading all the posts, I still don't understand what needs to be done to solve this.
What do I need to change/check?

Thank you in advance
Kind regards
B.
 

Attachment: PVE_Backup_1.jpg

We have two NFS servers used for PVE backups.

Our storage network is on separate network hardware (NICs, cables, etc.).

The first is running PVE and has a 3ware RAID 10. This backup took about 2.5 hours. Here is a grep of the speeds:
Code:
VZ:
INFO: Total bytes written: 143996835840 (135GiB, 32MiB/s)
INFO: Total bytes written: 698419200 (667MiB, 17MiB/s)
INFO: Total bytes written: 459048960 (438MiB, 8.2MiB/s)
INFO: Total bytes written: 4669184000 (4.4GiB, 39MiB/s)
INFO: Total bytes written: 1031147520 (984MiB, 22MiB/s)
INFO: Total bytes written: 595343360 (568MiB, 21MiB/s)
INFO: Total bytes written: 858828800 (820MiB, 12MiB/s)
INFO: Total bytes written: 2037012480 (1.9GiB, 24MiB/s)
INFO: Total bytes written: 680908800 (650MiB, 11MiB/s)
INFO: Total bytes written: 663674880 (633MiB, 13MiB/s)

KVM:
INFO: Total bytes written: 2048 (0.00 MiB/s)
INFO: Total bytes written: 18257808384 (71.95 MiB/s)
INFO: Total bytes written: 64428706304 (39.19 MiB/s)
INFO: Total bytes written: 16127101440 (10.17 MiB/s)

This one goes to FreeNAS, which has RAIDZ2, with NFS attached to our storage network. This backup started at 10 PM last night and, almost 11 hours later, is still in progress. Here is a grep of the speeds so far:
Code:
VZ:
INFO: Total bytes written: 144679219200 (135GiB, 44MiB/s)
INFO: Total bytes written: 698419200 (667MiB, 21MiB/s)
INFO: Total bytes written: 459048960 (438MiB, 12MiB/s)
INFO: Total bytes written: 15582935040 (15GiB, 694KiB/s)
INFO: Total bytes written: 4670443520 (4.4GiB, 46MiB/s)
INFO: Total bytes written: 1032867840 (986MiB, 21MiB/s)
INFO: Total bytes written: 595343360 (568MiB, 21MiB/s)
INFO: Total bytes written: 881172480 (841MiB, 20MiB/s)
INFO: Total bytes written: 2136043520 (2.0GiB, 28MiB/s)
INFO: Total bytes written: 684400640 (653MiB, 12MiB/s)
INFO: Total bytes written: 671129600 (641MiB, 14MiB/s)

KVM
INFO: Total bytes written: 2048 (0.00 MiB/s)
INFO: Total bytes written: 18257808384 (4.19 MiB/s)
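
(The speed lines in both lists were pulled with a grep along these lines; vzdump writes a .log next to each archive, so the path below is just where our dumps land and will differ for you:)
Code:
grep -h "Total bytes written" /mnt/pve/backup/dump/vzdump-*.log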

Note that even the first backup slows down: the KVM backups start at 71.95 MiB/s and end at 10.17 MiB/s.

So I think there is an issue even without FreeNAS, and of course the issue is a lot worse when using FreeNAS for backup.

We will continue to work on solving this.
 
I am doing this test:

From the CLI on FreeNAS:
Code:
root@freenas] ~# rsync  -aP --stats 10.100.8.100:/bkup/dump/ /mnt/fantini/pve/dump/

Transfer speeds started at 80 MB/s.

After a minute or so the speed went down to 4 MB/s.

So this is not a Proxmox issue; it is something I've seen in other ZFS systems working with Linux.

The issue exists with Squeeze; according to others, Lenny did not have it.

And I think the issue gets worse with big files. See the transfer rates we got on VZ compared to KVM in a prior post.
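
To rule out the raw network, a quick cross-check with iperf (assuming it is installed on both ends; the IP is the PVE node from the rsync example above):
Code:
# on the pve node (10.100.8.100):
iperf -s
# on the freenas box:
iperf -c 10.100.8.100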
 
In the last few days I ran a few more tests. The first scenario was to install a few NAS distributions on the same server and test each one.
Server as mentioned in the first post: Dell PowerEdge 2950, Xeon 5130, 4 GB RAM, 4x 146 GB 15k SAS.

In every test run I tried to back up 5 VMs: 101 (17.5 GiB), 106 (61 GiB), 201 (13.3 GiB), 202 (81 GiB), 304 (16 GiB).
101 and 106 from PVE01, 201 and 202 from PVE02, 304 from PVE03.

Code:
nas4free: 9.0.0.1.147
101 - (27.39 MiB/s) - 7 errors;
201 - (21.19 MiB/s) - 1 error;
304 - (24.12 MiB/s) - 5 errors;
106 - 2x kernel panic with kmem error
202 - super slow, aborted

illumian+napp-it:
101 - (41.12 MiB/s) - 7 errors;
201 - (25.79 MiB/s) - 0 errors;
304 - (23.54 MiB/s) - 5 errors;
106 - (41.78 MiB/s) - 2 errors;
202 - (29.82 MiB/s) - 0 errors;

freenas 8.0.4p3
101 - (25.52 MiB/s) - 14 errors;
201 - (17.12 MiB/s) - 20 errors;
304 - (23.31 MiB/s) - 5 errors;
106 - (26.81 MiB/s) - 54 errors;
202 - (19.71 MiB/s) - 100 errors;

Besides the kernel panic on nas4free, you can see that the number of errors rises massively on FreeNAS!

Next I tried two NAS systems I have:
Code:
Buffalo Terastation III
101 - (0.91 MiB/s) - several hundred/thousand errors
201 - (0.87 MiB/s) - several hundred/thousand errors
304 - (8.82 MiB/s) - 11 errors
106 - after 101 not tested
202 - after 201 not tested

Synology DS110j

101 - (25.06 MiB/s) - 0 errors;
201 - (22.81 MiB/s) - 0 errors;
304 - (18.86 MiB/s) - 0 errors;
106 - (24.31 MiB/s) - 0 errors;
202 - (14.68 MiB/s) - 2 errors;

And that's something really interesting: with the Synology the errors are as good as gone, while the Buffalo takes them to new dimensions.
BUT all these errors appear only with 2.x and not with 1.9 or earlier (I had them all in use with < 1.9).

@tom, dietmar and martin: unfortunately I'm not from Vienna, but I would send you the Buffalo NAS for testing if you want (that would be really great).
Another user here said he has these errors with QNAP, so I think there has to be some bug/problem in the NFS implementation in 2.x.
Sure, there are some differences between the devices, but one thing is always the same: with 1.9 it worked smoothly.
 
Hi rengiared,

you put a lot of effort and time into this test!! Thanks!
Can you tell me more about your test scenario? Do you use the regular compression (gzip, lzo) or do you use pigz?
I would like to run similar tests with a Windows Storage Server 2008 I just set up and a Windows Home Server v1 (WS2003) + SFU 3.5.

Kind regards
B.
 
I used no compression (because I try to avoid an additional source of errors when restoring) and snapshot mode, as these two are also my production settings.
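
(For reference, the matching vzdump call on the CLI would be roughly this, with "PVE" standing in for the NFS storage name from my logs:)
Code:
vzdump 101 --mode snapshot --storage PVE
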
Windows Storage Server could be interesting too, you're right.
Please keep me updated.

I hope the Proxmox guys see my offer :)
 
What did you test exactly, and what do you count as an "error"?
 
The test was to back up (snapshot mode with no compression) the same 5 production VMs from the same 3 production Proxmox servers (described in the first post at the top) onto 5 different NFS devices/servers.
For me, an error is a message like this in the syslog during a backup:
Code:
Jun 14 19:16:04 pve02 pvestatd[2231]: WARNING: command 'df -P -B 1 /mnt/pve/PVE' failed: got timeout
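
Counting them per run is then a simple grep, e.g. (assuming the default Debian syslog location):
Code:
grep -c "pvestatd.*got timeout" /var/log/syslog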
 

Isn't the pvestatd daemon just for statistics (e.g. disk usage)?
 
We solved our slow transfer issues by adding a dedicated ZFS intent log (ZIL) device.

rsync transfer speeds to FreeNAS are now consistent and match those to our Debian Squeeze systems.

For info on the ZIL, see http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Log_Devices. We use a RAID-1 mirror of two 20 GB SSD drives.
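
For reference, attaching such a mirrored log device to an existing pool looks roughly like this (pool and device names are placeholders for our setup):
Code:
zpool add tank log mirror da1 da2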

Others have had speed issues using FreeNAS; search 'freenas slow speed' on Google. Most of the solutions are already written up in the ZFS Best Practices Guide.

Others have found FreeNAS 8.0.4 to be half as fast as v7, but I like 8.0.4: it makes managing ZFS easy.
 
Probably my last status update.
Unfortunately Illumian won't work on my NAS server; there are some driver issues with the NICs.
So I installed FreeNAS 8.2.0 beta4 completely fresh, created a new RAIDZ1, and the poor performance remained.
I even tried it with a RAID-1 log device; nothing changed.

So I finally decided to stay with the 4-disk RAIDZ1 and moved the fifth disk into the Synology DS110j.
The Synology works great so far; only the transfer rates are not as good as I was used to with PVE 1.9 and FreeNAS (50 MB/s+ then, about 20-25 MB/s now).

@proxmox: I know I must be annoying to you, but why is there not even a comment on my offer with the Buffalo NAS, or anything else?
There must be some change in NFS handling between 1.9 and 2.x?
Would you be more interested if I bought a subscription? Because if I buy one for 1000 €, I would appreciate a little more interest in this issue/difference.
 
Proxmox VE is open source, so all source code is available; you can take a deep look at everything yourself.

If you cannot figure it out, you can ask the community via the forum; if you still find no solution, you can ask our commercial support team.

All support options are here:
http://pve.proxmox.com/wiki/Get_support
 

One trick I found for ZFS: to improve the FSYNC rate as reported by pveperf, I did a:
Code:
zfs set zfs:zfs_nocacheflush = 1
where /zfs was my ZFS mount point. Obviously you would do that on the FreeNAS server; in my case the ZFS pool was local on the Proxmox machine.
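
(If I remember correctly, the exact form of that tunable depends on the platform: on Solaris it goes into /etc/system as "set zfs:zfs_nocacheflush = 1", and on ZFS-on-Linux it is the zfs_nocacheflush module parameter. It disables disk cache flushes, so only use it with a battery-backed write cache. To see the effect, point pveperf at the mount point in question, e.g. mine:)
Code:
pveperf /zfs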
 
