Hello,
I have the same result. I ran a short ping test prior to updating the drivers and averaged 0.110 ms with an mdev of 0.025 ms; the min/max values were 0.049 and 0.149 ms. Do you get that much noise in a simple ping? That is a threefold spread between min and max, although I will grant that it is overall much lower than my 1 Gb pings. After updating the drivers, a similar test shows an average ping time of 0.102 ms, an mdev of 0.031 ms, and min/max of 0.075/0.170 ms. It seems quicker but no less noisy. I am not sure how statistically valid these numbers are, since I only let the ping run for about 20 counts, but to the naked eye it did seem quicker.
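Next time I will let the test run much longer so the average and mdev actually mean something. Roughly along these lines, where <other-node> is just a placeholder for the second host's 10 Gb address:

    # ~500 quiet samples over the 10 Gb link; ping prints only the summary stats at the end.
    # <other-node> is a placeholder for the other node's 10 GbE address.
    ping -q -c 500 -i 0.2 <other-node>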
Once I had two nodes with updated drivers, I ran another test backup to an NFS share backed by a ZFS dataset. The destination zpool is only a four-disk pool of two mirrors, but it does have an SSD ZIL (of middling quality). The job generally ran at around 200 MB/s, but at 100% completion it just hung. Check out the syslog:
Jun 02 16:15:59 pve-1 pvedaemon[16091]: INFO: starting new backup job: vzdump 204 --compress lzo --mode snapshot --remove 0 --storage pve3backup --node pve-1
Jun 02 16:15:59 pve-1 pvedaemon[16091]: INFO: Starting Backup of VM 204 (qemu)
Jun 02 16:16:00 pve-1 qm[16094]: <root@pam> update VM 204: -lock backup
Jun 02 16:16:00 pve-1 systemd[1]: [/run/systemd/system/204.scope.d/50-Description.conf:3] Unknown lvalue 'ase' in section 'Unit'
Jun 02 16:16:00 pve-1 systemd[1]: Failed to reset devices.list on /system.slice: Invalid argument
Jun 02 16:16:00 pve-1 kernel: device tap204i0 entered promiscuous mode
Jun 02 16:16:00 pve-1 kernel: vmbr0: port 2(tap204i0) entered forwarding state
Jun 02 16:16:00 pve-1 kernel: vmbr0: port 2(tap204i0) entered forwarding state
Jun 02 16:17:01 pve-1 CRON[16301]: pam_unix(cron:session): session opened for user root by (uid=0)
Jun 02 16:17:01 pve-1 CRON[16302]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jun 02 16:17:01 pve-1 CRON[16301]: pam_unix(cron:session): session closed for user root
Jun 02 16:22:28 pve-1 pveproxy[15932]: worker exit
Jun 02 16:22:28 pve-1 pveproxy[10039]: worker 15932 finished
Jun 02 16:22:28 pve-1 pveproxy[10039]: starting 1 worker(s)
Jun 02 16:22:28 pve-1 pveproxy[10039]: worker 17064 started
Jun 02 16:23:48 pve-1 kernel: nfs: server 10.0.0.80 not responding, still trying
Jun 02 16:25:37 pve-1 pvedaemon[10030]: <root@pam> successful auth for user 'root@pam'
Jun 02 16:25:39 pve-1 kernel: nfs: server 10.0.0.80 OK
Jun 02 16:25:47 pve-1 pveproxy[10040]: worker exit
Jun 02 16:25:47 pve-1 pveproxy[10039]: worker 10040 finished
Jun 02 16:25:47 pve-1 pveproxy[10039]: starting 1 worker(s)
Jun 02 16:25:47 pve-1 pveproxy[10039]: worker 17529 started
Jun 02 16:33:05 pve-1 pveproxy[17064]: worker exit
Jun 02 16:33:05 pve-1 pveproxy[10039]: worker 17064 finished
Jun 02 16:33:05 pve-1 pveproxy[10039]: starting 1 worker(s)
Jun 02 16:33:05 pve-1 pveproxy[10039]: worker 18555 started
Jun 02 16:33:32 pve-1 rrdcached[9801]: flushing old values
Jun 02 16:33:32 pve-1 rrdcached[9801]: rotating journals
Jun 02 16:33:32 pve-1 rrdcached[9801]: started new journal /var/lib/rrdcached/journal/rrd.journal.1464903212.971187
Jun 02 16:33:33 pve-1 pmxcfs[9948]: [dcdb] notice: data verification successful
Jun 02 16:33:43 pve-1 kernel: nfs: server 10.0.0.80 not responding, still trying
Jun 02 16:35:15 pve-1 kernel: INFO: task task UPID:pve-1:16091 blocked for more than 120 seconds.
Jun 02 16:35:15 pve-1 kernel: Tainted: P O 4.4.6-1-pve #1
Jun 02 16:35:15 pve-1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 02 16:35:15 pve-1 kernel: task UPID:pve-1 D ffff883f8de7bd98 0 16091 10029 0x00000000
Jun 02 16:35:15 pve-1 kernel: ffff883f8de7bd98 ffff883fc37309b0 ffff881ff295c4c0 ffff883fad951b80
Jun 02 16:35:15 pve-1 kernel: ffff883f8de7c000 ffff883fc3730a5c ffff883fad951b80 00000000ffffffff
Jun 02 16:35:15 pve-1 kernel: ffff883fc3730a60 ffff883f8de7bdb0 ffffffff818435c5 ffff883fc3730a58
Jun 02 16:35:15 pve-1 kernel: Call Trace:
Jun 02 16:35:15 pve-1 kernel: [<ffffffff818435c5>] schedule+0x35/0x80
Jun 02 16:35:15 pve-1 kernel: [<ffffffff8184387e>] schedule_preempt_disabled+0xe/0x10
Jun 02 16:35:15 pve-1 kernel: [<ffffffff81845589>] __mutex_lock_slowpath+0xb9/0x130
Jun 02 16:35:15 pve-1 kernel: [<ffffffff8184561f>] mutex_lock+0x1f/0x30
Jun 02 16:35:15 pve-1 kernel: [<ffffffffc0a1d095>] nfs_file_fsync+0x45/0x130 [nfs]
Jun 02 16:35:15 pve-1 kernel: [<ffffffff8124084d>] vfs_fsync_range+0x3d/0xb0
Jun 02 16:35:15 pve-1 kernel: [<ffffffff812408dc>] vfs_fsync+0x1c/0x20
Jun 02 16:35:15 pve-1 kernel: [<ffffffffc0a1d596>] nfs_file_flush+0x46/0x60 [nfs]
Jun 02 16:35:15 pve-1 kernel: [<ffffffff8120959f>] filp_close+0x2f/0x70
Jun 02 16:35:15 pve-1 kernel: [<ffffffff8122ae23>] __close_fd+0xa3/0xc0
Jun 02 16:35:15 pve-1 kernel: [<ffffffff81209603>] SyS_close+0x23/0x50
Jun 02 16:35:15 pve-1 kernel: [<ffffffff818476f6>] entry_SYSCALL_64_fastpath+0x16/0x75
Jun 02 16:36:53 pve-1 kernel: nfs: server 10.0.0.80 OK
Jun 02 16:37:02 pve-1 kernel: zd64: p1 p2
Jun 02 16:37:02 pve-1 kernel: zd32: p1
Jun 02 16:37:02 pve-1 kernel: vmbr0: port 2(tap204i0) entered disabled state
Jun 02 16:37:04 pve-1 pvedaemon[16091]: INFO: Finished Backup of VM 204 (00:21:05)
Jun 02 16:37:04 pve-1 pvedaemon[16091]: INFO: Backup job finished successfully
Jun 02 16:39:52 pve-1 pvedaemon[10029]: worker exit
On a somewhat related ticket I opened with Proxmox, I complained that the job had failed, because that is what it looked like when I had to leave work yesterday. I had to leave right away, or else I would have killed the backup job, unlocked the VM, and rebooted the server. Luckily, I didn't do any of those things. When I got back to the office today I found that the job had finished. About fifteen minutes into the backup it reported 100% complete, and then it just sat there for a few minutes until I had to leave because everyone was going home. I went home with a heavy heart. If I had been able to wait another five minutes, I could have gone home with the feeling of a job well done!
Why is it doing this? I have highlighted some of the obviously backup-related messages to save you from having to sift through all the lines. I don't know what half of this stuff means, but I definitely don't like the sound of the word "Tainted" at Jun 02 16:35:15.
The fact that it completes at all is an improvement over the previous state. I could speculate that the SSD ZIL is fooling the NFS client into thinking that all of the backup's bits have been sent and therefore received by the NFS server, while in actuality the receiving server is still trying to get everything written to spindles. Eventually everything does land on spindle, the NFS server agrees to respond to the NFS client again, and the backup process can proceed with closing the file and unlocking the VM. It could be that if I forced synchronous writes the backup would complete without errors, but I bet it would take a long time. Something about the NFS server is not willing to trust that the bits have all been committed to stable storage. I believe that ZFS, by design, is supposed to tell the kernel that the bits actually HAVE been committed as soon as they hit the ZIL, but for some reason that isn't happening here.
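When I am back at the server I will check the relevant knobs on the receiving side. Roughly along these lines, where tank/pve3backup is only my guess at the name of the dataset behind the export, and assuming the NFS server is a Linux box:

    # On the NFS/ZFS server (10.0.0.80); "tank/pve3backup" is a stand-in dataset name.
    zfs get sync,logbias tank/pve3backup   # is sync set to standard, always, or disabled?
    zpool status -v                        # confirm the SSD log (ZIL) device is online and healthy
    exportfs -v                            # does the export advertise sync or async?

    # To try the forced-synchronous-writes experiment mentioned above:
    zfs set sync=always tank/pve3backup

    # Back on the Proxmox node, see how the share is actually mounted (vers=, wsize=, hard/soft):
    grep 10.0.0.80 /proc/mounts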
I will experiment further by backing up to a faster, SSD-based pool, but I have a feeling that I am still missing something. It seems like my zpool should be able to write a 91 GB backup file in less than 21 minutes; that works out to a rate of only about 74 MB/s, and I could go that fast with 1 Gb Ethernet! I would expect my zpool to handle 200 MB/s for a bulk copy even without an SSD ZIL. I will post what I find out.
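For that experiment I plan to watch the destination pool from the server side while the next backup runs, roughly like this (the pool name is again a stand-in):

    # Sanity check on the observed rate: 91 GiB in 21m05s (1265 s)
    #   91 * 1024 MiB / 1265 s  ~=  74 MiB/s
    # Per-vdev write throughput on the destination pool, refreshed every 5 seconds:
    zpool iostat -v tank 5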
Any guidance is appreciated.
Sincerely,
GB