qcow2 VMs running on ZFS NFS are slow.

davec (New Member), Jun 8, 2015
I have a C7000 blade enclosure with NAS4FREE NFS storage...

The issue is that upgrades on my VMs are taking 35 minutes... but on my crappy T400 laptop, the same upgrade takes less than 3 minutes. Sigh. The one blade on the C7000 running the Proxmox host shows 0 iowait... the VM shows 95+% iowait during the upgrade (see images at the bottom of the post). (The host on the blade is configured with local SSD storage; the VM is on qcow2 NFS storage.)

NETWORK
1 Gbit network; I do NOT have jumbo frames enabled.

PROXMOX CLUSTER INFORMATION
Cluster of 3 Proxmox hosts. The /etc/hosts file on each node has been updated to include all three hosts; a sketch is below.
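For reference, the entries look something like this on each node (prox100 is real, the other host names and all of the addresses are placeholders):

# /etc/hosts (same three entries on every node)
192.168.2.100  prox100.home.priv  prox100
192.168.2.101  prox101.home.priv  prox101
192.168.2.102  prox102.home.priv  prox102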
root@prox100:~# pveversion
pve-manager/4.0-48/0d8559d0 (running kernel: 4.2.2-1-pve)
64 GB RAM
2x Intel X5660 processors

NAS INFORMATION
Version 9.1.0.1 - Sandstorm (revision 847)
Build date Sun Aug 18 03:49:41 CEST 2013
Platform OS FreeBSD 9.1-RELEASE-p5 (kern.osreldate: 901000)
Platform x64-embedded on Intel(R) Celeron(R) CPU G1610 @ 2.60GHz
Ram 8 GB
System ASRock Z77 Pro4
System bios American Megatrends Inc. version: P1.60 12/06/2012
System uptime 39 minute(s) 3 second(s)
NAS 23% of 16.2TB
Total: 16.2T | Used: 2.59T | Free: 8.06T | State: ONLINE
6x 3TB disks in RAIDZ2 (all six are TOSHIBA DT01ACA300)

ON LOCAL STORAGE
root@prox100:~# pveperf
CPU BOGOMIPS: 128002.44
REGEX/SECOND: 1206298
HD SIZE: 7.01 GB (/dev/dm-0)
BUFFERED READS: 169.73 MB/sec
AVERAGE SEEK TIME: 0.20 ms
FSYNCS/SECOND: 572.80
DNS EXT: 105.45 ms
DNS INT: 68.47 ms (home.priv)

ON NFS STORAGE
root@prox100:~# pveperf /mnt/pve/NAS-isos/
CPU BOGOMIPS: 128002.44
REGEX/SECOND: 1227257
HD SIZE: 8361.39 GB (192.168.2.253:/mnt/NAS/isos)
FSYNCS/SECOND: 118.36
DNS EXT: 83.72 ms
DNS INT: 84.45 ms (home.priv)
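For completeness, the NFS storage is defined in /etc/pve/storage.cfg roughly like this (the content types and the options line are just an example, not necessarily my exact config):

# /etc/pve/storage.cfg (excerpt, sketch)
nfs: NAS-isos
        server 192.168.2.253
        export /mnt/NAS/isos
        path /mnt/pve/NAS-isos
        content iso,images
        options vers=3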

NAS SETTINGS
nas4free: ~ # sysctl -a | grep nfsd
kern.features.nfsd: 1
vfs.nfsd.disable_checkutf8: 0
vfs.nfsd.server_max_nfsvers: 3
vfs.nfsd.server_min_nfsvers: 2
vfs.nfsd.nfs_privport: 0
vfs.nfsd.enable_locallocks: 0
vfs.nfsd.issue_delegations: 0
vfs.nfsd.commit_miss: 0
vfs.nfsd.commit_blks: 0
vfs.nfsd.mirrormnt: 1
vfs.nfsd.minthreads: 48
vfs.nfsd.maxthreads: 48
vfs.nfsd.threads: 48
vfs.nfsd.request_space_used: 0
vfs.nfsd.request_space_used_highest: 1291212
vfs.nfsd.request_space_high: 13107200
vfs.nfsd.request_space_low: 8738133
vfs.nfsd.request_space_throttled: 0
vfs.nfsd.request_space_throttle_count: 0
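To rule out a mount-option problem on the client side, the actual NFS mount parameters on the Proxmox host can be checked like this (a generic check, nothing special configured):

# Show rsize/wsize, NFS version and protocol for each NFS mount
root@prox100:~# nfsstat -m
# or simply
root@prox100:~# mount | grep /mnt/pve/NAS-isos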

NAME STATE READ WRITE CKSUM
NAS ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
ada0.nop ONLINE 0 0 0
ada1.nop ONLINE 0 0 0
ada2.nop ONLINE 0 0 0
ada3.nop ONLINE 0 0 0
ada4.nop ONLINE 0 0 0
ada5.nop ONLINE 0 0 0

errors: No known data errors


FWIW, VMware ESXi seemed fine when I was running it... nothing this slow...

I've read documentation until my eyes are sore...

1. I want to use qcow2 because of the snapshots.
2. The VMs' disks are set to virtio.
3. The network is set to virtio as well.
4. Processors are set to "host" with NUMA enabled (a rough sketch of such a VM config is shown after this list).
5. I need to use the NAS for several reasons, but mostly because buying disks for all the blades would be insanely expensive, and having the same datastore for the whole cluster is a must.
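To illustrate points 2-4, a VM config ends up looking roughly like this (the VMID, the storage name "NAS-vms", the MAC address and the sizes are placeholders, not my exact file):

# /etc/pve/qemu-server/100.conf (sketch)
cores: 4
sockets: 2
cpu: host
numa: 1
memory: 4096
ostype: l26
net0: virtio=DE:AD:BE:EF:00:01,bridge=vmbr0
virtio0: NAS-vms:100/vm-100-disk-1.qcow2,size=32G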

I am open to suggestions. I REALLY want Proxmox to work... I do NOT want to go back to VMware...

Thanks in advance.

Image of high iowait on the VM
VM-high-io-wait.png

Image of no iowait on the host
HOST-no-waitime.png
 
Hrm... after doing even MORE searching, I've come across this post: http://ninjix.blogspot.com/2011/02/get-those-fsync-numbers-up-on-your-zfs.html ... I think I need to add an SSD for logging... thoughts?
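If I understand that post correctly, a quick way to confirm that synchronous writes (the ZIL) are the bottleneck, before buying anything, would be to check and briefly flip the sync property on the pool. This risks data loss on a power failure, so only as a short test:

nas4free: ~ # zfs get sync NAS
# sync=standard means every NFS commit is a synchronous write to the raidz2 vdev
nas4free: ~ # zfs set sync=disabled NAS      # TEST ONLY - unsafe for real data
root@prox100:~# pveperf /mnt/pve/NAS-isos/   # FSYNCS/SECOND should jump if the ZIL is the limit
nas4free: ~ # zfs set sync=standard NAS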

nas4free: ~ # zpool status NAS
pool: NAS
state: ONLINE
scan: scrub in progress since Sun Jan 31 21:17:47 2016
606G scanned out of 3.90T at 180M/s, 5h21m to go
0 repaired, 15.18% done
config:

NAME STATE READ WRITE CKSUM
NAS ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
ada0.nop ONLINE 0 0 0
ada1.nop ONLINE 0 0 0
ada2.nop ONLINE 0 0 0
ada3.nop ONLINE 0 0 0
ada4.nop ONLINE 0 0 0
ada5.nop ONLINE 0 0 0

errors: No known data errors
 
Without a dedicated log SSD, I don't think you can reach very fast write speeds, because:

1. You use RAIDZ2, so parity has to be calculated for 2 drives.
2. Without a dedicated log device, data is written to the ZFS storage twice (once to the ZIL on the pool disks, then again to its final location).
3. qcow2 is a copy-on-write format (on top of ZFS, which is also copy-on-write), so you get double writes on the qcow2 side too.

So you'll end up with roughly the speed of 4 drives / 4, around 150 IOPS (a single 7200 RPM disk handles on the order of 100-150 random IOPS, and a RAIDZ2 vdev delivers about one disk's worth of IOPS for small synchronous writes).
 
Thanks for your thoughts. I will look into getting an SSD for logging; hopefully that will at least improve things. ;)
 
Ok, I have some SSDs ordered for the NAS to add as log and cache devices, and am going to double the RAM. I will post back once that is installed.

I did clone my template to raw format, and the speed increase was considerable. Since I'm already on ZFS with the VM living on the NAS, and the NAS doing nightly snapshots... I might be OK giving up qcow2 snapshots in exchange for raw's speed. To be continued...
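For anyone curious, the clone/convert itself is nothing fancy; either a full clone straight to raw, or an offline qemu-img convert (the VMIDs, the storage name "NAS-vms" and the paths below are placeholders):

# Full clone of a template directly in raw format
root@prox100:~# qm clone 900 101 --full --format raw --storage NAS-vms
# Or convert an existing qcow2 disk offline (with the VM stopped)
root@prox100:~# qemu-img convert -p -f qcow2 -O raw \
        /mnt/pve/NAS-vms/images/100/vm-100-disk-1.qcow2 \
        /mnt/pve/NAS-vms/images/100/vm-100-disk-1.raw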
 
Ok, small update. I haven't got the cache or log drives installed yet, but I did increase the RAM yesterday, going from 8 GB to 32 GB. This DID make a difference. Before, when upgrading a VM, I would often see lines like "INFO: task jbd2/dm-0-8:387 blocked for more than 120 seconds." and the VM might take 35 minutes to upgrade. Now they take 6-9 minutes to upgrade, with no errors. This is a HUGE improvement. It still doesn't feel as fast as it should be, and pveperf shows no noticeable change when run, but I'm hoping the log drive will help with that. Again... to be continued (as my parts arrive, lol).
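My assumption is that the extra RAM mostly went to the ZFS ARC (read cache) on the NAS; that can be sanity-checked on the NAS4FREE box with something like:

# Current ARC size and hit/miss counters
nas4free: ~ # sysctl kstat.zfs.misc.arcstats.size
nas4free: ~ # sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses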
 
Ok, update... FREAKING AMAZING RESULTS!!

I finally got in my SATA connectors and my SSDs.

Added a 60 GB Kingston SSD for log
Added a 256 GB Samsung SSD for cache
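For reference, adding log and cache devices boils down to two zpool commands (the ada6/ada7 device names are just placeholders for whatever the SSDs show up as; NAS4FREE can also do this from the GUI):

nas4free: ~ # zpool add NAS log ada6
nas4free: ~ # zpool add NAS cache ada7
nas4free: ~ # zpool status NAS       # the log and cache vdevs should now be listed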

Now... I have the following to report from pveperf:

root@prox1:~# pveperf /mnt/pve/NAS-isos/
CPU BOGOMIPS: 140799.12
REGEX/SECOND: 1327792
HD SIZE: 8470.64 GB (192.168.2.253:/mnt/NAS/isos)
FSYNCS/SECOND: 1347.73
DNS EXT: 84.19 ms
DNS INT: 61.36 ms

for reference it WAS:
ON NFS STORAGE
root@prox100:~# pveperf /mnt/pve/NAS-isos/
CPU BOGOMIPS: 128002.44
REGEX/SECOND: 1227257
HD SIZE: 8361.39 GB (192.168.2.253:/mnt/NAS/isos)
FSYNCS/SECOND: 118.36
DNS EXT: 83.72 ms
DNS INT: 84.45 ms

So, my FSYNCS/SECOND went from 118.36 to a whopping 1347.73... and did it ever make a difference.

Those updates that were taking 30-40 minutes I had already cut down to 6-10 minutes by quadrupling the RAM, going from 8 GB to 32 GB... but the kicker is that after adding the cache and log drives, those updates take less than 2 minutes (I have a very fast download link, but downloading the packages was probably a quarter of that time or a bit more)... blazing-fast upgrades and response times now.

Just thought I'd put it out there... even with qcow2, with enough hardware, a ZFS NAS can work fine... and boy, is it worth it!!! :) Hope this helps someone else.

Edit: Here is what my NAS consists of, in case anyone wants to know. FWIW, there are several things I will change when building my next NAS... use ECC RAM, use a chassis with redundant power supplies, more disks, more RAM, more SSDs... :)

NAS Software: NAS4FREE (x64 embedded)
Motherboard: ASRock Z77 Pro4
Processor: Intel Celeron CPU G1610 @ 2.60GHz
POWER: 800W ATX12V / EPS12V SLI Ready CrossFire Ready
RAM: 32 GB (4x 8 GB) DDR3 1600 SDRAM (PC3 12800)
NETWORK: Intel EXPI9301CTBLK Network Adapter 10/100/1000Mbps PCI-Express
HARD DRIVES: 6x Toshiba 3TB 7200 RPM 64 MB cache SATA (ZFS RAIDZ2)
SSD: Samsung 840 Pro 256 GB SATA III (cache)
SSD: Kingston V300 60 GB (log)
 
