I've never seen this issue before: I'm doing a restore from a local spinner to a local ZFS volume, and it has dragged the entire host node down to a crawl. I can ping the node, but I can't SSH into it. I've also been seeing some really slow ZFS read and write behavior in general. What used to restore in 5 minutes on ext4 is now taking 40-50 minutes, or longer. I should have more information once the restore is complete and I can get back into the machine, but right now it's killing me. The node and all of the VMs just show a '?' because the UI can't get any details. Thoughts?
What is your ZFS pool configuration? Under heavy load (depending on how slow the setup is) the pool can become almost unusable. To keep SSH working, put the server OS on a separate pool so it doesn't get stuck waiting on IO.
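For example, the usual way to gather that on a PVE node (just the standard commands, nothing specific to your setup):

zpool status               # pool layout and health
zpool list                 # size, fragmentation, capacity
zfs list                   # datasets/zvols and space usage
cat /etc/pve/storage.cfg   # how PVE maps its storages onto the pools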
Here's the output - am I doing something wrong? 36 minutes for ~100 gigs... Worse - it took the entire server down. This is restoring from a local spinner to a local SSD. Virtual Environment 5.2-5 Virtual Machine 100 (host6.x.com) on node 'mox2' Logs () restore vma archive: lzop -d -c /home/backups//dump/vzdump-qemu-100-2018_07_12-18_30_02.vma.lzo | vma extract -v -r /var/tmp/vzdumptmp1215315.fifo - /var/tmp/vzdumptmp1215315 CFG: size: 572 name: qemu-server.conf DEV: dev_id=1 size: 34359738368 devname: drive-virtio0 DEV: dev_id=2 size: 68719476736 devname: drive-virtio1 CTIME: Thu Jul 12 18:30:04 2018 new volume ID is 'Local-ZFS:vm-100-disk-1' map 'drive-virtio0' to '/dev/zvol/Local-ZFS/vm-100-disk-1' (write zeros = 0) new volume ID is 'Local-ZFS:vm-100-disk-2' map 'drive-virtio1' to '/dev/zvol/Local-ZFS/vm-100-disk-2' (write zeros = 0) progress 1% (read 1030815744 bytes, duration 6 sec) progress 2% (read 2061631488 bytes, duration 10 sec) progress 3% (read 3092381696 bytes, duration 15 sec) progress 4% (read 4123197440 bytes, duration 21 sec) progress 5% (read 5154013184 bytes, duration 26 sec) progress 6% (read 6184763392 bytes, duration 31 sec) progress 7% (read 7215579136 bytes, duration 35 sec) progress 8% (read 8246394880 bytes, duration 40 sec) progress 9% (read 9277145088 bytes, duration 47 sec) progress 10% (read 10307960832 bytes, duration 53 sec) progress 11% (read 11338776576 bytes, duration 59 sec) progress 12% (read 12369526784 bytes, duration 65 sec) progress 13% (read 13400342528 bytes, duration 71 sec) progress 14% (read 14431092736 bytes, duration 79 sec) progress 15% (read 15461908480 bytes, duration 85 sec) progress 16% (read 16492724224 bytes, duration 91 sec) progress 17% (read 17523474432 bytes, duration 97 sec) progress 18% (read 18554290176 bytes, duration 103 sec) progress 19% (read 19585105920 bytes, duration 109 sec) progress 20% (read 20615856128 bytes, duration 116 sec) progress 21% (read 21646671872 bytes, duration 123 sec) progress 22% (read 22677487616 bytes, duration 130 sec) progress 23% (read 23708237824 bytes, duration 136 sec) progress 24% (read 24739053568 bytes, duration 143 sec) progress 25% (read 25769803776 bytes, duration 169 sec) progress 26% (read 26800619520 bytes, duration 202 sec) progress 27% (read 27831435264 bytes, duration 235 sec) progress 28% (read 28862185472 bytes, duration 263 sec) progress 29% (read 29893001216 bytes, duration 291 sec) progress 30% (read 30923816960 bytes, duration 291 sec) progress 31% (read 31954567168 bytes, duration 291 sec) progress 32% (read 32985382912 bytes, duration 291 sec) progress 33% (read 34016198656 bytes, duration 291 sec) progress 34% (read 35046948864 bytes, duration 310 sec) progress 35% (read 36077764608 bytes, duration 356 sec) progress 36% (read 37108580352 bytes, duration 382 sec) progress 37% (read 38139330560 bytes, duration 423 sec) progress 38% (read 39170146304 bytes, duration 445 sec) progress 39% (read 40200896512 bytes, duration 474 sec) progress 40% (read 41231712256 bytes, duration 486 sec) progress 41% (read 42262528000 bytes, duration 521 sec) progress 42% (read 43293278208 bytes, duration 554 sec) progress 43% (read 44324093952 bytes, duration 583 sec) progress 44% (read 45354909696 bytes, duration 609 sec) progress 45% (read 46385659904 bytes, duration 653 sec) progress 46% (read 47416475648 bytes, duration 678 sec) progress 47% (read 48447291392 bytes, duration 703 sec) progress 48% (read 49478041600 bytes, duration 738 sec) progress 49% (read 50508857344 bytes, duration 761 
sec) progress 50% (read 51539607552 bytes, duration 792 sec) progress 51% (read 52570423296 bytes, duration 819 sec) progress 52% (read 53601239040 bytes, duration 868 sec) progress 53% (read 54631989248 bytes, duration 896 sec) progress 54% (read 55662804992 bytes, duration 922 sec) progress 55% (read 56693620736 bytes, duration 961 sec) progress 56% (read 57724370944 bytes, duration 986 sec) progress 57% (read 58755186688 bytes, duration 1022 sec) progress 58% (read 59786002432 bytes, duration 1046 sec) progress 59% (read 60816752640 bytes, duration 1072 sec) progress 60% (read 61847568384 bytes, duration 1106 sec) progress 61% (read 62878384128 bytes, duration 1129 sec) progress 62% (read 63909134336 bytes, duration 1165 sec) progress 63% (read 64939950080 bytes, duration 1187 sec) progress 64% (read 65970700288 bytes, duration 1213 sec) progress 65% (read 67001516032 bytes, duration 1250 sec) progress 66% (read 68032331776 bytes, duration 1276 sec) progress 67% (read 69063081984 bytes, duration 1302 sec) progress 68% (read 70093897728 bytes, duration 1325 sec) progress 69% (read 71124713472 bytes, duration 1364 sec) progress 70% (read 72155463680 bytes, duration 1376 sec) progress 71% (read 73186279424 bytes, duration 1405 sec) progress 72% (read 74217095168 bytes, duration 1469 sec) progress 73% (read 75247845376 bytes, duration 1479 sec) progress 74% (read 76278661120 bytes, duration 1512 sec) progress 75% (read 77309411328 bytes, duration 1543 sec) progress 76% (read 78340227072 bytes, duration 1575 sec) progress 77% (read 79371042816 bytes, duration 1596 sec) progress 78% (read 80401793024 bytes, duration 1629 sec) progress 79% (read 81432608768 bytes, duration 1651 sec) progress 80% (read 82463424512 bytes, duration 1689 sec) progress 81% (read 83494174720 bytes, duration 1717 sec) progress 82% (read 84524990464 bytes, duration 1743 sec) progress 83% (read 85555806208 bytes, duration 1778 sec) progress 84% (read 86586556416 bytes, duration 1805 sec) progress 85% (read 87617372160 bytes, duration 1836 sec) progress 86% (read 88648187904 bytes, duration 1859 sec) progress 87% (read 89678938112 bytes, duration 1882 sec) progress 88% (read 90709753856 bytes, duration 1914 sec) progress 89% (read 91740504064 bytes, duration 1938 sec) progress 90% (read 92771319808 bytes, duration 1959 sec) progress 91% (read 93802135552 bytes, duration 1988 sec) progress 92% (read 94832885760 bytes, duration 2022 sec) progress 93% (read 95863701504 bytes, duration 2044 sec) progress 94% (read 96894517248 bytes, duration 2075 sec) progress 95% (read 97925267456 bytes, duration 2101 sec) progress 96% (read 98956083200 bytes, duration 2124 sec) progress 97% (read 99986898944 bytes, duration 2151 sec) progress 98% (read 101017649152 bytes, duration 2151 sec) progress 99% (read 102048464896 bytes, duration 2162 sec) progress 100% (read 103079215104 bytes, duration 2162 sec) total bytes read 103079215104, sparse bytes 9391480832 (9.11%) space reduction due to 4K zero blocks 0.202% TASK OK
Hi Nemesiz - not sure if this is helpful or what you are looking for - and I appreciate you reaching out and trying to help! Just a single disk zpool RAID-0.

dir: local
        path /var/lib/vz
        content rootdir,images,vztmpl,iso
        maxfiles 0

dir: Local-Backups
        path /home/backups/
        content iso,images,rootdir,vztmpl,backup
        maxfiles 3

rbd: ILStore1
        content rootdir,images
        krbd 0
        monhost 172.16.0.46 172.16.0.47 172.16.0.48
        nodes mox3,mox0,mox1,mox2
        pool ILStore1
        username admin

zfspool: Local-ZFS
        pool Local-ZFS
        content rootdir,images
        nodes mox1,mox3,mox0,mox2
        sparse 1

dir: Local-SSD-ZFS
        disable
        path /Local-SSD-ZFS/storage
        content iso,vztmpl,images,rootdir
        nodes mox0
        shared 0

zfspool: rpool-SSD
        pool rpool
        content images,rootdir
        nodes mox0
        sparse 1

  pool: Local-ZFS
 state: ONLINE
  scan: scrub repaired 0B in 5h50m with 0 errors on Sun Jul 8 06:14:16 2018
config:
        NAME        STATE     READ WRITE CKSUM
        Local-ZFS   ONLINE       0     0     0
          sdb1      ONLINE       0     0     0
errors: No known data errors

  pool: rpool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done, the pool may no longer be accessible by software that does not support the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 0h4m with 0 errors on Sun Jul 8 00:28:28 2018
config:
        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          sda2      ONLINE       0     0     0
errors: No known data errors

NAME                      USED  AVAIL  REFER  MOUNTPOINT
Local-ZFS                 575G   324G   144K  /Local-ZFS
Local-ZFS/dump             96K   324G    96K  /Local-ZFS/dump
Local-ZFS/image            96K   324G    96K  /Local-ZFS/image
Local-ZFS/iso              96K   324G    96K  /Local-ZFS/iso
Local-ZFS/private          96K   324G    96K  /Local-ZFS/private
Local-ZFS/storage          96K   324G    96K  /Local-ZFS/storage
Local-ZFS/template         96K   324G    96K  /Local-ZFS/template
Local-ZFS/vm-100-disk-1   27.8G   324G  27.0G  -
Local-ZFS/vm-100-disk-2   61.0G   324G  61.0G  -
Local-ZFS/vm-109-disk-1   32.3G   324G  32.2G  -
Local-ZFS/vm-109-disk-2    406G   324G   404G  -
Local-ZFS/vm-110-disk-1   23.8G   324G  23.8G  -
Local-ZFS/vm-111-disk-1   23.8G   324G  23.8G  -
rpool                     38.8G   186G    96K  /rpool
rpool/ROOT                7.77G   186G    96K  /rpool/ROOT
rpool/ROOT/pve-1          7.77G   186G  7.77G  /
rpool/swap                30.8G   188G  29.1G  -

root@mox2:~# zpool status
  pool: Local-ZFS
 state: ONLINE
  scan: scrub repaired 0B in 5h50m with 0 errors on Sun Jul 8 06:14:16 2018
config:
        NAME        STATE     READ WRITE CKSUM
        Local-ZFS   ONLINE       0     0     0
          sdb1      ONLINE       0     0     0
errors: No known data errors

  pool: rpool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done, the pool may no longer be accessible by software that does not support the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 0h4m with 0 errors on Sun Jul 8 00:28:28 2018
config:
        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          sda2      ONLINE       0     0     0
errors: No known data errors

root@mox2:~# zpool iostat
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
Local-ZFS    573G   355G    518    103  12.0M  6.98M
rpool       37.0G   195G     31     44   212K   597K
----------  -----  -----  -----  -----  -----  -----

root@mox2:~# zpool list
NAME        SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
Local-ZFS   928G   573G   355G         -    39%    61%  1.00x  ONLINE  -
rpool       232G  37.0G   195G         -    53%    15%  1.00x  ONLINE  -
NAME PROPERTY VALUE SOURCE rpool type filesystem - rpool creation Thu Mar 3 10:11 2016 - rpool used 38.8G - rpool available 186G - rpool referenced 96K - rpool compressratio 1.14x - rpool mounted yes - rpool quota none default rpool reservation none default rpool recordsize 128K default rpool mountpoint /rpool default rpool sharenfs off default rpool checksum on default rpool compression lz4 local rpool atime off local rpool devices on default rpool exec on default rpool setuid on default rpool readonly off default rpool zoned off default rpool snapdir hidden default rpool aclinherit restricted default rpool createtxg 1 - rpool canmount on default rpool xattr on default rpool copies 1 default rpool version 5 - rpool utf8only off - rpool normalization none - rpool casesensitivity sensitive - rpool vscan off default rpool nbmand off default rpool sharesmb off default rpool refquota none default rpool refreservation none default rpool guid 9946285014865877177 - rpool primarycache all default rpool secondarycache all default rpool usedbysnapshots 0B - rpool usedbydataset 96K - rpool usedbychildren 38.8G - rpool usedbyrefreservation 0B - rpool logbias latency default rpool dedup off default rpool mlslabel none default rpool sync standard local rpool dnodesize legacy default rpool refcompressratio 1.00x - rpool written 96K - rpool logicalused 41.6G - rpool logicalreferenced 40K - rpool volmode default default rpool filesystem_limit none default rpool snapshot_limit none default rpool filesystem_count none default rpool snapshot_count none default rpool snapdev hidden default rpool acltype off default rpool context none default rpool fscontext none default rpool defcontext none default rpool rootcontext none default rpool relatime off default rpool redundant_metadata all default rpool overlay off default
And then: Local-ZFS root@mox2:~# zfs get all Local-ZFS NAME PROPERTY VALUE SOURCE Local-ZFS type filesystem - Local-ZFS creation Tue Jun 19 23:55 2018 - Local-ZFS used 573G - Local-ZFS available 326G - Local-ZFS referenced 144K - Local-ZFS compressratio 1.00x - Local-ZFS mounted yes - Local-ZFS quota none default Local-ZFS reservation none default Local-ZFS recordsize 128K default Local-ZFS mountpoint /Local-ZFS default Local-ZFS sharenfs off default Local-ZFS checksum on default Local-ZFS compression off default Local-ZFS atime on default Local-ZFS devices on default Local-ZFS exec on default Local-ZFS setuid on default Local-ZFS readonly off default Local-ZFS zoned off default Local-ZFS snapdir hidden default Local-ZFS aclinherit restricted default Local-ZFS createtxg 1 - Local-ZFS canmount on default Local-ZFS xattr on default Local-ZFS copies 1 default Local-ZFS version 5 - Local-ZFS utf8only off - Local-ZFS normalization none - Local-ZFS casesensitivity sensitive - Local-ZFS vscan off default Local-ZFS nbmand off default Local-ZFS sharesmb off default Local-ZFS refquota none default Local-ZFS refreservation none default Local-ZFS guid 6229891795844742391 - Local-ZFS primarycache all default Local-ZFS secondarycache all default Local-ZFS usedbysnapshots 0B - Local-ZFS usedbydataset 144K - Local-ZFS usedbychildren 573G - Local-ZFS usedbyrefreservation 0B - Local-ZFS logbias latency default Local-ZFS dedup off default Local-ZFS mlslabel none default Local-ZFS sync standard default Local-ZFS dnodesize legacy default Local-ZFS refcompressratio 1.00x - Local-ZFS written 144K - Local-ZFS logicalused 570G - Local-ZFS logicalreferenced 60.5K - Local-ZFS volmode default default Local-ZFS filesystem_limit none default Local-ZFS snapshot_limit none default Local-ZFS filesystem_count none default Local-ZFS snapshot_count none default Local-ZFS snapdev hidden default Local-ZFS acltype off default Local-ZFS context none default Local-ZFS fscontext none default Local-ZFS defcontext none default Local-ZFS rootcontext none default Local-ZFS relatime off default Local-ZFS redundant_metadata all default Local-ZFS overlay off default
I suggest you set sync=disabled to avoid the double write. A single disk is a single disk, and ZFS doesn't prioritize IO between processes. What you have to know:
1. The data goes like this: program -> ZFS write cache (not the ZIL) -> disk.
2. ZFS flushes data from the write cache to disk roughly every 5 seconds.
3. When the write cache is full while a flush to disk is still in progress, programs end up in IO wait (they freeze).
Maybe your SSD is consumer grade and can't handle that much.
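A minimal sketch of how that could be checked and applied from the host shell (pool name Local-ZFS taken from your output, adjust as needed):

# what the pool currently uses
zfs get sync Local-ZFS

# disable synchronous writes for the whole pool
zfs set sync=disabled Local-ZFS

# the ~5 sec flush mentioned above is the transaction group timeout (ZFS on Linux module parameter)
cat /sys/module/zfs/parameters/zfs_txg_timeout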
Thanks Nemesiz... They are Samsung 850 Pros - but there are better/stronger/faster drives out there. I read these threads yesterday: https://forum.proxmox.com/threads/zfs-sync-disabled.37900/ https://forum.proxmox.com/threads/p...-ssd-drives-sync-parameter.31130/#post-155543 So basically, on the PVE host nodes I should enter: zfs set sync=disabled ? And no changes to the VMs and their caching, correct? I typically use writethrough.
It takes effect immediately. If you set it on Local-ZFS, it will affect Local-ZFS/vm-100-disk-1 and so on. And you can set it individually for the sub-filesystems.
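Roughly how that looks in practice (dataset names come from the zfs list output above; the per-disk override is just an example):

# after setting sync=disabled on Local-ZFS, the child zvols inherit it;
# a single guest disk can still be overridden individually:
zfs set sync=standard Local-ZFS/vm-109-disk-2

# check what each dataset ended up with and where the value comes from
zfs get -r -o name,value,source sync Local-ZFS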
Thanks again for your help. I'm going to let things run like this for a day or two and do some testing tonight to see how things behave. Fingers crossed.
Nemesiz - this made all the difference in the world. One last question: I have some latency on a 15TB Ceph volume, which is also set to writethrough cache. I use this particular Ceph mount for backup storage, which is fairly static. Other than the 5 second lag for caching, are there any dangers to the data in changing the cache back to the default of NoCache? (3-node Ceph cluster w/ 3 separate monitor nodes; Ceph drives are spinners w/ SSDs for caching.)
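If I do change it, I assume it would just be something like this from the host (the VM ID and drive spec below are placeholders for whichever VM holds the Ceph disk; I'd check qm config first, since re-specifying a drive rewrites its whole option line):

# see the current drive line, including the cache= option
qm config 100 | grep virtio1

# switch that drive to the default cache mode, keeping the same volume
qm set 100 --virtio1 ILStore1:vm-100-disk-1,cache=none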