Hi,
I have a PVE box set up with two ZFS pools. Yesterday I moved one VM disk from pool ONE to pool TWO and it took far too long. Any advice on how to get better performance?

Here are the two pools:
Bash:
root@pve:~# zpool status -v ONE_Pool
  pool: ONE_Pool
 state: ONLINE
  scan: scrub in progress since Tue Nov 29 11:48:09 2022
        194G scanned at 6.91G/s, 2.67M issued at 97.7K/s, 948G total
        0B repaired, 0.00% done, no estimated completion time
config:

        NAME                                                  STATE     READ WRITE CKSUM
        ONE_Pool                                              ONLINE       0     0     0
          raidz1-0                                            ONLINE       0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6HJPTTJ          ONLINE       0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6HJPPKV          ONLINE       0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N4AY4UFU          ONLINE       0     0     0
        logs
          ata-KINGSTON_SA400S37240G_50026B7782F57CEF-part1    ONLINE       0     0     0
        cache
          ata-KINGSTON_SA400S37240G_50026B7782F57CEF-part2    ONLINE       0     0     0

errors: No known data errors

root@pve:~# zpool status -v TWO_Pool
  pool: TWO_Pool
 state: ONLINE
  scan: scrub repaired 0B in 01:54:32 with 0 errors on Sun Nov 13 02:18:35 2022
config:

        NAME                        STATE     READ WRITE CKSUM
        TWO_Pool                    ONLINE       0     0     0
          raidz1-0                  ONLINE       0     0     0
            scsi-35000c500565d8e37  ONLINE       0     0     0
            scsi-35000c500565daf43  ONLINE       0     0     0
            scsi-35000c500565ddb63  ONLINE       0     0     0

errors: No known data errors
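In case it helps diagnosis, these are the extra details I can post about the pools. This is just a sketch of the commands; the zvol name ONE_Pool/vm-100-disk-0 is a placeholder for the disk that was actually moved:

```shell
# Pool size, fill level and fragmentation (frag/cap are zpool list columns)
zpool list -o name,size,alloc,frag,cap ONE_Pool TWO_Pool

# Sector-size alignment of each pool
zpool get ashift ONE_Pool TWO_Pool

# Block size of the moved zvol (placeholder name, adjust to the real disk)
zfs get volblocksize ONE_Pool/vm-100-disk-0
```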
Bash:
root@pve:~# arc_summary
------------------------------------------------------------------------
ZFS Subsystem Report Tue Nov 29 12:05:18 2022
Linux 5.15.74-1-pve 2.1.6-pve1
Machine: pve (x86_64) 2.1.6-pve1
ARC status: HEALTHY
Memory throttle count: 0
ARC size (current): 100.0 % 39.3 GiB
Target size (adaptive): 100.0 % 39.3 GiB
Min size (hard limit): 6.2 % 2.5 GiB
Max size (high water): 16:1 39.3 GiB
Most Frequently Used (MFU) cache size: 13.7 % 5.1 GiB
Most Recently Used (MRU) cache size: 86.3 % 32.1 GiB
Metadata cache size (hard limit): 75.0 % 29.4 GiB
Metadata cache size (current): 15.8 % 4.7 GiB
Dnode cache size (hard limit): 10.0 % 2.9 GiB
Dnode cache size (current): 0.5 % 13.6 MiB
ARC hash breakdown:
Elements max: 30.7M
Elements current: 25.2 % 7.7M
Collisions: 1.6G
Chain max: 14
Chains: 1.3M
ARC misc:
Deleted: 1.5G
Mutex misses: 198.4k
Eviction skips: 30.6k
Eviction skips due to L2 writes: 2.5M
L2 cached evictions: 2.9 TiB
L2 eligible evictions: 6.0 TiB
L2 eligible MFU evictions: 9.7 % 597.9 GiB
L2 eligible MRU evictions: 90.3 % 5.4 TiB
L2 ineligible evictions: 3.0 TiB
ARC total accesses (hits + misses): 3.3G
Cache hit ratio: 61.5 % 2.0G
Cache miss ratio: 38.5 % 1.3G
Actual hit ratio (MFU + MRU hits): 61.2 % 2.0G
Data demand efficiency: 23.5 % 1.2G
Data prefetch efficiency: 2.5 % 338.7M
Cache hits by cache type:
Most frequently used (MFU): 78.0 % 1.6G
Most recently used (MRU): 21.7 % 437.1M
Most frequently used (MFU) ghost: 0.3 % 5.9M
Most recently used (MRU) ghost: 0.5 % 10.3M
Cache hits by data type:
Demand data: 14.2 % 287.1M
Demand prefetch data: 0.4 % 8.5M
Demand metadata: 85.3 % 1.7G
Demand prefetch metadata: < 0.1 % 566.8k
Cache misses by data type:
Demand data: 73.7 % 933.1M
Demand prefetch data: 26.1 % 330.3M
Demand metadata: 0.1 % 1.7M
Demand prefetch metadata: 0.1 % 635.8k
DMU prefetch efficiency: 46.7M
Hit ratio: 16.4 % 7.7M
Miss ratio: 83.6 % 39.1M
L2ARC status: DEGRADED
Low memory aborts: 4
Free on write: 20.8M
R/W clashes: 246
Bad checksums: 21
I/O errors: 0
L2ARC size (adaptive): 20.3 GiB
Compressed: 89.2 % 18.1 GiB
Header size: 0.6 % 132.9 MiB
MFU allocated size: 19.8 % 3.6 GiB
MRU allocated size: 80.0 % 14.5 GiB
Prefetch allocated size: 0.2 % 34.3 MiB
Data (buffer content) allocated size: 97.8 % 17.7 GiB
Metadata (buffer content) allocated size: 2.2 % 417.5 MiB
L2ARC breakdown: 816.9M
Hit ratio: 2.8 % 23.3M
Miss ratio: 97.2 % 793.6M
Feeds: 882.8k
L2ARC writes:
Writes sent: 100 % 443.3k
L2ARC evicts:
Lock retries: 21.0k
Upon reading: 172
Solaris Porting Layer (SPL):
spl_hostid 0
spl_hostid_path /etc/hostid
spl_kmem_alloc_max 1048576
spl_kmem_alloc_warn 65536
spl_kmem_cache_kmem_threads 4
spl_kmem_cache_magazine_size 0
spl_kmem_cache_max_size 32
spl_kmem_cache_obj_per_slab 8
spl_kmem_cache_reclaim 0
spl_kmem_cache_slab_limit 16384
spl_max_show_tasks 512
spl_panic_halt 0
spl_schedule_hrtimeout_slack_us 0
spl_taskq_kick 0
spl_taskq_thread_bind 0
spl_taskq_thread_dynamic 1
spl_taskq_thread_priority 1
spl_taskq_thread_sequential 4
Here is the transfer log for that move:
Code:
drive-scsi2: transferred 2.0 TiB of 2.0 TiB (100.00%) in 15h 22m 23s, ready
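For reference, that averages out to under 40 MiB/s. A quick sanity check of the number (assuming binary TiB):

```shell
# 2.0 TiB moved in 15h 22m 23s -> average throughput in MiB/s
secs=$(( 15*3600 + 22*60 + 23 ))     # 55343 seconds
bytes=$(( 2 * 1024**4 ))             # 2.0 TiB in bytes
awk -v b="$bytes" -v s="$secs" 'BEGIN { printf "%.1f MiB/s\n", b / s / 1048576 }'
```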