Huge IO delay with ZFS

coldzeek

Member
Feb 4, 2020
Hi,

I recently migrated to Proxmox 6.1-5 from ESXi and have a problem with IO delay.
I have three ZFS pools:
datastore0: 2x NVMe Silicon Motion SM2263XT - mirror (ashift=12, atime=off), 1 TB
datastore1: 7x HDD Toshiba HDWD130 - raidz1 (ashift=12, compression=lz4, atime=off), 19 TB
datastore2: 3x HDD Toshiba DT01ACA200 - raidz1 (ashift=12, atime=off), 5.3 TB

CPU: AMD EPYC 7551
RAM: 128 GB ECC
Swap: disabled (swapoff)
options zfs zfs_arc_max=25179869184
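For reference, on Proxmox this ARC limit is usually set in a modprobe.d snippet; the file name below is only an example, and if the root filesystem is on ZFS the initramfs has to be refreshed for the change to survive a reboot:

Code:
# /etc/modprobe.d/zfs.conf (example file name)
options zfs zfs_arc_max=25179869184

# refresh the initramfs so the option is applied at boot (needed when root is on ZFS)
update-initramfs -u

# or apply it immediately at runtime
echo 25179869184 > /sys/module/zfs/parameters/zfs_arc_max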

Problem:
If I move a KVM or LXC volume from datastore1 to datastore2, I get IO delay of around 30%-35%.
During that time, VMs on datastore0 respond very slowly.
The server is not overcommitted; average load is ~10%.
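One way to see which pool and which disks are actually saturated while such a move is running is to watch per-vdev activity, for example:

Code:
# per-pool and per-vdev bandwidth/IOPS, refreshed every 5 seconds
zpool iostat -v 5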
 
Hi,

what controller do you use to connect these drives?
Generally, only HBAs are recommended for Ceph and ZFS.
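A quick way to check which controller the disks hang off (assuming a standard PCI setup) is, for example:

Code:
lspci -nn | grep -iE 'sata|sas|raid|nvme'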
 
Hi,

Can you show the output of your:

arc_summary

Anyway, it is not recommended to use many pools on the same host. Also, your ARC size seems to be low. And not least, if you have a VM or CT with high IO, any move will result in high IO on whatever storage you use!
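The current ARC size versus its limits can be checked quickly, for example with:

Code:
# current size, adaptive target and hard limit, in bytes
grep -E '^(size|c|c_max) ' /proc/spl/kstat/zfs/arcstats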

Good luck / Bafta !
 
Thank you for your answer.

ZFS Subsystem Report Thu Feb 06 08:29:06 2020
Linux 5.3.13-3-pve 0.8.3-pve1
Machine: pve0 (x86_64) 0.8.3-pve1

ARC status: HEALTHY
Memory throttle count: 0

ARC size (current): 23.6 % 15.1 GiB
Target size (adaptive): 23.6 % 15.1 GiB
Min size (hard limit): 6.1 % 3.9 GiB
Max size (high water): 16:1 64.0 GiB
Most Frequently Used (MFU) cache size: 16.7 % 437.6 MiB
Most Recently Used (MRU) cache size: 83.3 % 2.1 GiB
Metadata cache size (hard limit): 75.0 % 48.0 GiB
Metadata cache size (current): 17.2 % 8.2 GiB
Dnode cache size (hard limit): 10.0 % 4.8 GiB
Dnode cache size (current): 3.1 % 153.0 MiB

ARC hash breakdown:
Elements max: 80.6M
Elements current: 100.0 % 80.6M
Collisions: 183.9M
Chain max: 20
Chains: 16.0M

ARC misc:
Deleted: 116.7M
Mutex misses: 5.9k
Eviction skips: 14.1M

ARC total accesses (hits + misses): 391.7M
Cache hit ratio: 94.9 % 371.6M
Cache miss ratio: 5.1 % 20.1M
Actual hit ratio (MFU + MRU hits): 94.4 % 369.7M
Data demand efficiency: 91.5 % 121.6M
Data prefetch efficiency: 24.8 % 11.9M

Cache hits by cache type:
Most frequently used (MFU): 97.2 % 361.2M
Most recently used (MRU): 2.3 % 8.4M
Most frequently used (MFU) ghost: 0.4 % 1.3M
Most recently used (MRU) ghost: 0.3 % 948.3k

Cache hits by data type:
Demand data: 29.9 % 111.3M
Demand prefetch data: 0.8 % 3.0M
Demand metadata: 69.0 % 256.5M
Demand prefetch metadata: 0.2 % 841.4k

Cache misses by data type:
Demand data: 51.6 % 10.4M
Demand prefetch data: 44.5 % 9.0M
Demand metadata: 2.7 % 541.5k
Demand prefetch metadata: 1.2 % 241.9k

DMU prefetch efficiency: 370.0M
Hit ratio: 9.6 % 35.7M
Miss ratio: 90.4 % 334.3M

L2ARC status: HEALTHY
Low memory aborts: 146
Free on write: 71.9k
R/W clashes: 9
Bad checksums: 0
I/O errors: 0

L2ARC size (adaptive): 650.4 GiB
Compressed: 96.0 % 624.4 GiB
Header size: 1.1 % 7.1 GiB

L2ARC breakdown: 20.1M
Hit ratio: 5.9 % 1.2M
Miss ratio: 94.1 % 18.9M
Feeds: 34.6k

L2ARC writes:
Writes sent: 100 % 33.5 KiB

L2ARC evicts:
Lock retries: 0
Upon reading: 0

Solaris Porting Layer (SPL):
spl_hostid 0
spl_hostid_path /etc/hostid
spl_kmem_alloc_max 1048576
spl_kmem_alloc_warn 65536
spl_kmem_cache_expire 2
spl_kmem_cache_kmem_limit 2048
spl_kmem_cache_kmem_threads 4
spl_kmem_cache_magazine_size 0
spl_kmem_cache_max_size 32
spl_kmem_cache_obj_per_slab 8
spl_kmem_cache_obj_per_slab_min 1
spl_kmem_cache_reclaim 0
spl_kmem_cache_slab_limit 16384
spl_max_show_tasks 512
spl_panic_halt 0
spl_schedule_hrtimeout_slack_us 0
spl_taskq_kick 0
spl_taskq_thread_bind 0
spl_taskq_thread_dynamic 1
spl_taskq_thread_priority 1
spl_taskq_thread_sequential 4
 
Tunables:
dbuf_cache_hiwater_pct 10
dbuf_cache_lowater_pct 10
dbuf_cache_max_bytes 2111605056
dbuf_cache_shift 5
dbuf_metadata_cache_max_bytes 1055802528
dbuf_metadata_cache_shift 6
dmu_object_alloc_chunk_shift 7
dmu_prefetch_max 134217728
ignore_hole_birth 1
l2arc_feed_again 1
l2arc_feed_min_ms 50
l2arc_feed_secs 1
l2arc_headroom 2
l2arc_headroom_boost 200
l2arc_noprefetch 1
l2arc_norw 0
l2arc_write_boost 100000000
l2arc_write_max 100000000
metaslab_aliquot 524288
metaslab_bias_enabled 1
metaslab_debug_load 0
metaslab_debug_unload 0
metaslab_df_max_search 16777216
metaslab_df_use_largest_segment 0
metaslab_force_ganging 16777217
metaslab_fragmentation_factor_enabled 1
metaslab_lba_weighting_enabled 1
metaslab_preload_enabled 1
send_holes_without_birth_time 1
spa_asize_inflation 24
spa_config_path /etc/zfs/zpool.cache
spa_load_print_vdev_tree 0
spa_load_verify_data 1
spa_load_verify_metadata 1
spa_load_verify_shift 4
spa_slop_shift 5
vdev_removal_max_span 32768
vdev_validate_skip 0
zap_iterate_prefetch 1
zfetch_array_rd_sz 1048576
zfetch_max_distance 8388608
zfetch_max_streams 8
zfetch_min_sec_reap 2
zfs_abd_scatter_enabled 1
zfs_abd_scatter_max_order 10
zfs_abd_scatter_min_size 1536
zfs_admin_snapshot 0
zfs_arc_average_blocksize 8192
zfs_arc_dnode_limit 0
zfs_arc_dnode_limit_percent 10
zfs_arc_dnode_reduce_percent 10
zfs_arc_grow_retry 0
zfs_arc_lotsfree_percent 10
zfs_arc_max 68719476736
zfs_arc_meta_adjust_restarts 4096
zfs_arc_meta_limit 0
zfs_arc_meta_limit_percent 75
zfs_arc_meta_min 0
zfs_arc_meta_prune 10000
zfs_arc_meta_strategy 1
zfs_arc_min 0
zfs_arc_min_prefetch_ms 0
zfs_arc_min_prescient_prefetch_ms 0
zfs_arc_p_dampener_disable 1
zfs_arc_p_min_shift 0
zfs_arc_pc_percent 0
zfs_arc_shrink_shift 0
zfs_arc_sys_free 0
zfs_async_block_max_blocks 100000
zfs_autoimport_disable 1
zfs_checksum_events_per_second 20
zfs_commit_timeout_pct 5
zfs_compressed_arc_enabled 1
zfs_condense_indirect_commit_entry_delay_ms 0
zfs_condense_indirect_vdevs_enable 1
zfs_condense_max_obsolete_bytes 1073741824
zfs_condense_min_mapping_bytes 131072
zfs_dbgmsg_enable 1
zfs_dbgmsg_maxsize 4194304
zfs_dbuf_state_index 0
zfs_ddt_data_is_special 1
zfs_deadman_checktime_ms 60000
zfs_deadman_enabled 1
zfs_deadman_failmode wait
zfs_deadman_synctime_ms 600000
zfs_deadman_ziotime_ms 300000
zfs_dedup_prefetch 0
zfs_delay_min_dirty_percent 60
zfs_delay_scale 500000
zfs_delete_blocks 20480
zfs_dirty_data_max 4294967296
zfs_dirty_data_max_max 4294967296
zfs_dirty_data_max_max_percent 25
zfs_dirty_data_max_percent 10
zfs_dirty_data_sync_percent 20
zfs_disable_ivset_guid_check 0
zfs_dmu_offset_next_sync 0
zfs_expire_snapshot 300
zfs_flags 0
zfs_free_bpobj_enabled 1
zfs_free_leak_on_eio 0
zfs_free_min_time_ms 1000
zfs_immediate_write_sz 262144
zfs_initialize_value 16045690984833335022
zfs_key_max_salt_uses 400000000
zfs_lua_max_instrlimit 100000000
zfs_lua_max_memlimit 104857600
zfs_max_missing_tvds 0
zfs_max_recordsize 1048576
zfs_metaslab_fragmentation_threshold 70
zfs_metaslab_segment_weight_enabled 1
zfs_metaslab_switch_threshold 2
zfs_mg_fragmentation_threshold 95
zfs_mg_noalloc_threshold 0
zfs_multihost_fail_intervals 10
zfs_multihost_history 0
zfs_multihost_import_intervals 20
zfs_multihost_interval 1000
zfs_multilist_num_sublists 0
zfs_no_scrub_io 0
zfs_no_scrub_prefetch 0
zfs_nocacheflush 0
zfs_nopwrite_enabled 1
zfs_object_mutex_size 64
zfs_obsolete_min_time_ms 500
 
zfs_override_estimate_recordsize 0
zfs_pd_bytes_max 52428800
zfs_per_txg_dirty_frees_percent 5
zfs_prefetch_disable 0
zfs_read_chunk_size 1048576
zfs_read_history 0
zfs_read_history_hits 0
zfs_reconstruct_indirect_combinations_max 4096
zfs_recover 0
zfs_recv_queue_length 16777216
zfs_removal_ignore_errors 0
zfs_removal_suspend_progress 0
zfs_remove_max_segment 16777216
zfs_resilver_disable_defer 0
zfs_resilver_min_time_ms 3000
zfs_scan_checkpoint_intval 7200
zfs_scan_fill_weight 3
zfs_scan_ignore_errors 0
zfs_scan_issue_strategy 0
zfs_scan_legacy 0
zfs_scan_max_ext_gap 2097152
zfs_scan_mem_lim_fact 20
zfs_scan_mem_lim_soft_fact 20
zfs_scan_strict_mem_lim 0
zfs_scan_suspend_progress 0
zfs_scan_vdev_limit 4194304
zfs_scrub_min_time_ms 1000
zfs_send_corrupt_data 0
zfs_send_queue_length 16777216
zfs_send_unmodified_spill_blocks 1
zfs_slow_io_events_per_second 20
zfs_spa_discard_memory_limit 16777216
zfs_special_class_metadata_reserve_pct 25
zfs_sync_pass_deferred_free 2
zfs_sync_pass_dont_compress 8
zfs_sync_pass_rewrite 2
zfs_sync_taskq_batch_pct 75
zfs_trim_extent_bytes_max 134217728
zfs_trim_extent_bytes_min 32768
zfs_trim_metaslab_skip 0
zfs_trim_queue_limit 10
zfs_trim_txg_batch 32
zfs_txg_history 100
zfs_txg_timeout 1
zfs_unlink_suspend_progress 0
zfs_user_indirect_is_special 1
zfs_vdev_aggregate_trim 0
zfs_vdev_aggregation_limit 1048576
zfs_vdev_aggregation_limit_non_rotating 131072
zfs_vdev_async_read_max_active 3
zfs_vdev_async_read_min_active 1
zfs_vdev_async_write_active_max_dirty_percent 60
zfs_vdev_async_write_active_min_dirty_percent 30
zfs_vdev_async_write_max_active 10
zfs_vdev_async_write_min_active 2
zfs_vdev_cache_bshift 16
zfs_vdev_cache_max 16384
zfs_vdev_cache_size 0
zfs_vdev_default_ms_count 200
zfs_vdev_initializing_max_active 1
zfs_vdev_initializing_min_active 1
zfs_vdev_max_active 1000
zfs_vdev_min_ms_count 16
zfs_vdev_mirror_non_rotating_inc 0
zfs_vdev_mirror_non_rotating_seek_inc 1
zfs_vdev_mirror_rotating_inc 0
zfs_vdev_mirror_rotating_seek_inc 5
zfs_vdev_mirror_rotating_seek_offset 1048576
zfs_vdev_ms_count_limit 131072
zfs_vdev_queue_depth_pct 1000
zfs_vdev_raidz_impl cycle [fastest] original scalar sse2 ssse3 avx2
zfs_vdev_read_gap_limit 32768
zfs_vdev_removal_max_active 2
zfs_vdev_removal_min_active 1
zfs_vdev_scheduler unused
zfs_vdev_scrub_max_active 2
zfs_vdev_scrub_min_active 1
zfs_vdev_sync_read_max_active 10
zfs_vdev_sync_read_min_active 10
zfs_vdev_sync_write_max_active 10
zfs_vdev_sync_write_min_active 10
zfs_vdev_trim_max_active 2
zfs_vdev_trim_min_active 1
zfs_vdev_write_gap_limit 4096
zfs_zevent_cols 80
zfs_zevent_console 0
zfs_zevent_len_max 1024
zfs_zil_clean_taskq_maxalloc 1048576
zfs_zil_clean_taskq_minalloc 1024
zfs_zil_clean_taskq_nthr_pct 100
zil_maxblocksize 131072
zil_nocacheflush 0
zil_replay_disable 0
zil_slog_bulk 786432
zio_deadman_log_all 0
zio_dva_throttle_enabled 1
zio_requeue_io_start_cut_in_line 1
zio_slow_io_ms 30000
zio_taskq_batch_pct 75
zvol_inhibit_dev 0
zvol_major 230
zvol_max_discard_blocks 16384
zvol_prefetch_bytes 131072
zvol_request_sync 0
zvol_threads 32
zvol_volmode 1

VDEV cache disabled, skipping section

ZIL committed transactions: 2.7M
Commit requests: 137.6k
Flushes to stable storage: 137.6k
Transactions to SLOG storage pool: 4.0 GiB 93.9k
Transactions to non-SLOG storage pool: 0 Bytes 0
 
I changed the zpool configuration.
I destroyed datastore0 and datastore2.
I added a 250 GB SATA3 SSD as ZIL (SLOG) and a 1 TB NVMe as L2ARC to datastore1.
I tried changing the tunables, but it did not help.
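For reference, adding the log and cache devices to an existing pool is done with zpool add; the device paths below are placeholders:

Code:
# add a SLOG (separate ZIL) device
zpool add datastore1 log /dev/disk/by-id/<sata-ssd>

# add an L2ARC (cache) device
zpool add datastore1 cache /dev/disk/by-id/<nvme-ssd>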
 
I guess you are running into a NUMA-related problem, but I'm not sure.
Can you show the kernel module parameters?

Code:
perl -e 'my $dir="/sys/module/zfs/parameters"; opendir(DH, $dir); my @files = readdir(DH); closedir(DH); foreach my $file (@files){ my $filepath = "$dir/$file"; open(FD, "<:encoding(UTF-8)", $filepath) || die $@; my $para = <FD>; close(FD); print "$file: $para" };'
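A simpler way to get roughly the same output (one "name: value" line per parameter) would be, for example:

Code:
for f in /sys/module/zfs/parameters/*; do echo "$(basename "$f"): $(cat "$f")"; done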
 
Thank you, Wolfgang, for looking into my problem.
This motherboard supports only one CPU.

zfs_arc_p_min_shift: 0
zvol_request_sync: 0
vdev_validate_skip: 0
zfs_object_mutex_size: 64
spa_slop_shift: 5
zfs_sync_taskq_batch_pct: 75
zfs_vdev_async_write_max_active: 10
zfs_multilist_num_sublists: 0
zil_nocacheflush: 0
zfs_trim_metaslab_skip: 0
zfs_trim_extent_bytes_min: 32768
zfs_checksum_events_per_second: 20
zfs_no_scrub_prefetch: 0
zfs_vdev_sync_read_min_active: 10
zfs_dmu_offset_next_sync: 0
metaslab_debug_load: 0
zio_deadman_log_all: 0
zfs_vdev_mirror_rotating_seek_inc: 5
zfs_vdev_mirror_non_rotating_inc: 0
zfs_read_history: 0
zfs_multihost_history: 0
zfs_metaslab_switch_threshold: 2
metaslab_fragmentation_factor_enabled: 1
zfs_admin_snapshot: 0
zfs_delete_blocks: 20480
zfs_arc_meta_prune: 10000
zfs_free_min_time_ms: 1000
zfs_removal_suspend_progress: 0
zfs_scrub_min_time_ms: 1000
zfs_vdev_default_ms_count: 200
zfs_dedup_prefetch: 0
zfs_txg_history: 100
zfs_vdev_max_active: 1000
zfs_vdev_sync_write_min_active: 10
spa_load_verify_data: 1
zfs_async_block_max_blocks: 100000
zfs_dirty_data_max_max: 4294967296
dbuf_cache_shift: 5
zfs_send_corrupt_data: 0
dbuf_cache_lowater_pct: 10
zfs_send_queue_length: 16777216
zfs_lua_max_instrlimit: 100000000
zfs_scan_fill_weight: 3
dmu_object_alloc_chunk_shift: 7
zfs_arc_shrink_shift: 0
zfs_resilver_min_time_ms: 3000
zfs_trim_extent_bytes_max: 134217728
zfs_free_bpobj_enabled: 1
zfs_vdev_mirror_non_rotating_seek_inc: 1
zfs_vdev_cache_max: 16384
zfs_condense_min_mapping_bytes: 131072
ignore_hole_birth: 1
zfs_multihost_fail_intervals: 10
zfs_arc_min_prefetch_ms: 0
zfs_arc_sys_free: 0
metaslab_df_use_largest_segment: 0
zfs_sync_pass_dont_compress: 8
zio_taskq_batch_pct: 75
zfs_remove_max_segment: 16777216
zfs_arc_meta_limit_percent: 75
zfs_arc_p_dampener_disable: 1
spa_load_verify_metadata: 1
dbuf_cache_hiwater_pct: 10
zfs_read_chunk_size: 1048576
zfs_arc_grow_retry: 0
zfs_vdev_trim_min_active: 1
metaslab_aliquot: 524288
zfs_vdev_async_read_min_active: 1
zfs_vdev_cache_bshift: 16
metaslab_preload_enabled: 1
zfs_deadman_failmode: wait
l2arc_feed_min_ms: 200
zfs_read_history_hits: 0
zfetch_max_distance: 8388608
send_holes_without_birth_time: 1
zfs_max_recordsize: 1048576
zfs_dbuf_state_index: 0
zio_slow_io_ms: 30000
dbuf_cache_max_bytes: 2111616064
zfs_zevent_cols: 80
zfs_scan_mem_lim_soft_fact: 20
zfs_no_scrub_io: 0
zil_slog_bulk: 786432
spa_asize_inflation: 24
l2arc_write_boost: 8388608
zfs_abd_scatter_min_size: 1536
zfs_arc_meta_limit: 0
zfs_deadman_enabled: 1
zfs_abd_scatter_enabled: 1
zfs_arc_min_prescient_prefetch_ms: 0
zfs_vdev_async_write_active_min_dirty_percent: 30
zfs_free_leak_on_eio: 0
zfs_vdev_cache_size: 0
zfs_vdev_write_gap_limit: 4096
zfs_scan_issue_strategy: 0
zfs_max_missing_tvds: 0
l2arc_headroom: 2
zfs_per_txg_dirty_frees_percent: 5
zfs_compressed_arc_enabled: 1
dbuf_metadata_cache_max_bytes: 1055808032
zfs_scan_ignore_errors: 0
zfs_vdev_removal_max_active: 2
zfs_condense_indirect_commit_entry_delay_ms: 0
zfs_metaslab_segment_weight_enabled: 1
zfs_dirty_data_max_max_percent: 25
metaslab_force_ganging: 16777217
zio_dva_throttle_enabled: 1
zfs_vdev_scrub_min_active: 1
zfs_arc_average_blocksize: 8192
zfs_scan_suspend_progress: 0
zfs_vdev_queue_depth_pct: 1000
zfs_multihost_interval: 1000
zfs_vdev_aggregate_trim: 0
zfs_condense_indirect_vdevs_enable: 1
zio_requeue_io_start_cut_in_line: 1
zfetch_max_streams: 8
zfs_multihost_import_intervals: 20
zfs_ddt_data_is_special: 1
zfs_zevent_console: 0
zfs_zil_clean_taskq_minalloc: 1024
zfs_sync_pass_deferred_free: 2
zfs_vdev_initializing_min_active: 1
zfs_nocacheflush: 0
zfs_arc_dnode_limit: 0
zfs_scan_legacy: 0
zfs_dbgmsg_enable: 1
zfs_scan_vdev_limit: 4194304
zfs_vdev_raidz_impl: cycle [fastest] original scalar sse2 ssse3 avx2
zvol_threads: 32
zfs_vdev_async_write_min_active: 2
zfs_removal_ignore_errors: 0
zfs_vdev_sync_read_max_active: 10
l2arc_headroom_boost: 200
zfs_reconstruct_indirect_combinations_max: 4096
zfs_sync_pass_rewrite: 2
spa_config_path: /etc/zfs/zpool.cache
zfs_pd_bytes_max: 52428800
metaslab_df_max_search: 16777216
zfs_flags: 0
zfs_deadman_checktime_ms: 60000
zap_iterate_prefetch: 1
spa_load_print_vdev_tree: 0
zfs_dirty_data_max_percent: 10
zfs_user_indirect_is_special: 1
zfs_scan_checkpoint_intval: 7200
dbuf_metadata_cache_shift: 6
zfetch_min_sec_reap: 2
zfs_zil_clean_taskq_nthr_pct: 100
zfs_key_max_salt_uses: 400000000
zfs_mg_noalloc_threshold: 0
zfs_deadman_ziotime_ms: 300000
zfs_special_class_metadata_reserve_pct: 25
zfs_arc_meta_min: 0
zvol_prefetch_bytes: 131072
zfs_deadman_synctime_ms: 600000
zfs_send_unmodified_spill_blocks: 1
zfs_autoimport_disable: 1
zfs_arc_min: 0
zfs_trim_queue_limit: 10
l2arc_noprefetch: 1
zfs_nopwrite_enabled: 1
l2arc_feed_again: 1
zfs_vdev_sync_write_max_active: 10
zfs_prefetch_disable: 0
zfetch_array_rd_sz: 1048576
zfs_metaslab_fragmentation_threshold: 70
l2arc_write_max: 8388608
zfs_scan_mem_lim_fact: 20
zfs_dbgmsg_maxsize: 4194304
zfs_override_estimate_recordsize: 0
zfs_vdev_read_gap_limit: 32768
zfs_dirty_data_sync_percent: 20
zfs_delay_min_dirty_percent: 60
zfs_recv_queue_length: 16777216
zfs_vdev_async_write_active_max_dirty_percent: 60
zfs_disable_ivset_guid_check: 0
zfs_arc_lotsfree_percent: 10
zfs_immediate_write_sz: 32768
zil_replay_disable: 0
zil_maxblocksize: 131072
zfs_vdev_mirror_rotating_inc: 0
zvol_volmode: 1
zfs_unlink_suspend_progress: 0
zfs_arc_meta_strategy: 1
zfs_obsolete_min_time_ms: 500
zfs_vdev_trim_max_active: 2
zfs_resilver_disable_defer: 0
metaslab_bias_enabled: 1
zfs_vdev_async_read_max_active: 3
l2arc_feed_secs: 1
zfs_commit_timeout_pct: 5
zfs_arc_max: 0
spa_load_verify_shift: 4
zfs_trim_txg_batch: 32
vdev_removal_max_span: 32768
zfs_zevent_len_max: 1024
zfs_scan_max_ext_gap: 2097152
zfs_scan_strict_mem_lim: 0
zfs_vdev_aggregation_limit_non_rotating: 131072
zfs_arc_meta_adjust_restarts: 4096
l2arc_norw: 0
zfs_recover: 0
zvol_inhibit_dev: 0
zfs_vdev_aggregation_limit: 1048576
zfs_condense_max_obsolete_bytes: 1073741824
dmu_prefetch_max: 134217728
zvol_major: 230
metaslab_debug_unload: 0
zfs_slow_io_events_per_second: 20
zfs_lua_max_memlimit: 104857600
metaslab_lba_weighting_enabled: 1
zfs_zil_clean_taskq_maxalloc: 1048576
zfs_txg_timeout: 5
zfs_vdev_removal_min_active: 1
zfs_vdev_min_ms_count: 16
zfs_vdev_scrub_max_active: 2
zfs_vdev_mirror_rotating_seek_offset: 1048576
zfs_arc_pc_percent: 0
zfs_vdev_scheduler: unused
zvol_max_discard_blocks: 16384
zfs_arc_dnode_reduce_percent: 10
zfs_vdev_ms_count_limit: 131072
zfs_dirty_data_max: 4294967296
zfs_abd_scatter_max_order: 10
zfs_spa_discard_memory_limit: 16777216
zfs_initialize_value: 16045690984833335022
zfs_expire_snapshot: 300
zfs_vdev_initializing_max_active: 1
zfs_arc_dnode_limit_percent: 10
zfs_delay_scale: 500000
zfs_mg_fragmentation_threshold: 95
 
Yesterday I updated Proxmox to 6.1-7.
HPET (High Precision Event Timer) is enabled in the BIOS.

In syslog I see strange lines; maybe these problems are related.

Code:
Feb  5 07:28:01 pve0 systemd[1]: Started Proxmox VE replication runner.
Feb  5 07:28:42 pve0 kernel: [24250.826445] Uhhuh. NMI received for unknown reason 2c on CPU 63.
Feb  5 07:28:42 pve0 kernel: [24250.826445] Do you have a strange power saving mode enabled?
Feb  5 07:28:42 pve0 kernel: [24250.826446] Dazed and confused, but trying to continue
 
The first-gen EPYC is a NUMA architecture even on a single socket.
Wow, thanks, this shows that I need to read up more on the basics.

And do you have all power savings disabled in the BIOS?

In BIOS -> ACPI control I do not have performance settings, only HPET settings.

Power / Performance Determinism is set to Auto (the motherboard manual says Auto = Performance).
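To see how many NUMA nodes this single-socket EPYC actually exposes, something like the following can be used:

Code:
lscpu | grep -i numa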
 
What motherboard do you have?
 
Hello forum,
I changed my configuration, but the problem is that as soon as I put even a light load on ZFS (for example, backing up a large number of small files of 30-150 KB over SFTP, ~48,000 files copied in 1.5 hours, or starting a MySQL database), IO delay increases very quickly and the rest of the containers and VMs slow down badly. IO delay is ~20%-35%.
I tried fine-tuning ZFS and MySQL and it helped somewhat, but any file operations still cause severe slowdowns in the virtual environments.

I have one ZFS pool:
datastore1: 6x SATA HDD Toshiba HDWD130 - raidz1 (ashift=12, compression=lz4, atime=off)
+ logs: mSATA SSD 256 GB
+ cache: NVMe SSD 1024 GB
+ spare: 1x SATA HDD Toshiba HDWD130
running 11 LXC and 4 KVM guests

CPU: AMD EPYC 7551
RAM: 128 GB ECC
Swap: disabled (swapoff)
options zfs zfs_arc_max=0
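Dataset properties that matter for small-file and database workloads can be inspected with, for example:

Code:
zfs get recordsize,compression,atime,xattr,sync datastore1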

proxmox-ve: 6.1-2 (running kernel: 5.3.18-3-pve)
pve-manager: 6.1-8 (running version: 6.1-8/806edfe1)
pve-kernel-helper: 6.1-7
pve-kernel-5.3: 6.1-6
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 2.0.1-1+pve8
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libpve-access-control: 6.0-6
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.0-17
libpve-guest-common-perl: 3.0-5
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-5
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 3.2.1-1
lxcfs: 4.0.1-pve1
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-3
pve-cluster: 6.1-4
pve-container: 3.0-23
pve-docs: 6.1-6
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.0-10
pve-firmware: 3.0-6
pve-ha-manager: 3.0-9
pve-i18n: 2.0-4
pve-qemu-kvm: 4.1.1-4
pve-xtermjs: 4.3.0-1
pve-zsync: 2.0-2
qemu-server: 6.1-7
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1

Any suggestions?
 

Attachments

  • arc_summury.txt
    23.3 KB
Hi,

How fast is your ZIL (logs) SSD at 4k sync writes?
If it is not fast, it can be the bottleneck.
Also, I would consider removing the cache device, because it needs memory that is then not available for the ARC.
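For reference, both devices can be taken out of the pool again without downtime; device names below are placeholders:

Code:
# remove the SLOG (log) device
zpool remove datastore1 /dev/disk/by-id/<sata-ssd>

# remove the L2ARC (cache) device
zpool remove datastore1 /dev/disk/by-id/<nvme-ssd>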
 
Thank you for your answer, Wolfgang.
I have disabled the ZIL and will monitor the situation.
 
The situation has not changed :(
iotop -a -P -o -d 5
shows that the problem is reading files, not writing.
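Since reads are the problem, it may help to watch the ARC/L2ARC hit rates while the backup or MySQL load is running, for example:

Code:
# ARC statistics refreshed every 5 seconds
arcstat 5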
 
How fast are your ZIL devices with 4k sync writes?
You can benchmark it like this.

Code:
fio --size=20G --bs=4k --rw=write --direct=1 --sync=1 --runtime=60  --group_reporting --name=test --ramp_time=5s --filename=/dev/sd<x>
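Note that writing directly to /dev/sd<x> like this is destructive, so it should only be run against an empty or spare device, never against a disk that is part of a pool in use.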
 
