Hi.
We have a Proxmox cluster (version 7.3-1) with Ceph storage. This morning some drives were marked down/out and the pool that uses those drives stopped. However, when I run "smartctl -a" or hdsentinel, the drives appear to be in good health.
Ceph health says this:
Code:
root@jarn24:~# ceph health
HEALTH_WARN 1 nearfull osd(s); Reduced data availability: 6 pgs inactive; Low space hindering backfill (add storage if this doesn't resolve itself): 6 pgs backfill_toofull; Degraded data redundancy: 147733/7884885 objects degraded (1.874%), 61 pgs degraded, 64 pgs undersized; 64 pgs not deep-scrubbed in time; 64 pgs not scrubbed in time; 4 pool(s) nearfull; 25 daemons have recently crashed
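To make that one-line summary easier to read, here is a minimal Python sketch that splits it into the overall status and the individual checks. It assumes the checks are separated by "; ", as in the output above; the function name `split_health` is just an illustrative helper, not part of Ceph.

```python
def split_health(line: str) -> tuple[str, list[str]]:
    """Split a one-line `ceph health` summary into (status, [checks]).

    Assumption: the first token is the status (e.g. HEALTH_WARN) and the
    remaining checks are separated by "; ", as in the pasted output.
    """
    status, _, rest = line.strip().partition(" ")
    checks = [c.strip() for c in rest.split("; ") if c.strip()]
    return status, checks


# The summary line pasted above, verbatim.
sample = (
    "HEALTH_WARN 1 nearfull osd(s); Reduced data availability: 6 pgs inactive; "
    "Low space hindering backfill (add storage if this doesn't resolve itself): "
    "6 pgs backfill_toofull; Degraded data redundancy: 147733/7884885 objects "
    "degraded (1.874%), 61 pgs degraded, 64 pgs undersized; "
    "64 pgs not deep-scrubbed in time; 64 pgs not scrubbed in time; "
    "4 pool(s) nearfull; 25 daemons have recently crashed"
)

status, checks = split_health(sample)
print(status)
for check in checks:
    print(" -", check)
```

Running `ceph health detail` on the cluster should give a per-check breakdown directly; the sketch is only a convenience for reading a pasted summary line.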
Here is the output from ceph osd tree:
Code:
root@jarn24:~# ceph osd tree
ID   CLASS  WEIGHT     TYPE NAME         STATUS  REWEIGHT  PRI-AFF
 -1         395.52695  root default
 -5         131.84232      host jarn24
 12    hdd   10.91409          osd.12        up   1.00000  1.00000
 13    hdd   10.91409          osd.13        up   1.00000  1.00000
 14    hdd   10.91409          osd.14        up   1.00000  1.00000
 15    hdd   10.91409          osd.15        up   1.00000  1.00000
 16    hdd   10.91409          osd.16        up   1.00000  1.00000
 17    hdd   10.91409          osd.17        up   1.00000  1.00000
 18    hdd   10.91409          osd.18        up   1.00000  1.00000
 19    hdd   10.91409          osd.19        up   1.00000  1.00000
 20    hdd   10.91409          osd.20        up   1.00000  1.00000
 21    hdd   10.91409          osd.21        up   1.00000  1.00000
 22    hdd   10.91409          osd.22        up   1.00000  1.00000
 23    hdd   10.91409          osd.23        up   1.00000  1.00000
 36   nvme    0.43660          osd.36      down         0  1.00000
 37   nvme    0.43660          osd.37        up   1.00000  1.00000
 -3         131.84232      host jarn25
  0    hdd   10.91409          osd.0         up   1.00000  1.00000
  1    hdd   10.91409          osd.1         up   1.00000  1.00000
  2    hdd   10.91409          osd.2         up   1.00000  1.00000
  3    hdd   10.91409          osd.3         up   1.00000  1.00000
  4    hdd   10.91409          osd.4         up   1.00000  1.00000
  5    hdd   10.91409          osd.5         up   1.00000  1.00000
  6    hdd   10.91409          osd.6         up   1.00000  1.00000
  7    hdd   10.91409          osd.7         up   1.00000  1.00000
  8    hdd   10.91409          osd.8         up   1.00000  1.00000
  9    hdd   10.91409          osd.9         up   1.00000  1.00000
 10    hdd   10.91409          osd.10        up   1.00000  1.00000
 11    hdd   10.91409          osd.11        up   1.00000  1.00000
 38   nvme    0.43660          osd.38      down         0  1.00000
 39   nvme    0.43660          osd.39      down         0  1.00000
 -7         131.84232      host jarn26
 24    hdd   10.91409          osd.24        up   1.00000  1.00000
 25    hdd   10.91409          osd.25        up   1.00000  1.00000
 26    hdd   10.91409          osd.26        up   1.00000  1.00000
 27    hdd   10.91409          osd.27        up   1.00000  1.00000
 28    hdd   10.91409          osd.28        up   1.00000  1.00000
 29    hdd   10.91409          osd.29        up   1.00000  1.00000
 30    hdd   10.91409          osd.30        up   1.00000  1.00000
 31    hdd   10.91409          osd.31        up   1.00000  1.00000
 32    hdd   10.91409          osd.32        up   1.00000  1.00000
 33    hdd   10.91409          osd.33        up   1.00000  1.00000
 34    hdd   10.91409          osd.34        up   1.00000  1.00000
 35    hdd   10.91409          osd.35        up   1.00000  1.00000
 40   nvme    0.43660          osd.40        up   1.00000  1.00000
 41   nvme    0.43660          osd.41        up   1.00000  1.00000
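To pick the down OSDs out of a long tree like this, a small Python sketch along the following lines could help. It assumes plain-text `ceph osd tree` output in which each host row contains the word "host" followed by the host name and each OSD row contains an "osd.N" name and a status field; `down_osds_by_host` is a hypothetical helper name, not a Ceph tool.

```python
def down_osds_by_host(tree_text: str) -> dict[str, list[str]]:
    """Group OSDs whose status is 'down' by the host row they appear under.

    Assumption: rows are whitespace-separated, host rows look like
    '<id> <weight> host <name>' and OSD rows carry 'osd.N' plus 'up'/'down'.
    """
    result: dict[str, list[str]] = {}
    host = None
    for line in tree_text.splitlines():
        fields = line.split()
        if "host" in fields:
            idx = fields.index("host")
            if idx + 1 < len(fields):
                host = fields[idx + 1]
        elif "down" in fields:
            name = next((f for f in fields if f.startswith("osd.")), None)
            if name is not None and host is not None:
                result.setdefault(host, []).append(name)
    return result


# A few rows taken from the tree above.
sample_tree = """\
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 395.52695 root default
-5 131.84232 host jarn24
12 hdd 10.91409 osd.12 up 1.00000 1.00000
36 nvme 0.43660 osd.36 down 0 1.00000
37 nvme 0.43660 osd.37 up 1.00000 1.00000
-3 131.84232 host jarn25
38 nvme 0.43660 osd.38 down 0 1.00000
39 nvme 0.43660 osd.39 down 0 1.00000
-7 131.84232 host jarn26
40 nvme 0.43660 osd.40 up 1.00000 1.00000
"""

down = down_osds_by_host(sample_tree)
for host_name, osds in down.items():
    print(host_name, osds)
```

On the tree above this points at osd.36 on jarn24 and osd.38 and osd.39 on jarn25, all of class nvme. For anything more robust, parsing the JSON form (`ceph osd tree -f json`) would probably be a better choice than splitting text.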
...and /var/log/ceph/ceph-osd.36.log (like the log files for the other NVMe disks that are down/out) is full of stack traces that I have no idea how to decipher:
Code:
-92> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.write_buffer_size: 268435456
-91> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_write_buffer_number: 4
-90> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compression: NoCompression
-89> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.bottommost_compression: Disabled
-88> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.prefix_extractor: nullptr
-87> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.memtable_insert_with_hint_prefix_extractor: nullptr
-86> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.num_levels: 7
-85> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.min_write_buffer_number_to_merge: 1
-84> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_write_buffer_number_to_maintain: 0
-83> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_write_buffer_size_to_maintain: 0
-82> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.bottommost_compression_opts.window_bits: -14
-81> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.bottommost_compression_opts.level: 32767
-80> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.bottommost_compression_opts.strategy: 0
-79> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.bottommost_compression_opts.max_dict_bytes: 0
-78> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.bottommost_compression_opts.zstd_max_train_bytes: 0
-77> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.bottommost_compression_opts.enabled: false
-76> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compression_opts.window_bits: -14
-75> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compression_opts.level: 32767
-74> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compression_opts.strategy: 0
-73> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compression_opts.max_dict_bytes: 0
-72> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compression_opts.zstd_max_train_bytes: 0
-71> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compression_opts.enabled: false
-70> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.level0_file_num_compaction_trigger: 4
-69> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.level0_slowdown_writes_trigger: 20
-68> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.level0_stop_writes_trigger: 36
-67> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.target_file_size_base: 67108864
-66> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.target_file_size_multiplier: 1
-65> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_bytes_for_level_base: 268435456
-64> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.level_compaction_dynamic_level_bytes: 0
-63> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_bytes_for_level_multiplier: 10.000000
-62> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[0]: 1
-61> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[1]: 1
-60> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[2]: 1
-59> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[3]: 1
-58> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[4]: 1
-57> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[5]: 1
-56> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[6]: 1
-55> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_sequential_skip_in_iterations: 8
-54> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_compaction_bytes: 1677721600
-53> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.arena_block_size: 33554432
-52> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.soft_pending_compaction_bytes_limit: 68719476736
-51> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.hard_pending_compaction_bytes_limit: 274877906944
-50> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.rate_limit_delay_max_milliseconds: 100
-49> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.disable_auto_compactions: 0
-48> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compaction_style: kCompactionStyleLevel
-47> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compaction_pri: kMinOverlappingRatio
-46> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compaction_options_universal.size_ratio: 1
-45> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compaction_options_universal.min_merge_width: 2
-44> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compaction_options_universal.max_merge_width: 4294967295
-43> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compaction_options_universal.max_size_amplification_percent: 200
-42> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compaction_options_universal.compression_size_percent: -1
-41> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compaction_options_universal.stop_style: kCompactionStopStyleTotalSize
-40> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compaction_options_fifo.max_table_files_size: 1073741824
-39> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.compaction_options_fifo.allow_compaction: 0
-38> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.table_properties_collectors:
-37> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.inplace_update_support: 0
-36> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.inplace_update_num_locks: 10000
-35> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.memtable_prefix_bloom_size_ratio: 0.000000
-34> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.memtable_whole_key_filtering: 0
-33> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.memtable_huge_page_size: 0
-32> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.bloom_locality: 0
-31> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.max_successive_merges: 0
-30> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.optimize_filters_for_hits: 0
-29> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.paranoid_file_checks: 0
-28> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.force_consistency_checks: 0
-27> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.report_bg_io_stats: 0
-26> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.ttl: 2592000
-25> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.periodic_compaction_seconds: 0
-23> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: [column_family.cc:555] (skipping printing options)
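Most of the excerpt above is RocksDB printing its options, which makes the interesting entries hard to spot. Here is a minimal Python sketch for hiding that noise. It assumes each dumped entry has the layout seen above (`-N> timestamp thread-id level message`); the `ENTRY` pattern and the `interesting_lines` helper are illustrative names, not anything shipped with Ceph.

```python
import re

# Assumed layout of a dumped log entry: "-N> <timestamp> <thread> <level> <message>"
ENTRY = re.compile(r"^\s*(-?\d+)> (\S+)\s+(\S+)\s+(-?\d+) (.*)$")


def interesting_lines(log_text, noise_prefixes=("rocksdb: Options.",)):
    """Return the log lines whose message does not start with a noise prefix.

    Lines that do not match the assumed entry layout (for example stack
    trace frames) are always kept.
    """
    kept = []
    for line in log_text.splitlines():
        match = ENTRY.match(line)
        if match and match.group(5).startswith(tuple(noise_prefixes)):
            continue
        kept.append(line)
    return kept


# Two entries taken from the excerpt above.
sample_log = """\
-25> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: Options.periodic_compaction_seconds: 0
-23> 2022-12-26T14:05:19.583+0000 7f76921c4080 4 rocksdb: [column_family.cc:555] (skipping printing options)
"""

filtered = interesting_lines(sample_log)
for line in filtered:
    print(line)
```

To use it on the real file, read the log with `open("/var/log/ceph/ceph-osd.36.log").read()` and pass the text in; whatever remains after filtering is where the actual stack trace should be.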