Hi,
I noticed that in my 3-node, 12-OSD cluster (3 OSDs per node), one node has all three of its OSDs marked "Down" and "Out". I tried to bring them back "In" and "Up", but the OSDs crash again as soon as they start.
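For reference, this is roughly how I tried to bring them back, shown here for osd.8, one of the three affected OSDs (a minimal sketch; whether done via the GUI buttons or the CLI, the effect should be the same):

Code:
# mark the OSD back "in"
ceph osd in osd.8
# try to start the OSD daemon again
systemctl start ceph-osd@8
# the daemon dies right away; the crash log further below is from this file
tail -n 200 /var/log/ceph/ceph-osd.8.log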
My setup has the WAL and block.db on SSD, while the OSD data disk is a SATA HDD. Each server has two SSDs, and each SSD has three partitions: one partition is for a WAL, one is for a block.db, and the OSD data itself sits on the SATA disk.
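I don't have the exact command from when the OSDs were created (it may well have been the GUI), but the layout corresponds to something like this ceph-volume call (device names are just placeholders, not my real devices):

Code:
# BlueStore OSD: data on the SATA HDD, DB and WAL on SSD partitions
ceph-volume lvm create --bluestore \
    --data /dev/sdd \
    --block.db /dev/sdb1 \
    --block.wal /dev/sdb2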
Any idea what this could be? This is what the log of one of the affected OSDs (osd.8) shows:
Code:
2019-03-11 15:43:43.831453 7f18b8892e00 0 set uid:gid to 64045:64045 (ceph:ceph)
2019-03-11 15:43:43.831468 7f18b8892e00 0 ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable), process ceph-osd, pid 2988913
2019-03-11 15:43:43.836761 7f18b8892e00 0 pidfile_write: ignore empty --pid-file
2019-03-11 15:43:43.844687 7f18b8892e00 0 load: jerasure load: lrc load: isa
2019-03-11 15:43:43.844789 7f18b8892e00 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel
2019-03-11 15:43:43.844798 7f18b8892e00 1 bdev(0x563466b4cb40 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph-8/block
2019-03-11 15:43:43.845001 7f18b8892e00 1 bdev(0x563466b4cb40 /var/lib/ceph/osd/ceph-8/block) open size 6001170317312 (0x57541a00000, 5.46TiB) block_size 4096 (4KiB) rotational
2019-03-11 15:43:43.845283 7f18b8892e00 1 bluestore(/var/lib/ceph/osd/ceph-8) _set_cache_sizes cache_size 1073741824 meta 0.4 kv 0.4 data 0.2
2019-03-11 15:43:43.845299 7f18b8892e00 1 bdev(0x563466b4cb40 /var/lib/ceph/osd/ceph-8/block) close
2019-03-11 15:43:44.169681 7f18b8892e00 1 bluestore(/var/lib/ceph/osd/ceph-8) _mount path /var/lib/ceph/osd/ceph-8
2019-03-11 15:43:44.170038 7f18b8892e00 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel
2019-03-11 15:43:44.170043 7f18b8892e00 1 bdev(0x563466b4cd80 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph-8/block
2019-03-11 15:43:44.170205 7f18b8892e00 1 bdev(0x563466b4cd80 /var/lib/ceph/osd/ceph-8/block) open size 6001170317312 (0x57541a00000, 5.46TiB) block_size 4096 (4KiB) rotational
2019-03-11 15:43:44.170470 7f18b8892e00 1 bluestore(/var/lib/ceph/osd/ceph-8) _set_cache_sizes cache_size 1073741824 meta 0.4 kv 0.4 data 0.2
2019-03-11 15:43:44.170522 7f18b8892e00 1 bdev create path /var/lib/ceph/osd/ceph-8/block.db type kernel
2019-03-11 15:43:44.170526 7f18b8892e00 1 bdev(0x563466b4d200 /var/lib/ceph/osd/ceph-8/block.db) open path /var/lib/ceph/osd/ceph-8/block.db
2019-03-11 15:43:44.170647 7f18b8892e00 1 bdev(0x563466b4d200 /var/lib/ceph/osd/ceph-8/block.db) open size 5997854720 (0x165800000, 5.59GiB) block_size 4096 (4KiB) non-rotational
2019-03-11 15:43:44.170655 7f18b8892e00 1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-8/block.db size 5.59GiB
2019-03-11 15:43:44.172927 7f18b8892e00 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel
2019-03-11 15:43:44.172937 7f18b8892e00 1 bdev(0x563466b4d440 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph-8/block
2019-03-11 15:43:44.173124 7f18b8892e00 1 bdev(0x563466b4d440 /var/lib/ceph/osd/ceph-8/block) open size 6001170317312 (0x57541a00000, 5.46TiB) block_size 4096 (4KiB) rotational
2019-03-11 15:43:44.173136 7f18b8892e00 1 bluefs add_block_device bdev 2 path /var/lib/ceph/osd/ceph-8/block size 5.46TiB
2019-03-11 15:43:44.173171 7f18b8892e00 1 bluefs mount
2019-03-11 15:43:44.178468 7f18b8892e00 -1 *** Caught signal (Segmentation fault) **
in thread 7f18b8892e00 thread_name:ceph-osd
ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable)
1: (()+0xa56bd4) [0x56345cdc8bd4]
2: (()+0x110c0) [0x7f18b5e980c0]
3: (BlueFS::_replay(bool)+0x1616) [0x56345cd7fb96]
4: (BlueFS::mount()+0x1e1) [0x56345cd82aa1]
5: (BlueStore::_open_db(bool)+0x1698) [0x56345cc8c6b8]
6: (BlueStore::_mount(bool)+0x2b4) [0x56345ccc5cf4]
7: (OSD::init()+0x3e2) [0x56345c813fe2]
8: (main()+0x3092) [0x56345c71d3c2]
9: (__libc_start_main()+0xf1) [0x7f18b4e4d2e1]
10: (_start()+0x2a) [0x56345c7a9f9a]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
-75> 2019-03-11 15:43:43.826473 7f18b8892e00 5 asok(0x563466baf4a0) register_command perfcounters_dump hook 0x563466b4a1c0
-74> 2019-03-11 15:43:43.826489 7f18b8892e00 5 asok(0x563466baf4a0) register_command 1 hook 0x563466b4a1c0
-73> 2019-03-11 15:43:43.826491 7f18b8892e00 5 asok(0x563466baf4a0) register_command perf dump hook 0x563466b4a1c0
-72> 2019-03-11 15:43:43.826494 7f18b8892e00 5 asok(0x563466baf4a0) register_command perfcounters_schema hook 0x563466b4a1c0
-71> 2019-03-11 15:43:43.826496 7f18b8892e00 5 asok(0x563466baf4a0) register_command perf histogram dump hook 0x563466b4a1c0
-70> 2019-03-11 15:43:43.826498 7f18b8892e00 5 asok(0x563466baf4a0) register_command 2 hook 0x563466b4a1c0
-69> 2019-03-11 15:43:43.826499 7f18b8892e00 5 asok(0x563466baf4a0) register_command perf schema hook 0x563466b4a1c0
-68> 2019-03-11 15:43:43.826501 7f18b8892e00 5 asok(0x563466baf4a0) register_command perf histogram schema hook 0x563466b4a1c0
-67> 2019-03-11 15:43:43.826503 7f18b8892e00 5 asok(0x563466baf4a0) register_command perf reset hook 0x563466b4a1c0
-66> 2019-03-11 15:43:43.826511 7f18b8892e00 5 asok(0x563466baf4a0) register_command config show hook 0x563466b4a1c0
-65> 2019-03-11 15:43:43.826513 7f18b8892e00 5 asok(0x563466baf4a0) register_command config help hook 0x563466b4a1c0
-64> 2019-03-11 15:43:43.826516 7f18b8892e00 5 asok(0x563466baf4a0) register_command config set hook 0x563466b4a1c0
-63> 2019-03-11 15:43:43.826518 7f18b8892e00 5 asok(0x563466baf4a0) register_command config get hook 0x563466b4a1c0
-62> 2019-03-11 15:43:43.826519 7f18b8892e00 5 asok(0x563466baf4a0) register_command config diff hook 0x563466b4a1c0
-61> 2019-03-11 15:43:43.826522 7f18b8892e00 5 asok(0x563466baf4a0) register_command config diff get hook 0x563466b4a1c0
-60> 2019-03-11 15:43:43.826524 7f18b8892e00 5 asok(0x563466baf4a0) register_command log flush hook 0x563466b4a1c0
-59> 2019-03-11 15:43:43.826526 7f18b8892e00 5 asok(0x563466baf4a0) register_command log dump hook 0x563466b4a1c0
-58> 2019-03-11 15:43:43.826528 7f18b8892e00 5 asok(0x563466baf4a0) register_command log reopen hook 0x563466b4a1c0
-57> 2019-03-11 15:43:43.826538 7f18b8892e00 5 asok(0x563466baf4a0) register_command dump_mempools hook 0x563466e5ada8
-56> 2019-03-11 15:43:43.831453 7f18b8892e00 0 set uid:gid to 64045:64045 (ceph:ceph)
-55> 2019-03-11 15:43:43.831468 7f18b8892e00 0 ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable), process ceph-osd, pid 2988913
-54> 2019-03-11 15:43:43.831501 7f18b8892e00 5 object store type is bluestore
-53> 2019-03-11 15:43:43.836104 7f18b2aee700 2 Event(0x563466b4c500 nevent=5000 time_id=1).set_owner idx=0 owner=139744053749504
-52> 2019-03-11 15:43:43.836145 7f18b22ed700 2 Event(0x563466b4c740 nevent=5000 time_id=1).set_owner idx=1 owner=139744045356800
-51> 2019-03-11 15:43:43.836152 7f18b1aec700 2 Event(0x563466b4c980 nevent=5000 time_id=1).set_owner idx=2 owner=139744036964096
-50> 2019-03-11 15:43:43.836545 7f18b8892e00 1 -- 172.17.1.54:0/0 learned_addr learned my addr 172.17.1.54:0/0
-49> 2019-03-11 15:43:43.836554 7f18b8892e00 1 -- 172.17.1.54:6802/2988913 _finish_bind bind my_inst.addr is 172.17.1.54:6802/2988913
-48> 2019-03-11 15:43:43.836608 7f18b8892e00 1 -- 10.10.10.5:0/0 learned_addr learned my addr 10.10.10.5:0/0
-47> 2019-03-11 15:43:43.836615 7f18b8892e00 1 -- 10.10.10.5:6802/2988913 _finish_bind bind my_inst.addr is 10.10.10.5:6802/2988913
-46> 2019-03-11 15:43:43.836682 7f18b8892e00 1 -- 10.10.10.5:0/0 learned_addr learned my addr 10.10.10.5:0/0
-45> 2019-03-11 15:43:43.836687 7f18b8892e00 1 -- 10.10.10.5:6803/2988913 _finish_bind bind my_inst.addr is 10.10.10.5:6803/2988913
-44> 2019-03-11 15:43:43.836754 7f18b8892e00 1 -- 172.17.1.54:0/0 learned_addr learned my addr 172.17.1.54:0/0
-43> 2019-03-11 15:43:43.836759 7f18b8892e00 1 -- 172.17.1.54:6803/2988913 _finish_bind bind my_inst.addr is 172.17.1.54:6803/2988913
-42> 2019-03-11 15:43:43.836761 7f18b8892e00 0 pidfile_write: ignore empty --pid-file
-41> 2019-03-11 15:43:43.838350 7f18b8892e00 5 asok(0x563466baf4a0) init /var/run/ceph/ceph-osd.8.asok
-40> 2019-03-11 15:43:43.838362 7f18b8892e00 5 asok(0x563466baf4a0) bind_and_listen /var/run/ceph/ceph-osd.8.asok
-39> 2019-03-11 15:43:43.838411 7f18b8892e00 5 asok(0x563466baf4a0) register_command 0 hook 0x563466b481a8
-38> 2019-03-11 15:43:43.838419 7f18b8892e00 5 asok(0x563466baf4a0) register_command version hook 0x563466b481a8
-37> 2019-03-11 15:43:43.838424 7f18b8892e00 5 asok(0x563466baf4a0) register_command git_version hook 0x563466b481a8
-36> 2019-03-11 15:43:43.838429 7f18b8892e00 5 asok(0x563466baf4a0) register_command help hook 0x563466b4a620
-35> 2019-03-11 15:43:43.838431 7f18b8892e00 5 asok(0x563466baf4a0) register_command get_command_descriptions hook 0x563466b4a630
-34> 2019-03-11 15:43:43.838488 7f18b031b700 5 asok(0x563466baf4a0) entry start
-33> 2019-03-11 15:43:43.838497 7f18b8892e00 10 monclient: build_initial_monmap
-32> 2019-03-11 15:43:43.844687 7f18b8892e00 0 load: jerasure load: lrc load: isa
-31> 2019-03-11 15:43:43.844745 7f18b8892e00 5 adding auth protocol: none
-30> 2019-03-11 15:43:43.844750 7f18b8892e00 5 adding auth protocol: none
-29> 2019-03-11 15:43:43.844789 7f18b8892e00 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel
-28> 2019-03-11 15:43:43.844798 7f18b8892e00 1 bdev(0x563466b4cb40 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph-8/block
-27> 2019-03-11 15:43:43.845001 7f18b8892e00 1 bdev(0x563466b4cb40 /var/lib/ceph/osd/ceph-8/block) open size 6001170317312 (0x57541a00000, 5.46TiB) block_size 4096 (4KiB) rotational
-26> 2019-03-11 15:43:43.845283 7f18b8892e00 1 bluestore(/var/lib/ceph/osd/ceph-8) _set_cache_sizes cache_size 1073741824 meta 0.4 kv 0.4 data 0.2
-25> 2019-03-11 15:43:43.845299 7f18b8892e00 1 bdev(0x563466b4cb40 /var/lib/ceph/osd/ceph-8/block) close
-24> 2019-03-11 15:43:44.169462 7f18b8892e00 5 asok(0x563466baf4a0) register_command objecter_requests hook 0x563466b4a6b0
-23> 2019-03-11 15:43:44.169528 7f18b8892e00 1 -- 172.17.1.54:6802/2988913 start start
-22> 2019-03-11 15:43:44.169536 7f18b8892e00 1 -- - start start
-21> 2019-03-11 15:43:44.169537 7f18b8892e00 1 -- - start start
-20> 2019-03-11 15:43:44.169538 7f18b8892e00 1 -- 172.17.1.54:6803/2988913 start start
-19> 2019-03-11 15:43:44.169542 7f18b8892e00 1 -- 10.10.10.5:6803/2988913 start start
-18> 2019-03-11 15:43:44.169544 7f18b8892e00 1 -- 10.10.10.5:6802/2988913 start start
-17> 2019-03-11 15:43:44.169547 7f18b8892e00 1 -- - start start
-16> 2019-03-11 15:43:44.169667 7f18b8892e00 2 osd.8 0 init /var/lib/ceph/osd/ceph-8 (looks like hdd)
-15> 2019-03-11 15:43:44.169673 7f18b8892e00 2 osd.8 0 journal /var/lib/ceph/osd/ceph-8/journal
-14> 2019-03-11 15:43:44.169681 7f18b8892e00 1 bluestore(/var/lib/ceph/osd/ceph-8) _mount path /var/lib/ceph/osd/ceph-8
-13> 2019-03-11 15:43:44.170038 7f18b8892e00 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel
-12> 2019-03-11 15:43:44.170043 7f18b8892e00 1 bdev(0x563466b4cd80 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph-8/block
-11> 2019-03-11 15:43:44.170205 7f18b8892e00 1 bdev(0x563466b4cd80 /var/lib/ceph/osd/ceph-8/block) open size 6001170317312 (0x57541a00000, 5.46TiB) block_size 4096 (4KiB) rotational
-10> 2019-03-11 15:43:44.170470 7f18b8892e00 1 bluestore(/var/lib/ceph/osd/ceph-8) _set_cache_sizes cache_size 1073741824 meta 0.4 kv 0.4 data 0.2
-9> 2019-03-11 15:43:44.170522 7f18b8892e00 1 bdev create path /var/lib/ceph/osd/ceph-8/block.db type kernel
-8> 2019-03-11 15:43:44.170526 7f18b8892e00 1 bdev(0x563466b4d200 /var/lib/ceph/osd/ceph-8/block.db) open path /var/lib/ceph/osd/ceph-8/block.db
-7> 2019-03-11 15:43:44.170647 7f18b8892e00 1 bdev(0x563466b4d200 /var/lib/ceph/osd/ceph-8/block.db) open size 5997854720 (0x165800000, 5.59GiB) block_size 4096 (4KiB) non-rotational
-6> 2019-03-11 15:43:44.170655 7f18b8892e00 1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-8/block.db size 5.59GiB
-5> 2019-03-11 15:43:44.172927 7f18b8892e00 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel
-4> 2019-03-11 15:43:44.172937 7f18b8892e00 1 bdev(0x563466b4d440 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph-8/block
-3> 2019-03-11 15:43:44.173124 7f18b8892e00 1 bdev(0x563466b4d440 /var/lib/ceph/osd/ceph-8/block) open size 6001170317312 (0x57541a00000, 5.46TiB) block_size 4096 (4KiB) rotational
-2> 2019-03-11 15:43:44.173136 7f18b8892e00 1 bluefs add_block_device bdev 2 path /var/lib/ceph/osd/ceph-8/block size 5.46TiB
-1> 2019-03-11 15:43:44.173171 7f18b8892e00 1 bluefs mount
0> 2019-03-11 15:43:44.178468 7f18b8892e00 -1 *** Caught signal (Segmentation fault) **
in thread 7f18b8892e00 thread_name:ceph-osd
ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable)
1: (()+0xa56bd4) [0x56345cdc8bd4]
2: (()+0x110c0) [0x7f18b5e980c0]
3: (BlueFS::_replay(bool)+0x1616) [0x56345cd7fb96]
4: (BlueFS::mount()+0x1e1) [0x56345cd82aa1]
5: (BlueStore::_open_db(bool)+0x1698) [0x56345cc8c6b8]
6: (BlueStore::_mount(bool)+0x2b4) [0x56345ccc5cf4]
7: (OSD::init()+0x3e2) [0x56345c813fe2]
8: (main()+0x3092) [0x56345c71d3c2]
9: (__libc_start_main()+0xf1) [0x7f18b4e4d2e1]
10: (_start()+0x2a) [0x56345c7a9f9a]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 kinetic
1/ 5 fuse
1/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-osd.8.log
--- end dump of recent events ---
PVE Versions:
Code:
# pveversion --verbose
proxmox-ve: 5.3-1 (running kernel: 4.15.18-10-pve)
pve-manager: 5.3-8 (running version: 5.3-8/2929af8e)
pve-kernel-4.15: 5.3-1
pve-kernel-4.15.18-10-pve: 4.15.18-32
pve-kernel-4.15.18-4-pve: 4.15.18-23
pve-kernel-4.15.18-1-pve: 4.15.18-19
pve-kernel-4.15.17-1-pve: 4.15.17-9
pve-kernel-4.13.13-6-pve: 4.13.13-42
pve-kernel-4.13.13-5-pve: 4.13.13-38
pve-kernel-4.13.4-1-pve: 4.13.4-26
ceph: 12.2.10-pve1
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-3
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-44
libpve-guest-common-perl: 2.0-19
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-36
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-2
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-22
pve-cluster: 5.0-33
pve-container: 2.0-33
pve-docs: 5.3-1
pve-edk2-firmware: 1.20181023-1
pve-firewall: 3.0-17
pve-firmware: 2.0-6
pve-ha-manager: 2.0-6
pve-i18n: 1.0-9
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 2.12.1-1
pve-xtermjs: 3.10.1-1
qemu-server: 5.0-45
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.12-pve1~bpo1