We initially tried this with Ceph 12.2.4 and subsequently re-created the problem with 12.2.5.
Using 'lz4' compression on a Ceph Luminous erasure coded pool causes OSD processes to crash. After changing the compressor to snappy, the crashed OSD starts again and remains stable thereafter.
Test cluster environment:
- 3 hosts
- 2 BlueStore SSD OSDs per host (they are connected to an HP SmartArray controller and are therefore detected as hdd; we overrode the device class to ssd, roughly as sketched below)
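The device class override was done along these lines (a sketch only; osd.0 and osd.1 are placeholder IDs, and the auto-detected class has to be cleared before a new one can be set):
Code:
# clear the auto-detected class, then pin the OSDs to the ssd class
ceph osd crush rm-device-class osd.0 osd.1
ceph osd crush set-device-class ssd osd.0 osd.1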
Creating the erasure coded data pool and test RBD image:
Code:
ceph osd erasure-code-profile set ec21_ssd plugin=jerasure k=2 m=1 technique=reed_sol_van crush-root=default crush-failure-domain=host crush-device-class=ssd directory=/usr/lib/ceph/erasure-code;
ceph osd pool create ec_ssd 16 erasure ec21_ssd;
ceph osd pool set ec_ssd allow_ec_overwrites true;
ceph osd pool application enable ec_ssd rbd;
ceph osd pool set ec_ssd compression_algorithm lz4;
ceph osd pool set ec_ssd compression_mode aggressive;
rbd create rbd_ssd/test_ec --size 100G --data-pool ec_ssd;
[root@kvm1 ~]# rbd info rbd_ssd/test_ec
rbd image 'test_ec':
size 102400 MB in 25600 objects
order 22 (4096 kB objects)
data_pool: ec_ssd
block_name_prefix: rbd_data.4.67218c74b0dc51
format: 2
features: layering, exclusive-lock, data-pool
flags:
create_timestamp: Mon May 14 13:06:46 2018
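The compression settings can be read back from the pool to confirm they took effect (standard pool get/ls commands, nothing specific to this cluster):
Code:
ceph osd pool get ec_ssd compression_algorithm
ceph osd pool get ec_ssd compression_mode
ceph osd pool ls detail | grep ec_ssd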
Copying test data:
Code:
rbd map rbd_ssd/test_ec --name client.admin -k /etc/pve/priv/ceph.client.admin.keyring;
dd if=/var/lib/vz/template/100G_test_vm of=/dev/rbd0 bs=1G count=20;
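While the dd copy runs, the OSD state can be watched from another terminal to catch the flapping (generic status commands):
Code:
ceph -w           # stream cluster events, including osd.X down/up messages
ceph osd stat     # summary of OSDs up/in
ceph osd tree     # per-OSD up/down state and device class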
The copy stalled and one of the BlueStore SSD OSDs started flapping. Leaving it for a while made no difference (the OSD boots, comes online, then crashes, repeating continually). Setting the compressor to snappy instead left the OSD stable thereafter:
Code:
ceph osd pool set ec_ssd compression_algorithm snappy
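After switching the algorithm, the flapping OSD (osd.4, judging by the log file below) can be restarted and the cluster checked; this assumes systemd-managed OSDs as on a standard Luminous install:
Code:
systemctl restart ceph-osd@4     # restart the previously crashing OSD
ceph -s                          # confirm all OSDs stay up and PGs return to active+clean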
Sample log information when OSD is crashing:
Code:
2018-05-14 13:27:47.329732 7f2f4b4d2700 -1 *** Caught signal (Aborted) **
in thread 7f2f4b4d2700 thread_name:tp_osd_tp
ceph version 12.2.5 (dfcb7b53b2e4fcd2a5af0240d4975adc711ab96e) luminous (stable)
1: (()+0xa31194) [0x5637f831d194]
2: (()+0x110c0) [0x7f2f640c70c0]
3: (gsignal()+0xcf) [0x7f2f6308efff]
4: (abort()+0x16a) [0x7f2f6309042a]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x5637f836509e]
6: (BlueStore::_do_alloc_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*)+0x352d) [0x5637f820151d]
7: (BlueStore::_do_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0x545) [0x5637f820eb95]
8: (BlueStore::_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0xfc) [0x5637f820f64c]
9: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x19f0) [0x5637f8213580]
10: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x546) [0x5637f82147f6]
11: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x66) [0x5637f7f331b6]
12: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x867) [0x5637f807c507]
13: (ECBackend::try_reads_to_commit()+0x37db) [0x5637f808d5fb]
14: (ECBackend::check_ops()+0x1c) [0x5637f808de3c]
15: (ECBackend::start_rmw(ECBackend::Op*, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&)+0xac0) [0x5637f80978d0]
16: (ECBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, boost::optional<pg_hit_set_history_t>&, Context*, Context*, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0x3b2) [0x5637f8099202]
17: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0x9fa) [0x5637f7ecdfea]
18: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x134d) [0x5637f7f17f9d]
19: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2ef5) [0x5637f7f1b7d5]
20: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xec5) [0x5637f7ed6025]
21: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ab) [0x5637f7d4b87b]
22: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x5a) [0x5637f7ff613a]
23: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x102d) [0x5637f7d79d1d]
24: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8ef) [0x5637f8369d7f]
25: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5637f836d080]
26: (()+0x7494) [0x7f2f640bd494]
27: (clone()+0x3f) [0x7f2f63144acf]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
-4> 2018-05-14 13:27:47.326040 7f2f49ccf700 1 get compressor lz4 = 0x56380e1803f0
-3> 2018-05-14 13:27:47.326854 7f2f49ccf700 1 get compressor lz4 = 0x56380e1803f0
-2> 2018-05-14 13:27:47.328798 7f2f49ccf700 1 get compressor lz4 = 0x56380e1803f0
-1> 2018-05-14 13:27:47.328861 7f2f49ccf700 1 get compressor lz4 = 0x56380e1803f0
0> 2018-05-14 13:27:47.329732 7f2f4b4d2700 -1 *** Caught signal (Aborted) **
in thread 7f2f4b4d2700 thread_name:tp_osd_tp
ceph version 12.2.5 (dfcb7b53b2e4fcd2a5af0240d4975adc711ab96e) luminous (stable)
1: (()+0xa31194) [0x5637f831d194]
2: (()+0x110c0) [0x7f2f640c70c0]
3: (gsignal()+0xcf) [0x7f2f6308efff]
4: (abort()+0x16a) [0x7f2f6309042a]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x5637f836509e]
6: (BlueStore::_do_alloc_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*)+0x352d) [0x5637f820151d]
7: (BlueStore::_do_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0x545) [0x5637f820eb95]
8: (BlueStore::_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0xfc) [0x5637f820f64c]
9: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x19f0) [0x5637f8213580]
10: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x546) [0x5637f82147f6]
11: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x66) [0x5637f7f331b6]
12: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x867) [0x5637f807c507]
13: (ECBackend::try_reads_to_commit()+0x37db) [0x5637f808d5fb]
14: (ECBackend::check_ops()+0x1c) [0x5637f808de3c]
15: (ECBackend::start_rmw(ECBackend::Op*, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&)+0xac0) [0x5637f80978d0]
16: (ECBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, boost::optional<pg_hit_set_history_t>&, Context*, Context*, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0x3b2) [0x5637f8099202]
17: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0x9fa) [0x5637f7ecdfea]
18: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x134d) [0x5637f7f17f9d]
19: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2ef5) [0x5637f7f1b7d5]
20: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xec5) [0x5637f7ed6025]
21: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ab) [0x5637f7d4b87b]
22: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x5a) [0x5637f7ff613a]
23: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x102d) [0x5637f7d79d1d]
24: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8ef) [0x5637f8369d7f]
25: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5637f836d080]
26: (()+0x7494) [0x7f2f640bd494]
27: (clone()+0x3f) [0x7f2f63144acf]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 0 osd
0/ 5 optracker
0/ 5 objclass
0/ 0 filestore
0/ 0 journal
0/ 0 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 kinetic
1/ 5 fuse
1/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-osd.4.log
--- end dump of recent events ---
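If more detail on the assert is needed, the bluestore and compressor debug levels (1/5 in the dump above) can be raised on the affected OSD before reproducing; this is a generic debugging step, not part of the reproduction:
Code:
# while the OSD is running:
ceph tell osd.4 injectargs '--debug-bluestore 20/20 --debug-compressor 20/20'
# or persistently in /etc/ceph/ceph.conf under [osd], then restart the OSD:
#   debug bluestore = 20/20
#   debug compressor = 20/20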
Object storage utilisation:
Code:
[root@kvm1 ~]# rados df
POOL_NAME         USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED    RD_OPS     RD    WR_OPS     WR
cephfs_data      1767M     445      0   1335                  0       0        0       583 78021k       587  1918M
cephfs_metadata   889k      58      0    174                  0       0        0       256 14607k       906  1469k
ec_ssd           3072M     769      0   2307                  0       0        0        94  4168k     14742  3072M
rbd_hdd          1165G  300427      0 901281                  0       0        0 397694691 25133G 932424410 11349G
rbd_ssd         59852M   15331      0  45993                  0       0        0   1933348   106G   4721464   160G
total_objects    317030
total_used        3658G
total_avail       6399G
total_space      10058G
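For an upstream bug report, the crash note above asks for the executable or its disassembly; assuming the stock package location for the OSD binary, that would be something like:
Code:
objdump -rdS /usr/bin/ceph-osd > ceph-osd.objdump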