Clean Proxmox & Ceph Install issues

Loki928

New Member
Aug 8, 2023
Please HELP! I need advice on a clean Ceph Quincy install. I've tried everything I can think of, without success, to get Ceph Quincy up and running on my Dell 2950 with a dual 10GbE NIC and dual 1GbE NICs. I did a clean Proxmox VE 8 install on 3 nodes (Love, a Dell R510; Understanding, a Dell R510; and Acceptance, a Dell 2950), each updated to the latest version (Proxmox VE 8.0.4) before creating the cluster. The cluster was named and created without a problem, and all 3 nodes appear to be working flawlessly.
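For context, this is roughly the CLI equivalent of how the cluster was put together (the cluster name and the peer IP below are just placeholders for this example, not my real values):

# on the first node (Love)
pvecm create mycluster

# on each additional node, joining via the first node's IP
pvecm add 192.168.1.10

# confirm all three nodes are members and the cluster is quorate
pvecm status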

Ceph install (Guide being followed: https://pve.proxmox.com/pve-docs/pveceph.1.html#pve_ceph_install)

I’ve followed the guide using both the web-based wizard and the CLI installation. Nodes Love and Understanding have no issues: both show a running status, a TCP/IP address with port :6789/0, version 17.2.6, and Quorum as Yes. Node Acceptance shows a status of stopped, a TCP/IP address with no port number, no version number, and Quorum as No.
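For reference, the CLI route from that guide boils down to something like this (the network CIDR is a placeholder, and the repository flag may need adjusting depending on which Ceph repo your nodes use):

# on each node: install the Ceph Quincy packages
pveceph install --repository no-subscription

# once, on the first node: tell Ceph which network to use (placeholder CIDR)
pveceph init --network 10.10.10.0/24

# on every node that should run a monitor
pveceph mon create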

When mon.Acceptance is selected and a start is attempted, it looks like it’s adding the monitor, but nothing happens. I can’t delete it, and it won’t join the Ceph cluster. Any clues/ideas on what’s happening?

Please help, here are some of the logs for Acceptance:

--- begin dump of recent events ---

-35> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command assert hook 0x556aed3fcae0
-34> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command abort hook 0x556aed3fcae0
-33> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command leak_some_memory hook 0x556aed3fcae0
-32> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command perfcounters_dump hook 0x556aed3fcae0
-31> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command 1 hook 0x556aed3fcae0
-30> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command perf dump hook 0x556aed3fcae0
-29> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command perfcounters_schema hook 0x556aed3fcae0
-28> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command perf histogram dump hook 0x556aed3fcae0
-27> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command 2 hook 0x556aed3fcae0
-26> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command perf schema hook 0x556aed3fcae0
-25> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command perf histogram schema hook 0x556aed3fcae0
-24> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command perf reset hook 0x556aed3fcae0
-23> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command config show hook 0x556aed3fcae0
-22> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command config help hook 0x556aed3fcae0
-21> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command config set hook 0x556aed3fcae0
-20> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command config unset hook 0x556aed3fcae0
-19> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command config get hook 0x556aed3fcae0
-18> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command config diff hook 0x556aed3fcae0
-17> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command config diff get hook 0x556aed3fcae0
-16> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command injectargs hook 0x556aed3fcae0
-15> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command log flush hook 0x556aed3fcae0
-14> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command log dump hook 0x556aed3fcae0
-13> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command log reopen hook 0x556aed3fcae0
-12> 2023-08-07T22:38:29.246-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command dump_mempools hook 0x556aed510068
-11> 2023-08-07T22:38:29.258-0700 7fe45fba8a00 0 set uid:gid to 64045:64045 (ceph:ceph)
-10> 2023-08-07T22:38:29.258-0700 7fe45fba8a00 0 ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable), process ceph-mon, pid 171462
-9> 2023-08-07T22:38:29.258-0700 7fe45fba8a00 0 pidfile_write: ignore empty --pid-file
-8> 2023-08-07T22:38:29.258-0700 7fe45fba8a00 5 asok(0x556aed712000) init /var/run/ceph/ceph-mon.Acceptance.asok
-7> 2023-08-07T22:38:29.258-0700 7fe45fba8a00 5 asok(0x556aed712000) bind_and_listen /var/run/ceph/ceph-mon.Acceptance.asok
-6> 2023-08-07T22:38:29.258-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command 0 hook 0x556aed488cd0
-5> 2023-08-07T22:38:29.258-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command version hook 0x556aed488cd0
-4> 2023-08-07T22:38:29.258-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command git_version hook 0x556aed488cd0
-3> 2023-08-07T22:38:29.258-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command help hook 0x556aed3fc9a0
-2> 2023-08-07T22:38:29.258-0700 7fe45fba8a00 5 asok(0x556aed712000) register_command get_command_descriptions hook 0x556aed3fc990
-1> 2023-08-07T22:38:29.258-0700 7fe45eb546c0 5 asok(0x556aed712000) entry start
0> 2023-08-07T22:38:29.266-0700 7fe45fba8a00 -1 *** Caught signal (Illegal instruction) **
in thread 7fe45fba8a00 thread_name:ceph-mon
ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable)
1: /lib/x86_64-linux-gnu/libc.so.6(+0x3bfd0) [0x7fe460249fd0]
2: gf_init_hard()
3: gf_init_easy()
4: galois_init_default_field()
5: jerasure_init()
6: __erasure_code_init()
7: (ceph::ErasureCodePluginRegistry::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::ErasureCodePlugin**, std::ostream*)+0x2b5) [0x556aeb2f0605]
8: (ceph::ErasureCodePluginRegistry::preload(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::ostream*)+0x9f) [0x556aeb2f0baf]
9: (global_init_preload_erasure_code(ceph::common::CephContext const*)+0x7c2) [0x556aead9df92]
10: main()
11: /lib/x86_64-linux-gnu/libc.so.6(+0x271ca) [0x7fe4602351ca]
12: __libc_start_main()
13: _start()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 rbd_pwl
0/ 5 journaler
0/ 5 objectcacher
0/ 5 immutable_obj_cache
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 0 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 rgw_sync
1/ 5 rgw_datacache
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 fuse
2/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
1/ 5 prioritycache
0/ 5 test
0/ 5 cephfs_mirror
0/ 5 cephsqlite
0/ 5 seastore
0/ 5 seastore_onode
0/ 5 seastore_odata
0/ 5 seastore_omap
0/ 5 seastore_tm
0/ 5 seastore_cleaner
0/ 5 seastore_lba
0/ 5 seastore_cache
0/ 5 seastore_journal
0/ 5 seastore_device
0/ 5 alienstore
1/ 5 mclock
1/ 5 ceph_exporter
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
7fe45eb546c0 / admin_socket
7fe45fba8a00 / ceph-mon
max_recent 10000
max_new 10000
log_file /var/lib/ceph/crash/2023-08-08T05:38:29.269862Z_bc058b1a-2607-4ac8-a9ea-1ac60935eae1/log
--- end dump of recent events ---

journalctl -xeu ceph-mon@Acceptance.service

Aug 07 22:38:39 Acceptance systemd[1]: Stopped ceph-mon@Acceptance.service - Ceph cluster monitor daemon.
░░ Subject: A stop job for unit ceph-mon@Acceptance.service has finished
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A stop job for unit ceph-mon@Acceptance.service has finished.
░░
░░ The job identifier is 2586 and the job result is done.
Aug 07 22:38:39 Acceptance systemd[1]: ceph-mon@Acceptance.service: Start request repeated too quickly.
Aug 07 22:38:39 Acceptance systemd[1]: ceph-mon@Acceptance.service: Failed with result 'signal'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ The unit ceph-mon@Acceptance.service has entered the 'failed' state with result 'signal'.
Aug 07 22:38:39 Acceptance systemd[1]: Failed to start ceph-mon@Acceptance.service - Ceph cluster monitor daemon.
░░ Subject: A start job for unit ceph-mon@Acceptance.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit ceph-mon@Acceptance.service has finished with a failure.
░░
░░ The job identifier is 2586 and the job result is failed.
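Since systemd gives up with "Start request repeated too quickly", retrying by hand means clearing the failed state first; running the monitor in the foreground should reproduce the same crash directly on the console (the ceph-mon command line below is my reading of the stock ceph-mon@.service ExecStart, so double-check it against the unit file):

# clear systemd's rate-limit / failed state and try again
systemctl reset-failed ceph-mon@Acceptance.service
systemctl restart ceph-mon@Acceptance.service

# or run the monitor in the foreground to see the crash on the console
/usr/bin/ceph-mon -f --cluster ceph --id Acceptance --setuser ceph --setgroup ceph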

pveceph status

  cluster:
    id:     4b4bce0e-8cb8-437b-83e4-b61cc519f5ed
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 2 daemons, quorum Love,Understanding (age 6h)
    mgr: Love(active, since 6h)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
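On the "can't delete it" part: as far as I understand, a monitor that never joins can be removed again with something like the following, after which any leftover [mon.Acceptance] section in /etc/pve/ceph.conf may need cleaning up by hand (Acceptance is the mon ID here):

# remove the broken monitor via the Proxmox tooling
pveceph mon destroy Acceptance

# or directly at the Ceph level
ceph mon remove Acceptance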
 
It might not be possible. That server is pre-Nehalem generation and lacks almost all modern offload capabilities. Also, unless you have free power and AC, you really don't want to be running this in 2023.
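You can see it in the backtrace, too: the crash in gf_init_hard() is, as far as I know, gf-complete (the erasure-code math library behind jerasure) probing for or using SIMD instructions, so check which instruction-set flags the 2950's CPUs actually report:

# list the relevant SIMD flags on this host (empty output = not supported)
grep -o -w -E 'ssse3|sse4_1|sse4_2|pclmulqdq|avx|avx2' /proc/cpuinfo | sort -u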
Thanks, it was given to me. Might be worth decommissioning.
 
No, unfortunately, I did not get it to work with the 2950. I found a Dell R720 cheap, so I replaced it. I'm going to keep the 2950, Frankenstein an M1 Mac inside it, and use the hard drive bays for storage. Not sure what else to do with it.
 
And did the R510 work?

I have an R510 and 2 R750xs, and I have had problems with Ceph.
 
Yes, I have both R510s and the R720 in production, working flawlessly with Ceph 17.2.6 over a 40GbE Ceph network backplane. I bonded both 1GbE ports on each server for the "main" network interface.
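In case it's useful, the 1GbE bond is just the standard /etc/network/interfaces setup, roughly like this (the interface names, addresses, and 802.3ad mode are examples rather than a copy of my real config; LACP also needs matching switch configuration):

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.21/24
        gateway 192.168.1.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0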
 
