OSD ghost

JanoStasik

New Member
Sep 23, 2024
Hello,
I am running a 3-node Proxmox cluster on version 8.3.0, with Ceph 19.2.0 installed on each node. I am running 3 monitors and 3 managers, and the health status is OK. Each node has a Samsung 990 Pro NVMe drive dedicated to a Ceph OSD. No matter what I try, and no matter what order I pick, I always end up with the OSD as a ghost.
I click on Ceph > OSD > Create OSD. The system offers the unused Samsung drive, and I don't touch anything else except Create. The task runs OK with no errors, and after that I can see the created OSD on that page, but the overall page shows that I have osd.0 as a ghost.
What am I doing wrong?

PS: Before I started creating the OSDs, I erased the drive on each node with: ceph-volume lvm zap /dev/nvme0n1 --destroy
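
For reference, the same state should also be visible from the CLI with standard Ceph and systemd commands (read-only, nothing Proxmox-specific), in case anyone wants more output:

ceph osd tree                  # is osd.0 in the CRUSH map, and is it up or down?
ceph-volume lvm list           # did ceph-volume actually create the LVs on /dev/nvme0n1?
systemctl status ceph-osd@0    # is the OSD daemon itself running on the node?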

Attached are screenshots:
create osd - how I create it
log file from the successful task
ceph_after_create - the OSD configuration; by default it is blank, nothing there
gohst_osd - the ghost OSD visible on the dashboard
 

I have the exact same issue; I can't get OSDs up at all. I wonder if this is an issue with Squid?

Edit: I just noticed this in the logs:

root@Instalation01:~# systemctl status ceph-osd@1
× ceph-osd@1.service - Ceph object storage daemon osd.1
Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; preset: enabled)
Drop-In: /usr/lib/systemd/system/ceph-osd@.service.d
└─ceph-after-pve-cluster.conf
Active: failed (Result: exit-code) since Tue 2025-03-11 23:13:29 MST; 10s ago
Duration: 827ms
Process: 19663 ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 1 (code=exited, status=0/SUCCESS)
Process: 19668 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id 1 --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Main PID: 19668 (code=exited, status=1/FAILURE)
CPU: 97ms

Mar 11 23:13:29 Instalation01 systemd[1]: ceph-osd@1.service: Scheduled restart job, restart counter is at 3.
Mar 11 23:13:29 Instalation01 systemd[1]: Stopped ceph-osd@1.service - Ceph object storage daemon osd.1.
Mar 11 23:13:29 Instalation01 systemd[1]: ceph-osd@1.service: Start request repeated too quickly.
Mar 11 23:13:29 Instalation01 systemd[1]: ceph-osd@1.service: Failed with result 'exit-code'.
Mar 11 23:13:29 Instalation01 systemd[1]: Failed to start ceph-osd@1.service - Ceph object storage daemon osd.1.
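
Since systemd hit the restart limit ("Start request repeated too quickly"), the failed state has to be cleared before another attempt, and the journal should contain the real error. These are standard systemd commands:

systemctl reset-failed ceph-osd@1
journalctl -u ceph-osd@1 --no-pager -n 50    # the actual ceph-osd error from before the throttle kicked in
systemctl start ceph-osd@1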

Also attached is my log file.

Edit again

root@Instalation01:~# systemctl status ceph-osd@0
ceph-osd@0.service - Ceph object storage daemon osd.0
Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; preset: enabled)
Drop-In: /usr/lib/systemd/system/ceph-osd@.service.d
└─ceph-after-pve-cluster.conf
Active: activating (auto-restart) (Result: exit-code) since Tue 2025-03-11 23:19:38 MST; 2s ago
Process: 21808 ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 0 (code=exited, status=0/SUCCESS)
Process: 21819 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id 0 --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Main PID: 21819 (code=exited, status=1/FAILURE)
CPU: 99ms
root@Instalation01:~# systemctl status ceph-osd@0
ceph-osd@0.service - Ceph object storage daemon osd.0
Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; preset: enabled)
Drop-In: /usr/lib/systemd/system/ceph-osd@.service.d
└─ceph-after-pve-cluster.conf
Active: active (running) since Tue 2025-03-11 23:19:48 MST; 679ms ago
Process: 22076 ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 0 (code=exited, status=0/SUCCESS)
Main PID: 22099 (ceph-osd)
Tasks: 8
Memory: 11.2M
CPU: 97ms
CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
└─22099 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph

Mar 11 23:19:48 Instalation01 systemd[1]: Starting ceph-osd@0.service - Ceph object storage daemon osd.0...
Mar 11 23:19:48 Instalation01 systemd[1]: Started ceph-osd@0.service - Ceph object storage daemon osd.0.
root@Instalation01:~# systemctl status ceph-osd@0
ceph-osd@0.service - Ceph object storage daemon osd.0
Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; preset: enabled)
Drop-In: /usr/lib/systemd/system/ceph-osd@.service.d
└─ceph-after-pve-cluster.conf
Active: activating (auto-restart) (Result: exit-code) since Tue 2025-03-11 23:19:49 MST; 5s ago
Process: 22076 ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 0 (code=exited, status=0/SUCCESS)
Process: 22099 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id 0 --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Main PID: 22099 (code=exited, status=1/FAILURE)
CPU: 104ms

Just tried this to no avail; it almost seems like it's a permissions error?
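
If it really were permissions, the ownership of the OSD data directory should show it. A quick read-only check with plain coreutils (ceph runs as uid/gid 64045 on Debian-based systems):

ls -ln /var/lib/ceph/osd/ceph-0/       # everything here should be owned by 64045:64045 (ceph:ceph)
ls -l /var/lib/ceph/osd/ceph-0/block   # the block symlink target must be accessible to the ceph user too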
 

Just found this in my logs when looking at /var/log/ceph/ceph-osd.0.log. It seems it doesn't like that I split my cluster network onto IPv6 and my public network onto IPv4. I found the hint on https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-osd/ ; hopefully this helps.


2025-03-11T23:19:59.496-0700 75e39dbea840 0 set uid:gid to 64045:64045 (ceph:ceph)
2025-03-11T23:19:59.496-0700 75e39dbea840 0 ceph version 19.2.0 (3815e3391b18c593539df6fa952c9f45c37ee4d0) squid (stable), process ceph-osd, pid 22244
2025-03-11T23:19:59.496-0700 75e39dbea840 0 pidfile_write: ignore empty --pid-file
2025-03-11T23:19:59.498-0700 75e39dbea840 1 bdev(0x57456e03ee00 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2025-03-11T23:19:59.498-0700 75e39dbea840 0 bdev(0x57456e03ee00 /var/lib/ceph/osd/ceph-0/block) ioctl(F_SET_FILE_RW_HINT) on /var/lib/ceph/osd/ceph-0/block failed: (22) Invalid argument
2025-03-11T23:19:59.498-0700 75e39dbea840 1 bdev(0x57456e03ee00 /var/lib/ceph/osd/ceph-0/block) open size 500103643136 (0x7470800000, 466 GiB) block_size 4096 (4 KiB) rotational device, discard not supported
2025-03-11T23:19:59.498-0700 75e39dbea840 1 bluestore(/var/lib/ceph/osd/ceph-0) _set_cache_sizes cache_size 1073741824 meta 0.45 kv 0.45 kv_onode 0.04 data 0.06
2025-03-11T23:19:59.499-0700 75e39dbea840 1 bdev(0x57456e03f180 /var/lib/ceph/osd/ceph-0/block.db) open path /var/lib/ceph/osd/ceph-0/block.db
2025-03-11T23:19:59.499-0700 75e39dbea840 0 bdev(0x57456e03f180 /var/lib/ceph/osd/ceph-0/block.db) ioctl(F_SET_FILE_RW_HINT) on /var/lib/ceph/osd/ceph-0/block.db failed: (22) Invalid argument
2025-03-11T23:19:59.499-0700 75e39dbea840 1 bdev(0x57456e03f180 /var/lib/ceph/osd/ceph-0/block.db) open size 50012880896 (0xba5000000, 47 GiB) block_size 4096 (4 KiB) non-rotational device, discard supported
2025-03-11T23:19:59.499-0700 75e39dbea840 1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-0/block.db size 47 GiB
2025-03-11T23:19:59.500-0700 75e39dbea840 1 bdev(0x57456e03f500 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2025-03-11T23:19:59.500-0700 75e39dbea840 0 bdev(0x57456e03f500 /var/lib/ceph/osd/ceph-0/block) ioctl(F_SET_FILE_RW_HINT) on /var/lib/ceph/osd/ceph-0/block failed: (22) Invalid argument
2025-03-11T23:19:59.500-0700 75e39dbea840 1 bdev(0x57456e03f500 /var/lib/ceph/osd/ceph-0/block) open size 500103643136 (0x7470800000, 466 GiB) block_size 4096 (4 KiB) rotational device, discard not supported
2025-03-11T23:19:59.500-0700 75e39dbea840 1 bluefs add_block_device bdev 2 path /var/lib/ceph/osd/ceph-0/block size 466 GiB
2025-03-11T23:19:59.500-0700 75e39dbea840 1 bdev(0x57456e03f180 /var/lib/ceph/osd/ceph-0/block.db) close
2025-03-11T23:19:59.767-0700 75e39dbea840 1 bdev(0x57456e03f500 /var/lib/ceph/osd/ceph-0/block) close
2025-03-11T23:20:00.012-0700 75e39dbea840 1 bdev(0x57456e03ee00 /var/lib/ceph/osd/ceph-0/block) close
2025-03-11T23:20:00.262-0700 75e39dbea840 0 starting osd.0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2025-03-11T23:20:00.262-0700 75e39dbea840 -1 unable to find any IPv6 address in networks '192.168.23.1/24' interfaces ''
2025-03-11T23:20:00.262-0700 75e39dbea840 -1 Failed to pick public address.
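
If the address-family split really is the cause, the fix is presumably to make the bind options in /etc/pve/ceph.conf agree with the configured networks. A sketch of what I mean, not a verified config: ms_bind_ipv4 / ms_bind_ipv6 are standard Ceph options, and the fd00:23::/64 prefix is just a placeholder for the real cluster network:

[global]
    ms_bind_ipv4 = true                 # keep IPv4 for the public network
    ms_bind_ipv6 = true                 # allow IPv6 for the cluster network
    public_network = 192.168.23.0/24    # the IPv4 network from the error above
    cluster_network = fd00:23::/64      # placeholder prefix, substitute the real one

After editing, the OSD service would need a restart (systemctl restart ceph-osd@0) for the change to take effect.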