Unexpected Ceph behaviour from unused Ceph installation

SpinningRust · Aug 27, 2021

Hello everyone,

I honestly don't know or remember how I got myself into this situation.

What I do remember: quite some time ago (early/mid 2020) I installed Ceph on what is now my only PVE node to take a look at it.
After some time I uninstalled it, most likely with apt, as I wasn't aware that pveceph existed until two hours ago.

Until now I was busy upgrading from PVE 6 to 7, which was quite troublesome because of the Ceph leftovers mentioned above.
pveceph purge fails with this error:

Bash:

Error gathering ceph info, already purged? Message: got timeout
Foreign MON address in ceph.conf. Keeping config & keyrings

What I find interesting is that if I try to install Ceph Nautilus/Octopus/Pacific via the web UI (yes, I tried all three on their appropriate PVE versions), it fails as shown in the attached screenshot.
If I run ceph status after the error in the screenshot occurs, I get this error message:

Code:

root@ControlCoreAngel:~# ceph status
Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')

I believe this is the sole cause of my situation, though I may be wrong.

Unfortunately, I could not find any helpful resources online. I've listed below the forum threads I tried, to no avail.

Thanks in advance for any help!

John

Attempted solutions:
https://forum.proxmox.com/threads/reinstall-ceph-after-pveceph-purge.56635/
https://forum.proxmox.com/threads/remove-ceph.59576/ (never had any configurations made in the first place)
https://forum.proxmox.com/threads/removing-ceph.26318/
https://forum.proxmox.com/threads/not-able-to-use-pveceph-purge-to-completely-remove-ceph.59606/

Code:

proxmox-ve: 7.0-2 (running kernel: 5.11.22-3-pve)
pve-manager: 7.0-11 (running version: 7.0-11/63d82f4e)
pve-kernel-5.11: 7.0-6
pve-kernel-helper: 7.0-6
pve-kernel-5.4: 6.4-5
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.11.22-3-pve: 5.11.22-7
pve-kernel-5.4.128-1-pve: 5.4.128-2
pve-kernel-5.4.124-1-pve: 5.4.124-2
pve-kernel-5.4.119-1-pve: 5.4.119-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 16.2.5-pve1
ceph-fuse: 16.2.5-pve1
corosync: 3.1.2-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: not correctly installed
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.21-pve1
libproxmox-acme-perl: 1.3.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-6
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-10
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.9-2
proxmox-backup-file-restore: 2.0.9-2
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-6
pve-cluster: 7.0-3
pve-container: 4.0-9
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-2
pve-firmware: 3.2-4
pve-ha-manager: 3.3-1
pve-i18n: 2.4-1
pve-qemu-kvm: 6.0.0-3
pve-xtermjs: 4.12.0-1
pve-zsync: 2.2
qemu-server: 7.0-13
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1

Code:

root@ControlCoreAngel:~# systemctl status ceph-crash
● ceph-crash.service - Ceph crash dump collector
     Loaded: loaded (/lib/systemd/system/ceph-crash.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2021-08-27 20:13:31 CEST; 1h 38min ago
   Main PID: 2582 (ceph-crash)
      Tasks: 1 (limit: 77036)
     Memory: 5.7M
        CPU: 39ms
     CGroup: /system.slice/ceph-crash.service
             └─2582 /usr/bin/python3 /usr/bin/ceph-crash

Aug 27 20:13:31 ControlCoreAngel systemd[1]: Started Ceph crash dump collector.
Aug 27 20:13:31 ControlCoreAngel ceph-crash[2582]: INFO:ceph-crash:monitoring path /var/lib/ceph/crash, delay 600s

Bash:

root@ControlCoreAngel:/etc/ceph#  pveceph createmgr
got timeout

SpinningRust · Sep 7, 2021

Does no one have any ideas?

hd-- · Mar 18, 2024

Do you still have this issue?

Error gathering ceph info, already purged? Message: got timeout
Foreign MON address in ceph.conf. Keeping config & keyrings

It might be that a monitor was not properly removed. Check whether a monitor is still defined in /etc/pve/ceph.conf.
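
A minimal sketch of that check, in case it helps anyone landing here later. The function name list_mon_entries is made up for illustration, and the pattern assumes monitors show up either as [mon.<id>] sections or as mon_host / mon_initial_members keys, which may not cover every config layout:

```shell
#!/bin/sh
# Hypothetical helper: print monitor-related lines found in a Ceph config file.
# Defaults to the PVE cluster-wide config; pass another path as the first argument.
list_mon_entries() {
    conf="${1:-/etc/pve/ceph.conf}"
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 2
    fi
    # Assumption: monitors appear as [mon.<id>] section headers or as
    # mon_host / "mon host" / mon_initial_members keys.
    # grep exits with 1 when nothing matches, so that is the function's status too.
    grep -nE '^[[:space:]]*(\[mon\.|mon[_ ]host|mon[_ ]initial[_ ]members)' "$conf"
}
```

Calling list_mon_entries without arguments on the node prints the matching lines with their line numbers. If it prints nothing and returns 1, no monitor entries were found by this pattern, and the timeout probably has another cause.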

SpinningRust · Mar 18, 2024

Thank you very much for replying to such an old thread!

To be honest, no, because I rebuilt my server about a year ago. One of the reasons was a series of failed experiments like this one; another was a new setup where the boot pool is a ZFS mirror instead of a single device.
