Unexpected Ceph behaviour from unused Ceph installation

JohnTanner

Active Member
Sep 25, 2019
Hello everyone,

I honestly don't really know or remember how I got myself into this situation.

What I remember is: quite some time ago (early/mid 2020) I installed Ceph on what is now my only PVE node to take a look at it.
After some time I uninstalled it, most likely with apt, as I wasn't aware of pveceph's existence until 2 hours ago.

Until now, I was busy upgrading from PVE 6 to 7, which was quite troublesome because of the aforementioned Ceph residues/leftovers.
pveceph purge fails with the error:
Bash:
Error gathering ceph info, already purged? Message: got timeout
Foreign MON address in ceph.conf. Keeping config & keyrings

What is quite interesting from my perspective: if I try to install Ceph Nautilus/Octopus/Pacific (yes, I tried all three on their appropriate PVE versions) via the web UI, it fails as in the attached screenshot.
If I run ceph status after the error in the screenshot occurs, it gives me this error message:
Code:
root@ControlCoreAngel:~# ceph status
Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')

I believe this is the sole reason for my situation; I may be wrong, though.

Unfortunately, I could not find any helpful resources online; I'll attach the links to the forum threads I tried, to no avail.

Thanks in advance for any help!

John


Attempted solutions:
https://forum.proxmox.com/threads/reinstall-ceph-after-pveceph-purge.56635/
https://forum.proxmox.com/threads/remove-ceph.59576/ (never had any configurations made in the first place)
https://forum.proxmox.com/threads/removing-ceph.26318/
https://forum.proxmox.com/threads/not-able-to-use-pveceph-purge-to-completely-remove-ceph.59606/


Code:
proxmox-ve: 7.0-2 (running kernel: 5.11.22-3-pve)
pve-manager: 7.0-11 (running version: 7.0-11/63d82f4e)
pve-kernel-5.11: 7.0-6
pve-kernel-helper: 7.0-6
pve-kernel-5.4: 6.4-5
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.11.22-3-pve: 5.11.22-7
pve-kernel-5.4.128-1-pve: 5.4.128-2
pve-kernel-5.4.124-1-pve: 5.4.124-2
pve-kernel-5.4.119-1-pve: 5.4.119-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 16.2.5-pve1
ceph-fuse: 16.2.5-pve1
corosync: 3.1.2-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: not correctly installed
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.21-pve1
libproxmox-acme-perl: 1.3.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-6
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-10
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.9-2
proxmox-backup-file-restore: 2.0.9-2
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-6
pve-cluster: 7.0-3
pve-container: 4.0-9
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-2
pve-firmware: 3.2-4
pve-ha-manager: 3.3-1
pve-i18n: 2.4-1
pve-qemu-kvm: 6.0.0-3
pve-xtermjs: 4.12.0-1
pve-zsync: 2.2
qemu-server: 7.0-13
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1

Code:
root@ControlCoreAngel:~# systemctl status ceph-crash
● ceph-crash.service - Ceph crash dump collector
     Loaded: loaded (/lib/systemd/system/ceph-crash.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2021-08-27 20:13:31 CEST; 1h 38min ago
   Main PID: 2582 (ceph-crash)
      Tasks: 1 (limit: 77036)
     Memory: 5.7M
        CPU: 39ms
     CGroup: /system.slice/ceph-crash.service
             └─2582 /usr/bin/python3 /usr/bin/ceph-crash

Aug 27 20:13:31 ControlCoreAngel systemd[1]: Started Ceph crash dump collector.
Aug 27 20:13:31 ControlCoreAngel ceph-crash[2582]: INFO:ceph-crash:monitoring path /var/lib/ceph/crash, delay 600s

Bash:
root@ControlCoreAngel:/etc/ceph#  pveceph createmgr
got timeout
 
Does no one have any ideas?
 
Do you still have the issue?
Error gathering ceph info, already purged? Message: got timeout
Foreign MON address in ceph.conf. Keeping config & keyrings
It might be that a monitor was not properly removed. Check whether a monitor is still defined in /etc/pve/ceph.conf.
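
For anyone landing here with the same message, the check suggested above could look roughly like the sketch below. It only reads the file and prints lines that look like monitor definitions. The function name list_mon_entries is made up for illustration, and the pattern assumes monitors show up either as [mon.<id>] section headers or as a mon_host / "mon host" key; an actual config may differ.

```shell
#!/bin/sh
# Hypothetical helper (not part of pveceph or ceph): print monitor-related
# lines from a Ceph config file, with line numbers.
# Defaults to /etc/pve/ceph.conf, the path mentioned in the reply above.
list_mon_entries() {
    conf="${1:-/etc/pve/ceph.conf}"
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi
    # Assumption: monitors appear as [mon.<id>] sections or as a
    # mon_host / "mon host" key. grep exits non-zero when nothing matches.
    grep -nE '^[[:space:]]*(\[mon\.[^]]*\]|mon[_ ]host[[:space:]]*=)' "$conf"
}
```

Calling list_mon_entries without arguments reads /etc/pve/ceph.conf; passing a path checks a copy of the file instead. Nothing is modified, so any leftover entry it reports would still need to be reviewed and removed by hand.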
 
Thank you very much for replying to such an old thread!

To be honest, no, because I rebuilt my server about a year ago. One of the reasons was a number of failed experiments like this one; another was a new setup where the boot pool is a ZFS mirror instead of a single device.
 
