Installing "ceph-exporter" Daemon

ikogan

According to the Ceph documentation, at least as of Reef, the mgrs no longer export perf counters by default (https://docs.ceph.com/en/reef/mgr/prometheus/#id1). I thought this wouldn't be a big deal for me, but some of these counters carry OSD storage information. In particular, ceph_osd_stat_bytes and ceph_osd_stat_bytes_used are used to calculate alerts for OSDs becoming NearFull and similar important conditions. Restoring them is easy: just turn exclude_perf_counters off again:

Code:
ceph config set mgr mgr/prometheus/exclude_perf_counters false

This makes those metrics, and many others, come right back. However, according to the docs:

Gathering perf-counters from a single Prometheus exporter can degrade ceph-mgr performance, especially in large clusters. Instead, Ceph-exporter daemons are now used by default for perf-counter gathering. This should only be disabled when no ceph-exporters are deployed.

It doesn't look like Proxmox is deploying these exporters by default and I can't quite figure out how to install them. Are there any tips for this or should I just leave the perf counters enabled?
 
I got it working by installing the ceph-exporter package and then adding the --no-mon-config parameter to ceph-exporter.service through a drop-in file:

* /etc/systemd/system/ceph-exporter.service.d/no-mon-config.conf
Code:
[Service]
ExecStart=
ExecStart=/usr/bin/ceph-exporter -f --id %i --setuser ceph --setgroup ceph --no-mon-config
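
For anyone wanting to script this, here is a rough sketch of how the drop-in above could be written out and activated. The helper name write_dropin is made up for illustration, and the directory argument is only there so it can be tried against a scratch directory first; the file content is exactly the drop-in from this post.

```shell
# Sketch only: write the drop-in from this post into a systemd override directory.
# write_dropin is a hypothetical helper; the directory defaults to the path used above.
write_dropin() {
    dir="${1:-/etc/systemd/system/ceph-exporter.service.d}"
    mkdir -p "$dir" || return 1
    # The empty ExecStart= clears the packaged command before the new one is set.
    cat > "$dir/no-mon-config.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/ceph-exporter -f --id %i --setuser ceph --setgroup ceph --no-mon-config
EOF
}

# On the node itself (as root), something like:
#   write_dropin
#   systemctl daemon-reload
#   systemctl restart ceph-exporter
```

The daemon-reload is needed so that systemd picks up the new drop-in before the restart.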

Without the parameter --no-mon-config, the exporter will not start with this error message (not sure what the problem is exactly):
Code:
Oct 30 10:57:35 pve1 ceph-exporter[20268]: 2024-10-30T10:57:35.183+0100 75840c1a4180 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client..keyring: (13) Permission denied
Oct 30 10:57:35 pve1 ceph-exporter[20268]: 2024-10-30T10:57:35.184+0100 75840c1a4180 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client..keyring: (13) Permission denied
Oct 30 10:57:35 pve1 ceph-exporter[20268]: 2024-10-30T10:57:35.184+0100 75840c1a4180 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client..keyring: (13) Permission denied
Oct 30 10:57:35 pve1 ceph-exporter[20268]: 2024-10-30T10:57:35.184+0100 75840c1a4180 -1 monclient: keyring not found
Oct 30 10:57:35 pve1 ceph-exporter[20268]: failed to fetch mon config (--no-mon-config to skip)

The exporter then listens on port 9926.
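
A quick way to check that the exporter is actually serving something is to count the samples it returns. This is only a sketch: count_samples is a made-up helper name, and the /metrics path in the commented usage is an assumption (the usual Prometheus convention), not something confirmed in this thread.

```shell
# Sketch only: summarise Prometheus text-format output read from stdin.
# count_samples is a hypothetical helper name.
count_samples() {
    # Sample lines are the non-empty lines that do not start with '#'
    # (the '#' lines are HELP/TYPE comments).
    grep -c '^[^#]'
}

# On a node running the exporter, something like (the /metrics path is assumed):
#   curl -s http://localhost:9926/metrics | count_samples
```

A count of 0, or a connection error from curl, would suggest the service is not running or is listening elsewhere.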
 
Where did you find the ceph-exporter package? I couldn't find one for Ceph 19.x.
 
We're running Ceph 18.2.4, found it in the Proxmox Enterprise repository:
Code:
root@pve1:~# apt-cache policy ceph-exporter
ceph-exporter:
  Installed: 18.2.4-pve3
  Candidate: 18.2.4-pve3
  Version table:
 *** 18.2.4-pve3 500
        500 https://enterprise.proxmox.com/debian/ceph-reef bookworm/enterprise amd64 Packages
        100 /var/lib/dpkg/status
Proxmox version is pve-manager/8.2.7/3e0176e6bb2ade3b (running kernel: 6.8.12-2-pve)
Hope that helps :)
 
If you're using Grafana, can I ask which dashboard you're using with ceph-exporter? None of the ones I've found seem to match the exposed metrics.
 
"Without the parameter --no-mon-config, the exporter will not start with this error message (not sure what the problem is exactly):"
I wanted to set up the same thing and discovered what is wrong here. I think the Proxmox team has not adjusted the package to work with their management of Ceph, so it is trying to use the bone-stock default file paths and service names.

It is looking for its keyring here: /etc/pve/priv/ceph.client..keyring
The file name follows the pattern ceph.client.{ID}.keyring, where ID is set from the systemd unit as %i, which is empty by default. %i is only populated when you use service names like ceph-exporter@metrics; in that case %i would be metrics and the exporter would look for the keyring at /etc/pve/priv/ceph.client.metrics.keyring.
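
To make the substitution concrete, here is a tiny sketch of how the path comes out for an empty and a non-empty ID. keyring_path is a made-up helper that only mirrors the pattern described above; it is not how ceph-exporter itself builds the path.

```shell
# Sketch only: keyring_path is a hypothetical helper mirroring the naming pattern.
keyring_path() {
    printf '/etc/pve/priv/ceph.client.%s.keyring\n' "$1"
}

keyring_path ""         # plain ceph-exporter.service: %i is empty
keyring_path "metrics"  # ceph-exporter@metrics: %i is "metrics"
```

The first call yields the double-dot path from the error message, the second the path a templated instance would look for.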

Just as a quick POC to confirm my theory I tried using the admin keyring which is present on the system already:
Code:
/usr/bin/ceph-exporter -f --id "admin" --setgroup ceph --setuser root

This works without any errors. I had to use --setuser root for this POC as the admin keyring can only be read by root.

The correct way to deal with this error would be to create a new keyring and let ceph-exporter run as an unprivileged user, making sure the keyring can be read by that user. The default user is "ceph", which is probably fine.

Update:
I have tried adding a user that only has mon r caps, and that produces the same output as the admin keyring. You also get the same output when using --no-mon-config, so actually authenticating against the monitors does not seem to matter at all.
Using --no-mon-config seems fine then, I guess? I'm not missing out on anything as far as I can tell.
 