Installing "ceph-exporter" Daemon

ikogan

According to the Ceph documentation, at least as of Reef, the mgrs no longer export perf counters by default (https://docs.ceph.com/en/reef/mgr/prometheus/#id1), which I thought wouldn't be a big deal for me. However, some of these counters carry OSD storage information. In particular, ceph_osd_stat_bytes and ceph_osd_stat_bytes_used are used to calculate alerts for OSDs becoming NearFull and similar important conditions. Fixing this is easy: just set exclude_perf_counters back to false:

Code:
ceph config set mgr mgr/prometheus/exclude_perf_counters false

This makes those metrics, and many others, come right back. However, according to the docs:

Gathering perf-counters from a single Prometheus exporter can degrade ceph-mgr performance, especially in large clusters. Instead, Ceph-exporter daemons are now used by default for perf-counter gathering. This should only be disabled when no ceph-exporters are deployed.

It doesn't look like Proxmox deploys these exporters by default, and I can't quite figure out how to install them. Are there any tips for this, or should I just leave the perf counters enabled?
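
To illustrate why the two OSD counters matter, here is a minimal Python sketch of a NearFull-style check built on ceph_osd_stat_bytes and ceph_osd_stat_bytes_used. The 0.85 threshold is an assumption based on Ceph's default nearfull ratio, the function names are made up for this example, and real alert rules would normally live in Prometheus rather than in a script.

```python
# Assumed threshold: Ceph's default nearfull ratio. Adjust to your cluster.
NEARFULL_RATIO = 0.85


def osd_utilization(stat_bytes_used: float, stat_bytes: float) -> float:
    """Return the used fraction of an OSD (0.0 to 1.0).

    stat_bytes_used ~ ceph_osd_stat_bytes_used, stat_bytes ~ ceph_osd_stat_bytes.
    """
    if stat_bytes <= 0:
        raise ValueError("stat_bytes must be positive")
    return stat_bytes_used / stat_bytes


def is_nearfull(stat_bytes_used: float, stat_bytes: float,
                ratio: float = NEARFULL_RATIO) -> bool:
    """True when the OSD's utilization has reached the nearfull ratio."""
    return osd_utilization(stat_bytes_used, stat_bytes) >= ratio


# Hypothetical sample values: a 4 TB OSD with 3.5 TB used (87.5 % full).
print(is_nearfull(3.5e12, 4e12))
```

Without the two counters there is nothing to feed into a check like this, which is why the alerts go quiet when perf counters are excluded.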
 
I got it working by installing the ceph-exporter package and then adding the --no-mon-config parameter to ceph-exporter.service via a drop-in file:

* /etc/systemd/system/ceph-exporter.service.d/no-mon-config.conf
Code:
[Service]
ExecStart=
ExecStart=/usr/bin/ceph-exporter -f --id %i --setuser ceph --setgroup ceph --no-mon-config

Without the --no-mon-config parameter, the exporter fails to start with this error message (I'm not sure what exactly the problem is):
Code:
Oct 30 10:57:35 pve1 ceph-exporter[20268]: 2024-10-30T10:57:35.183+0100 75840c1a4180 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client..keyring: (13) Permission denied
Oct 30 10:57:35 pve1 ceph-exporter[20268]: 2024-10-30T10:57:35.184+0100 75840c1a4180 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client..keyring: (13) Permission denied
Oct 30 10:57:35 pve1 ceph-exporter[20268]: 2024-10-30T10:57:35.184+0100 75840c1a4180 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client..keyring: (13) Permission denied
Oct 30 10:57:35 pve1 ceph-exporter[20268]: 2024-10-30T10:57:35.184+0100 75840c1a4180 -1 monclient: keyring not found
Oct 30 10:57:35 pve1 ceph-exporter[20268]: failed to fetch mon config (--no-mon-config to skip)

The exporter is then exposed on port 9926.
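
For a quick check of what the exporter actually serves, a small Python sketch like the following could list the exposed metric names. Port 9926 comes from this post. The host, the /metrics path, and the helper names are assumptions for illustration, so adjust them to your setup.

```python
import urllib.request

# Assumed endpoint: only the port is taken from the post above.
EXPORTER_URL = "http://localhost:9926/metrics"


def metric_names(exposition_text: str) -> set[str]:
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first '{' (labels) or space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def fetch_metrics(url: str = EXPORTER_URL, timeout: float = 5.0) -> str:
    """Download the raw metrics page from the exporter."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling metric_names(fetch_metrics()) on the node running the exporter returns the set of names, which is handy when matching them against a dashboard or alert rule.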
 
Where did you find the ceph-exporter package? I couldn't find one for Ceph 19.x.
 
We're running Ceph 18.2.4, found it in the Proxmox Enterprise repository:
Code:
root@pve1:~# apt-cache policy ceph-exporter
ceph-exporter:
  Installed: 18.2.4-pve3
  Candidate: 18.2.4-pve3
  Version table:
 *** 18.2.4-pve3 500
        500 https://enterprise.proxmox.com/debian/ceph-reef bookworm/enterprise amd64 Packages
        100 /var/lib/dpkg/status
Proxmox version is pve-manager/8.2.7/3e0176e6bb2ade3b (running kernel: 6.8.12-2-pve)
Hope that helps :)
 
If you're using Grafana, can I ask which dashboard you're using with ceph-exporter? None of the ones I've found seem to match the exposed metrics.
 
"Without the parameter --no-mon-config, the exporter will not start with this error message (not sure what the problem is exactly):"

I wanted to set up the same thing and discovered what is wrong here. I think the Proxmox team has not adjusted the package to work with their management of Ceph, so it is trying to use the bone-stock default file paths and service names.

It is looking for its keyring here: /etc/pve/priv/ceph.client..keyring
The path is built from the pattern /etc/pve/priv/ceph.client.{ID}.keyring, where ID is set from the systemd unit as %i, which is empty by default. %i is only populated when you use an instance name such as ceph-exporter@metrics, in which case %i would be metrics and the exporter would look for the keyring at /etc/pve/priv/ceph.client.metrics.keyring.
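
The effect of the empty %i can be shown with a tiny Python sketch. The function name and its defaults are made up for illustration and simply mirror the pattern seen in the log output:

```python
def keyring_path(instance: str, directory: str = "/etc/pve/priv",
                 cluster: str = "ceph") -> str:
    """Build the keyring path for client.<instance> from the observed pattern."""
    return f"{directory}/{cluster}.client.{instance}.keyring"


# Empty instance name, as with the plain ceph-exporter.service unit:
print(keyring_path(""))         # /etc/pve/priv/ceph.client..keyring
# Instance name "metrics", as with ceph-exporter@metrics:
print(keyring_path("metrics"))  # /etc/pve/priv/ceph.client.metrics.keyring
```

The double dot in the first path is exactly what shows up in the error message.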

Just as a quick POC to confirm my theory I tried using the admin keyring which is present on the system already:
Code:
/usr/bin/ceph-exporter -f --id "admin" --setgroup ceph --setuser root

This works without any errors. I had to use --setuser root for this POC as the admin keyring can only be read by root.

The correct way to deal with this error would be to create a new keyring and let ceph-exporter run as an unprivileged user, making sure the keyring can be read by that user. The default user is "ceph", which is probably fine.

Update:
I have tried adding a user that only has mon r caps, and that produces the same output as the admin keyring. You also get the same output when using --no-mon-config, so actually authenticating against the monitors does not seem to matter at all.
Using --no-mon-config seems fine then, I guess. I'm not missing out on anything as far as I can tell.
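
If you want to repeat that comparison yourself, one rough approach is to save the metrics page from each variant to a file and diff the series (metric name plus labels) while ignoring the values. The Python sketch below assumes each sample line has the form "name{labels} value" with no trailing timestamp. The function names are made up for this example.

```python
def series_keys(dump: str) -> set[str]:
    """Return every 'name{labels}' series found in a metrics dump."""
    keys = set()
    for line in dump.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumption: the value is the last space-separated field.
        keys.add(line.rsplit(" ", 1)[0])
    return keys


def compare_dumps(a: str, b: str) -> tuple[set[str], set[str]]:
    """Return (series only in a, series only in b)."""
    keys_a, keys_b = series_keys(a), series_keys(b)
    return keys_a - keys_b, keys_b - keys_a
```

Two empty sets mean both runs expose the same series, even though the counter values will naturally differ between scrapes.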
 
The ceph-exporter is not included in the Debian packages for 19.2.1: https://tracker.ceph.com/issues/70445

Sadly, the bug tracker has it targeted for 19.2.3, and looking at the GitHub repo, it does not look like 19.2.2 will include the PR https://github.com/ceph/ceph/pull/62270 (see https://github.com/ceph/ceph/compare/v19.2.1...v19.2.2), so we might have to wait until 19.2.3 for an easy install.
Not sure if it's needed. My scrape works just by enabling the Prometheus module as stated in the Ceph documentation. Let me know if you need a config sample.