ceph-mgr restful api issue

FProxo

Hi!

The ceph-mgr restful api stopped working.

The process is running, but not listening on tcp/8003.

Code:
ceph     3400125  0.0  0.2 598612 263616 ?       Ssl  17:03   0:06 /usr/bin/ceph-mgr -f --cluster ceph --id cloud1 --setuser ceph --setgroup ceph
netstat -tlpn | grep mgr

Code:
{
    "always_on_modules": [
        "balancer",
        "crash",
        "devicehealth",
        "orchestrator",
        "pg_autoscaler",
        "progress",
        "rbd_support",
        "status",
        "telemetry",
        "volumes"
    ],
    "enabled_modules": [
        "iostat",
        "nfs",
        "restful"
    ],

Restarting the service did not help.

Syslog messages after the restart:
Code:
Oct 26 17:03:37 cloud1 systemd[1]: ceph-mgr@cloud1.service: Main process exited, code=exited, status=1/FAILURE
Oct 26 17:03:37 cloud1 systemd[1]: ceph-mgr@cloud1.service: Failed with result 'exit-code'.
Oct 26 17:03:37 cloud1 systemd[1]: ceph-mgr@cloud1.service: Consumed 2.962s CPU time.
Oct 26 17:03:47 cloud1 systemd[1]: ceph-mgr@cloud1.service: Scheduled restart job, restart counter is at 1.
Oct 26 17:03:47 cloud1 systemd[1]: ceph-mgr@cloud1.service: Consumed 2.962s CPU time.
Oct 26 17:03:47 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:47.460+0200 7f677477e500 -1 mgr[py] Module alerts has missing NOTIFY_TYPES member
Oct 26 17:03:47 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:47.524+0200 7f677477e500 -1 mgr[py] Module balancer has missing NOTIFY_TYPES member
Oct 26 17:03:47 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:47.592+0200 7f677477e500 -1 mgr[py] Module crash has missing NOTIFY_TYPES member
Oct 26 17:03:48 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:48.064+0200 7f677477e500 -1 mgr[py] Module influx has missing NOTIFY_TYPES member
Oct 26 17:03:48 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:48.180+0200 7f677477e500 -1 mgr[py] Module iostat has missing NOTIFY_TYPES member
Oct 26 17:03:48 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:48.428+0200 7f677477e500 -1 mgr[py] Module nfs has missing NOTIFY_TYPES member
Oct 26 17:03:48 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:48.528+0200 7f677477e500 -1 mgr[py] Module orchestrator has missing NOTIFY_TYPES member
Oct 26 17:03:48 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:48.580+0200 7f677477e500 -1 mgr[py] Module osd_support has missing NOTIFY_TYPES member
Oct 26 17:03:48 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:48.644+0200 7f677477e500 -1 mgr[py] Module pg_autoscaler has missing NOTIFY_TYPES member
Oct 26 17:03:48 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:48.696+0200 7f677477e500 -1 mgr[py] Module progress has missing NOTIFY_TYPES member
Oct 26 17:03:48 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:48.876+0200 7f677477e500 -1 mgr[py] Module prometheus has missing NOTIFY_TYPES member
Oct 26 17:03:48 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:48.944+0200 7f677477e500 -1 mgr[py] Module rbd_support has missing NOTIFY_TYPES member
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: context.c:56: warning: mpd_setminalloc: ignoring request to set MPD_MINALLOC a second time
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.176+0200 7f677477e500 -1 mgr[py] Module selftest has missing NOTIFY_TYPES member
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.236+0200 7f677477e500 -1 mgr[py] Module snap_schedule has missing NOTIFY_TYPES member
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.352+0200 7f677477e500 -1 mgr[py] Module status has missing NOTIFY_TYPES member
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.420+0200 7f677477e500 -1 mgr[py] Module telegraf has missing NOTIFY_TYPES member
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.548+0200 7f677477e500 -1 mgr[py] Module telemetry has missing NOTIFY_TYPES member
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.656+0200 7f677477e500 -1 mgr[py] Module test_orchestrator has missing NOTIFY_TYPES member
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.800+0200 7f677477e500 -1 mgr[py] Module volumes has missing NOTIFY_TYPES member
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.852+0200 7f677477e500 -1 mgr[py] Module zabbix has missing NOTIFY_TYPES member
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: did not load config file, using default settings.
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: ignoring --setuser ceph since I am not root
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: ignoring --setgroup ceph since I am not root
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.904+0200 7f91092a3500 -1 Errors while parsing config file!
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.904+0200 7f91092a3500 -1 can't open ceph.conf: (2) No such file or directory
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: unable to get monitor info from DNS SRV with service name: ceph-mon
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.932+0200 7f91092a3500 -1 failed for service _ceph-mon._tcp
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: 2023-10-26T17:03:49.932+0200 7f91092a3500 -1 monclient: get_monmap_and_config cannot identify monitors to contact
Oct 26 17:03:49 cloud1 ceph-mgr[3399710]: failed to fetch mon config (--no-mon-config to skip)
Oct 26 17:03:49 cloud1 systemd[1]: ceph-mgr@cloud1.service: Main process exited, code=exited, status=1/FAILURE
Oct 26 17:03:49 cloud1 systemd[1]: ceph-mgr@cloud1.service: Failed with result 'exit-code'.
Oct 26 17:03:49 cloud1 systemd[1]: ceph-mgr@cloud1.service: Consumed 2.554s CPU time.
Oct 26 17:04:00 cloud1 systemd[1]: ceph-mgr@cloud1.service: Scheduled restart job, restart counter is at 2.
Oct 26 17:04:00 cloud1 systemd[1]: ceph-mgr@cloud1.service: Consumed 2.554s CPU time.
Oct 26 17:04:00 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:00.204+0200 7f1372b95500 -1 mgr[py] Module alerts has missing NOTIFY_TYPES member
Oct 26 17:04:00 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:00.268+0200 7f1372b95500 -1 mgr[py] Module balancer has missing NOTIFY_TYPES member
Oct 26 17:04:00 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:00.328+0200 7f1372b95500 -1 mgr[py] Module crash has missing NOTIFY_TYPES member
Oct 26 17:04:00 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:00.788+0200 7f1372b95500 -1 mgr[py] Module influx has missing NOTIFY_TYPES member
Oct 26 17:04:00 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:00.896+0200 7f1372b95500 -1 mgr[py] Module iostat has missing NOTIFY_TYPES member
Oct 26 17:04:01 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:01.132+0200 7f1372b95500 -1 mgr[py] Module nfs has missing NOTIFY_TYPES member
Oct 26 17:04:01 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:01.236+0200 7f1372b95500 -1 mgr[py] Module orchestrator has missing NOTIFY_TYPES member
Oct 26 17:04:01 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:01.284+0200 7f1372b95500 -1 mgr[py] Module osd_support has missing NOTIFY_TYPES member
Oct 26 17:04:01 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:01.348+0200 7f1372b95500 -1 mgr[py] Module pg_autoscaler has missing NOTIFY_TYPES member
Oct 26 17:04:01 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:01.400+0200 7f1372b95500 -1 mgr[py] Module progress has missing NOTIFY_TYPES member
Oct 26 17:04:01 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:01.584+0200 7f1372b95500 -1 mgr[py] Module prometheus has missing NOTIFY_TYPES member
Oct 26 17:04:01 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:01.656+0200 7f1372b95500 -1 mgr[py] Module rbd_support has missing NOTIFY_TYPES member
Oct 26 17:04:01 cloud1 ceph-mgr[3400125]: context.c:56: warning: mpd_setminalloc: ignoring request to set MPD_MINALLOC a second time
Oct 26 17:04:01 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:01.884+0200 7f1372b95500 -1 mgr[py] Module selftest has missing NOTIFY_TYPES member
Oct 26 17:04:01 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:01.948+0200 7f1372b95500 -1 mgr[py] Module snap_schedule has missing NOTIFY_TYPES member
Oct 26 17:04:02 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:02.056+0200 7f1372b95500 -1 mgr[py] Module status has missing NOTIFY_TYPES member
Oct 26 17:04:02 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:02.124+0200 7f1372b95500 -1 mgr[py] Module telegraf has missing NOTIFY_TYPES member
Oct 26 17:04:02 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:02.240+0200 7f1372b95500 -1 mgr[py] Module telemetry has missing NOTIFY_TYPES member
Oct 26 17:04:02 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:02.344+0200 7f1372b95500 -1 mgr[py] Module test_orchestrator has missing NOTIFY_TYPES member
Oct 26 17:04:02 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:02.484+0200 7f1372b95500 -1 mgr[py] Module volumes has missing NOTIFY_TYPES member
Oct 26 17:04:02 cloud1 ceph-mgr[3400125]: 2023-10-26T17:04:02.540+0200 7f1372b95500 -1 mgr[py] Module zabbix has missing NOTIFY_TYPES member

Code:
cluster:
    id:     9cd85f3a-7480-4095-8066-47efc660e635
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum cloud1,cloud2,cloud3 (age 20h)
    mgr: cloud3(active, since 2h), standbys: cloud2, cloud1
    osd: 6 osds: 6 up (since 6M), 6 in (since 19M)
 
  data:
    pools:   2 pools, 129 pgs
    objects: 369.82k objects, 1.4 TiB
    usage:   3.9 TiB used, 1.7 TiB / 5.6 TiB avail
    pgs:     129 active+clean
 
  io:
    client:   341 B/s rd, 1.0 MiB/s wr, 0 op/s rd, 67 op/s wr

Code:
proxmox-ve: 7.4-1 (running kernel: 5.15.102-1-pve)
pve-manager: 7.4-3 (running version: 7.4-3/9002ab8a)
pve-kernel-5.15: 7.3-3
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph: 16.2.11-pve1
ceph-fuse: 16.2.11-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-4
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-1
libpve-rs-perl: 0.7.5
libpve-storage-perl: 7.4-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.1-1
proxmox-backup-file-restore: 2.4.1-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.6.5
pve-cluster: 7.3-3
pve-container: 4.4-3
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-1
pve-firewall: 4.3-1
pve-firmware: 3.6-4
pve-ha-manager: 3.6.0
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-1
qemu-server: 7.4-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1

Any ideas on what is causing the problem?

Thanks!
 
I'm experiencing this as well. The port disappears when the manager fails over to another node, and the newly active manager does not bind 8003 for some reason. I tried disabling and re-enabling the restful module, to no avail. I have no idea how to debug this.

Has anyone else experienced this issue?
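
One way to start narrowing this down, offered as a rough sketch rather than a confirmed diagnosis: mgr modules are generally served by the active manager only, and the status output earlier in the thread lists cloud3 as active, so that is the host where 8003 would be expected. The two helper functions below are hypothetical names, not part of Ceph. They only filter text on stdin, assuming the JSON shape printed by `ceph mgr services` and the usual column layout of `ss -tln` or `netstat -tln`.

```shell
#!/bin/sh
# Hypothetical helpers for checking where the restful module is published.
# Neither function calls ceph itself; both read command output on stdin.

# Print the URL published for the "restful" module, if any.
# Expects the JSON object printed by `ceph mgr services`.
restful_endpoint() {
    sed -n 's/.*"restful"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed if a socket listing contains a local address ending in :PORT.
# Expects the output of `ss -tln` or `netstat -tln`.
port_is_listening() {
    grep -q ":$1[[:space:]]"
}

# Example against a live cluster (not run here):
#   ceph mgr services | restful_endpoint
#   ss -tln | port_is_listening 8003 || echo "nothing listening on 8003"
```

If `restful_endpoint` prints nothing, the active manager has not published the module at all, which points at the module failing to start rather than at a firewall or binding problem.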
 
Hi,

I had an issue with that as well.
In my case the manager had failed over to a node that did not have a generated certificate. (Certificates can be set per node or for the whole cluster.)
Running ceph config-key ls | grep 'mgr/restf' should show the current config.
After fixing the missing certificate I had to disable and re-enable the module:
ceph mgr module disable restful;sleep 5;ceph mgr module enable restful
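
The check described above can be sketched as a small filter. This is only an illustration: `has_restful_cert` is a made-up helper name, and the key names it looks for (`mgr/restful/crt` and `mgr/restful/key` for the cluster, `mgr/restful/<id>/crt` and `mgr/restful/<id>/key` per node) are assumed from the restful module documentation, so verify them against your own `ceph config-key ls` output.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the key listing on stdin contains both a
# certificate and a private key usable by the given mgr id, either
# cluster-wide or per node. Key names are an assumption (see text above).
has_restful_cert() {
    mgr_id=$1
    # Strip JSON punctuation so the array printed by `ceph config-key ls`
    # becomes one bare key name per line.
    keys=$(sed 's/[][", ]//g')
    for part in crt key; do
        printf '%s\n' "$keys" |
            grep -qxF -e "mgr/restful/$part" -e "mgr/restful/$mgr_id/$part" ||
            return 1
    done
    return 0
}

# Example against a live cluster (not run here), for the active mgr cloud3:
#   ceph config-key ls | has_restful_cert cloud3 || echo "cloud3: no cert/key"
# If it is missing, a self-signed certificate can be generated with
#   ceph restful create-self-signed-cert
# followed by the disable/enable cycle shown above.
```

Running the check for every manager in the cluster, not just the active one, shows in advance which nodes would lose the API after a failover.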
 
