Hello,
I have a 4-node cluster running PVE 7.4.16 with Ceph 17.2.6. When I reboot one of the nodes, the VMs halt and I cannot reboot, start, reset, or shut them down until all OSDs are up and in again, because the Ceph OSDs are degraded. It makes no difference whether the noout flag is set. Ceph runs 3 monitors and 3 managers on the nodes that host VMs; the 4th node runs no VMs, only Ceph.
There are no errors in syslog; only the Ceph manager (ceph-mgr) log shows this:
Sep 17 17:08:41 pve ceph-mgr[1578]: 2023-09-17T17:08:41.715+0300 7f88f4bae000 -1 mgr[py] Module pg_autoscaler has missing NOTIFY_TYPES member
Sep 17 17:08:41 pve ceph-mgr[1578]: 2023-09-17T17:08:41.810+0300 7f88f4bae000 -1 mgr[py] Module status has missing NOTIFY_TYPES member
Sep 17 17:08:41 pve ceph-mgr[1578]: 2023-09-17T17:08:41.892+0300 7f88f4bae000 -1 mgr[py] Module osd_support has missing NOTIFY_TYPES member
Sep 17 17:08:42 pve ceph-mgr[1578]: 2023-09-17T17:08:42.106+0300 7f88f4bae000 -1 mgr[py] Module alerts has missing NOTIFY_TYPES member
Sep 17 17:08:42 pve ceph-mgr[1578]: 2023-09-17T17:08:42.443+0300 7f88f4bae000 -1 mgr[py] Module telegraf has missing NOTIFY_TYPES member
Sep 17 17:08:42 pve ceph-mgr[1578]: 2023-09-17T17:08:42.583+0300 7f88f4bae000 -1 mgr[py] Module selftest has missing NOTIFY_TYPES member
Sep 17 17:08:42 pve ceph-mgr[1578]: 2023-09-17T17:08:42.816+0300 7f88f4bae000 -1 mgr[py] Module prometheus has missing NOTIFY_TYPES member
Sep 17 17:08:42 pve ceph-mgr[1578]: 2023-09-17T17:08:42.968+0300 7f88f4bae000 -1 mgr[py] Module test_orchestrator has missing NOTIFY_TYPES member
Sep 17 17:08:43 pve ceph-mgr[1578]: 2023-09-17T17:08:43.140+0300 7f88f4bae000 -1 mgr[py] Module telemetry has missing NOTIFY_TYPES member
Sep 17 17:08:43 pve ceph-mgr[1578]: 2023-09-17T17:08:43.208+0300 7f88f4bae000 -1 mgr[py] Module progress has missing NOTIFY_TYPES member
Sep 17 17:08:43 pve ceph-mgr[1578]: 2023-09-17T17:08:43.416+0300 7f88f4bae000 -1 mgr[py] Module orchestrator has missing NOTIFY_TYPES member
Sep 17 17:08:43 pve ceph-mgr[1578]: 2023-09-17T17:08:43.484+0300 7f88f4bae000 -1 mgr[py] Module influx has missing NOTIFY_TYPES member
Sep 17 17:08:43 pve ceph-mgr[1578]: 2023-09-17T17:08:43.556+0300 7f88f4bae000 -1 mgr[py] Module devicehealth has missing NOTIFY_TYPES member
Sep 17 17:08:43 pve ceph-mgr[1578]: 2023-09-17T17:08:43.788+0300 7f88f4bae000 -1 mgr[py] Module nfs has missing NOTIFY_TYPES member
Sep 17 17:08:43 pve ceph-mgr[1578]: 2023-09-17T17:08:43.856+0300 7f88f4bae000 -1 mgr[py] Module zabbix has missing NOTIFY_TYPES member
Sep 17 17:08:43 pve ceph-mgr[1578]: context.c:56: warning: mpd_setminalloc: ignoring request to set MPD_MINALLOC a second time
Sep 17 17:08:43 pve ceph-mgr[1578]: 2023-09-17T17:08:43.956+0300 7f88f4bae000 -1 mgr[py] Module rbd_support has missing NOTIFY_TYPES member
Sep 17 17:08:44 pve ceph-mgr[1578]: 2023-09-17T17:08:44.164+0300 7f88f4bae000 -1 mgr[py] Module volumes has missing NOTIFY_TYPES member
Sep 17 17:08:44 pve ceph-mgr[1578]: 2023-09-17T17:08:44.236+0300 7f88f4bae000 -1 mgr[py] Module osd_perf_query has missing NOTIFY_TYPES member
Sep 17 17:08:44 pve ceph-mgr[1578]: 2023-09-17T17:08:44.312+0300 7f88f4bae000 -1 mgr[py] Module crash has missing NOTIFY_TYPES member
Sep 17 17:08:44 pve ceph-mgr[1578]: 2023-09-17T17:08:44.392+0300 7f88f4bae000 -1 mgr[py] Module snap_schedule has missing NOTIFY_TYPES member
Sep 17 17:08:44 pve ceph-mgr[1578]: 2023-09-17T17:08:44.456+0300 7f88f4bae000 -1 mgr[py] Module iostat has missing NOTIFY_TYPES member
Sep 17 17:08:44 pve ceph-mgr[1578]: 2023-09-17T17:08:44.532+0300 7f88f4bae000 -1 mgr[py] Module balancer has missing NOTIFY_TYPES member
Sep 18 00:00:57 pve ceph-mgr[1578]: 2023-09-18T00:00:57.218+0300 7f88f0b48700 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror cephfs-mirror (PID: 279750) UID: 0
Sep 18 00:00:57 pve ceph-mgr[1578]: 2023-09-18T00:00:57.238+0300 7f88f0b48700 -1 received signal: Hangup from (PID: 279751) UID: 0
Sep 19 00:00:57 pve ceph-mgr[1578]: 2023-09-19T00:00:57.210+0300 7f88f0b48700 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror cephfs-mirror (PID: 1221089) UID: 0
Sep 19 00:00:57 pve ceph-mgr[1578]: 2023-09-19T00:00:57.226+0300 7f88f0b48700 -1 received signal: Hangup from (PID: 1221090) UID: 0
Could some configuration be missing in the cluster?