libceph socket closed

kbechler

Active Member
Jul 16, 2019
Hello,

I've got a pretty simple 4-node, 8-OSD (2 OSDs per node) Ceph cluster installed on PVE 8.
Everything seems to be working fine, but on one of the nodes I see a lot of "libceph: socket closed" messages in dmesg:

Code:
[Thu Feb 20 09:52:54 2025] libceph: osd7 (1)10.40.0.111:6809 socket closed (con state OPEN)
[Thu Feb 20 09:53:14 2025] libceph: osd4 (1)10.40.0.106:6805 socket closed (con state OPEN)
[Thu Feb 20 09:53:20 2025] libceph: osd4 (1)10.40.0.106:6805 socket closed (con state OPEN)
[Thu Feb 20 09:53:25 2025] libceph: osd4 (1)10.40.0.106:6805 socket closed (con state OPEN)
[Thu Feb 20 09:53:26 2025] libceph: osd4 (1)10.40.0.106:6805 socket closed (con state OPEN)
[Thu Feb 20 09:53:32 2025] libceph: osd4 (1)10.40.0.106:6805 socket closed (con state OPEN)
[Thu Feb 20 09:53:33 2025] libceph: osd4 (1)10.40.0.106:6805 socket closed (con state OPEN)
[Thu Feb 20 09:53:33 2025] libceph: osd4 (1)10.40.0.106:6805 socket closed (con state OPEN)
[Thu Feb 20 09:54:05 2025] libceph: osd1 (1)10.40.0.108:6805 socket closed (con state OPEN)
[Thu Feb 20 09:54:11 2025] libceph: osd1 (1)10.40.0.108:6805 socket closed (con state OPEN)
[Thu Feb 20 09:54:25 2025] libceph: osd1 (1)10.40.0.108:6805 socket closed (con state OPEN)
[Thu Feb 20 09:54:43 2025] libceph: osd1 (1)10.40.0.108:6805 socket closed (con state OPEN)
[Thu Feb 20 09:54:44 2025] libceph: osd1 (1)10.40.0.108:6805 socket closed (con state OPEN)
[Thu Feb 20 09:54:45 2025] libceph: osd1 (1)10.40.0.108:6805 socket closed (con state OPEN)
[Thu Feb 20 09:54:58 2025] libceph: osd1 (1)10.40.0.108:6805 socket closed (con state OPEN)
[Thu Feb 20 09:55:01 2025] libceph: osd1 (1)10.40.0.108:6805 socket closed (con state OPEN)
[Thu Feb 20 09:55:03 2025] libceph: osd1 (1)10.40.0.108:6805 socket closed (con state OPEN)
[Thu Feb 20 09:55:10 2025] libceph: osd1 (1)10.40.0.108:6805 socket closed (con state OPEN)

All nodes have a very similar hardware setup (AMD EPYC, 25 Gbps NICs) and run the same 6.11.11-1-pve kernel (I was hoping that switching from 6.8 would help).
Could anyone tell me what this message means exactly, and how to debug it?
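
For debugging, would something along these lines be the right direction? This is just a rough sketch; the dynamic_debug step assumes the kernel was built with CONFIG_DYNAMIC_DEBUG, and it is very noisy, so it should be switched off again right away.

Code:
# Cluster state while the messages are appearing
ceph -s
ceph health detail

# Timestamped view of how often the messages actually show up
dmesg -T | grep libceph | tail -n 50

# Kernel Ceph clients on this node (CT volumes on RBD are mapped via krbd)
rbd showmapped
grep ceph /proc/mounts

# Optional and very noisy: verbose libceph messenger logging
echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control
# ...and back off again
echo 'module libceph -p' > /sys/kernel/debug/dynamic_debug/control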

Regards,
Konrad
 
Funny thing: we've migrated two of our containers to a different node, and these messages:
Code:
libceph: osd0 (1)10.40.0.108:6804 socket closed (con state OPEN)
are now reported by... another node (the one with the migrated CTs).

So it looks like there's some "issue" between Ceph and the CTs...
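
If I understand it correctly, CT volumes on RBD storage are mapped through the kernel client (krbd), so the libceph messages should follow whichever node holds the mapping. A quick way to confirm; the CT id 101 below is just a placeholder:

Code:
# Which storage/volume does the container use? (101 is a placeholder CT id)
pct config 101 | grep -E '^(rootfs|mp[0-9]+):'

# On each node: RBD images currently mapped by the kernel client
rbd showmapped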
 
Hi

I have the same problem:

Code:
Apr 24 17:26:10 sr1 kernel: libceph: osd5 (1)192.168.10.201:6812 socket closed (con state OPEN)
Apr 24 17:26:10 sr1 kernel: libceph: osd2 (1)192.168.10.203:6815 socket closed (con state OPEN)
Apr 24 17:26:10 sr1 kernel: libceph: osd2 (1)192.168.10.203:6815 socket closed (con state OPEN)
Apr 24 17:26:35 sr1 kernel: libceph: osd2 (1)192.168.10.203:6815 socket closed (con state OPEN)
Apr 24 17:26:35 sr1 kernel: libceph: osd5 (1)192.168.10.201:6812 socket closed (con state OPEN)
Apr 24 17:27:01 sr1 kernel: libceph: osd8 (1)192.168.10.205:6801 socket closed (con state OPEN)

This appeared after changing the HDDs to SSDs.

I have 3 nodes (sr1, sr3, sr5), but only one of them, sr1, reports this.
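
Maybe sr1 is simply the only node with kernel Ceph clients (krbd-mapped CT volumes or a kernel CephFS mount)? As far as I know, only nodes with such clients log libceph messages at all. A quick check to run on each node:

Code:
# Run on sr1, sr3 and sr5
rbd showmapped                                 # krbd mappings (e.g. CT volumes)
grep ceph /proc/mounts                         # kernel CephFS mounts
dmesg -T | grep -c 'libceph.*socket closed'    # message count so far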