Ceph OSD socket closed

pablomart81

Does anyone know what causes this log message and how it affects the operation of Ceph?

Aug 26 07:46:03 pve01-poz kernel: libceph: osd10 (1)10.0.0.1:6827 socket closed (con state OPEN)
Aug 26 07:47:35 pve01-poz kernel: libceph: osd10 (1)10.0.0.1:6827 socket closed (con state OPEN)
Aug 26 07:50:38 pve01-poz kernel: libceph: osd10 (1)10.0.0.1:6827 socket closed (con state OPEN)
Aug 26 07:54:34 pve01-poz kernel: libceph: osd10 (1)10.0.0.1:6827 socket closed (con state OPEN)
Aug 26 07:55:52 pve01-poz kernel: libceph: osd10 (1)10.0.0.1:6827 socket closed (con state OPEN)
Aug 26 07:56:56 pve01-poz kernel: libceph: osd10 (1)10.0.0.1:6827 socket closed (con state OPEN)
Aug 26 07:57:11 pve01-poz kernel: libceph: osd10 (1)10.0.0.1:6827 socket closed (con state OPEN)
Aug 26 07:58:27 pve01-poz kernel: libceph: osd16 (1)10.0.0.2:6817 socket closed (con state OPEN)
Aug 26 07:59:15 pve01-poz kernel: libceph: osd10 (1)10.0.0.1:6827 socket closed (con state OPEN)
Aug 26 07:59:19 pve01-poz kernel: libceph: osd5 (1)10.0.0.2:6816 socket closed (con state OPEN)
Aug 26 08:00:03 pve01-poz kernel: libceph: osd10 (1)10.0.0.1:6827 socket closed (con state OPEN)
Aug 26 08:03:11 pve01-poz kernel: libceph: osd10 (1)10.0.0.1:6827 socket closed (con state OPEN)
Aug 26 08:04:05 pve01-poz kernel: libceph: osd15 (1)10.0.0.1:6809 socket closed (con state OPEN)
Aug 26 08:12:01 pve01-poz kernel: libceph: osd3 (1)10.0.0.1:6807 socket closed (con state OPEN)
 
Could you provide more information about the issues you are experiencing? Does pveceph status / ceph -s report any warnings or errors? Otherwise this log message can have a variety of causes, e.g. network issues, firewall settings, differing kernel/Ceph versions between nodes, etc.

Edit: You could also look at Ceph's logs for more information about the OSDs that are facing issues.
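For example, a few checks that would be a good starting point (using osd10 from your log as an example; run the journalctl command on the node that hosts that OSD):

# overall cluster health
pveceph status
ceph -s
# recent log output of the affected OSD daemon
journalctl -u ceph-osd@10 --since "1 hour ago"
# or follow the OSD's log file directly
tail -f /var/log/ceph/ceph-osd.10.log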
 
I have made an enquiry with HP; the servers are HP and we pay for hardware support.
It seems that the latest firmware version they had installed has caused hardware problems: fans running out of control, network cards that do not work correctly, and disk controller issues, which I imagine is where this problem comes from.
HP's behaviour is really a disgrace: they forced us to update the BIOS, otherwise they would drop support, even though everything worked fine with the previous version.

Regarding the OSD and cluster status, everything looks fine, but operation is slow even though latencies are very low; we are talking about 1/1 or 2/2.
There is no overload in the cluster either.
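In case it helps anyone following along: assuming those 1/1 and 2/2 figures are the per-OSD apply/commit latencies (in ms) shown in the GUI, the same numbers can be confirmed from the CLI:

# per-OSD commit and apply latency in ms
ceph osd perf
# make sure no OSD is down or out
ceph osd tree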

One of the problems we detected is that, because of the update, one of the network cards was running at 1G instead of 10G, and one of its ports did not even bring up the link LED. We worked around this by moving the port to another of the Ceph cards, which runs at 10G.
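For anyone hitting the same problem: the negotiated link speed can be verified per interface with ethtool (the interface name below is just an example; substitute your Ceph interface):

# negotiated speed, duplex and link state
ethtool ens1f0 | grep -E 'Speed|Duplex|Link detected'
# brief overview of all interfaces
ip -br link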

I'll keep you posted if you're interested. If you need any further clarification, let me know.

Thank you very much in advance
 
