Hello everyone,
We are currently running 7 Proxmox servers, all on Supermicro mainboards, in a single cluster.
3 servers are used as a Ceph backend to store the VM images, so Ceph OSD is installed and running on 3 of the 7 servers.
The problem we are facing is that the syslogs of those 3 servers are being flooded with the following messages from Ceph:
/var/log/syslog:
Aug 9 04:25:31 scci-hv11 ceph-osd[921008]: 2021-08-09T04:25:31.202+0200 7f5a6e6d3700 -1 reset not still connected to 0x5615cbcc5790
Aug 9 04:25:31 scci-hv11 ceph-osd[921008]: 2021-08-09T04:25:31.202+0200 7f5a6e6d3700 -1 reset not still connected to 0x5615cbdb6b60
Aug 9 04:25:31 scci-hv11 ceph-osd[921008]: 2021-08-09T04:25:31.202+0200 7f5a6e6d3700 -1 reset not still connected to 0x5615cc33edd0
...
Aug 9 04:25:31 scci-hv11 ceph-osd[921008]: 2021-08-09T04:25:31.202+0200 7f5a6e6d3700 -1 reset not still connected to 0x56161314a750
Aug 9 04:25:31 scci-hv11 ceph-osd[921008]: 2021-08-09T04:25:31.202+0200 7f5a6e6d3700 -1 reset not still connected to 0x56161314aa90
Aug 9 04:25:31 scci-hv11 ceph-osd[921008]: 2021-08-09T04:25:31.202+0200 7f5a6e6d3700 -1 reset not still connected to 0x56161b84e000
The timing of the messages doesn't seem to follow a specific pattern. On one mailing list I read that the messages should occur at most every 15 minutes, but this is not the case here. We see them every 2-10 minutes, seemingly at random. According to the Ceph source code, this message is printed when Ceph attempts to close an unavailable socket.
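In case it helps anyone reproduce the timing observation, here is a rough Python sketch of how the intervals between message bursts could be measured. It assumes the syslog line format shown above. The function names and the 1-second `burst_gap` threshold are my own hypothetical choices, not anything taken from Ceph.

```python
import re
from datetime import datetime

# Matches the ISO timestamp that ceph-osd embeds in each
# "reset not still connected to" line (format taken from the excerpt above).
PATTERN = re.compile(
    r"ceph-osd\[\d+\]: (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}) "
    r"\S+ -1 reset not still connected to"
)

def burst_starts(lines, burst_gap=1.0):
    """Return the start time of each burst of messages.

    Messages that follow the previous one within `burst_gap` seconds
    (an assumed threshold) are counted as part of the same burst.
    """
    starts = []
    last = None
    for line in lines:
        match = PATTERN.search(line)
        if not match:
            continue  # skip unrelated syslog lines
        ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f%z")
        if last is None or (ts - last).total_seconds() > burst_gap:
            starts.append(ts)
        last = ts
    return starts

def gaps_in_minutes(times):
    """Return the gaps between consecutive burst starts, in minutes."""
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]
```

Passing the lines of /var/log/syslog to `burst_starts` and the result to `gaps_in_minutes` gives the list of intervals, which can then be compared against the 15-minute figure from the mailing list.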
So my question is: are these messages a sign of a bigger underlying problem? And if they are just expected debugging messages, how do I turn them off?
Thank you and kind regards!