Search results

  1. A

    CEPH - osd problems every night

    I see that this problem occurs in either mode, but it is possible to work around it - at least on our Dell servers. What I see is that SMART is working fine initially after adding a new disk. After I create an OSD on the disk, SMART will hang. If I then reboot the server to Dell "lifecycle...
  2. A

    CEPH - osd problems every night

    @dllex By the way, is the RAID controller set to HBA mode, or are you mapping single disks out of the controller?
  3. A

    CEPH - osd problems every night

    Thanks for the detailed explanation. I can confirm that this matches my initial problem. We observed this on two Dell R730s. The backplane and RAID controller were using the latest firmware, but after upgrading the BIOS to the latest version the problem seems to have gone away.
  4. A

    CEPH - osd problems every night

    Thanks for the suggestions. I've looked for things happening at this time, but can't really find anything. Ceph is running in its own VLAN, but I have other systems (including another - functional - Ceph cluster) running on other separate VLANs on the same switches. The only thing I see having...
  5. A

    CEPH - osd problems every night

    Not really. I've built the cluster from scratch, reinstalling Proxmox, and still see the same behaviour. A little after midnight (UTC time) the problem appears. The only progress I've made is that this is related to some of my SSDs, but not all of them. I've got appx. 20 Sandisk Cloudspeed Eco Gen...
  6. A

    CEPH - osd problems every night

    This is not related to backups. We are backing up to PBS 4 times a day, and the last backup finishes appx. 90 minutes before the problems start. As far as I know, there is nothing in particular happening at this time, but it occurs as regular as clockwork a few minutes past 2AM every night. I...
  7. A

    CEPH - osd problems every night

    Softly bumping this again. We're still seeing this problem, and any information as to what could cause ceph/osd problems at regular intervals every night would be appreciated.
  8. A

    CEPH - osd problems every night

    Normal latency on all interfaces is around 0.1ms; I have not yet stayed up at night to see what it might be when the problem starts. Also, the nodes do not go down. It is only ceph that experiences these problems. VMs with disks on ceph behave badly during the period with these problems...
  9. A

    CEPH - osd problems every night

    Bumping this. Any pointers as to why this is happening at exactly the same time (appx. 02:00) every night would be appreciated.
  10. A

    CEPH - osd problems every night

    I have a 5-node cluster running Proxmox 8.0.3. It has been running fine for several months without any problems. After rebooting 3 of the nodes last week, I've started to have problems with CEPH. Every night at around 02:05 (00:00 UTC) I see loads of error messages on all OSDs, and VMs lose...