Recent content by LOGINTechBlog

  1. Node randomly reboots

    Thank you for these instructions. Do I really need to destroy all managers and monitors, and then create them again? I just want to remove the old volume and create a new one with the new SSDs.
  2. Node randomly reboots

    Thank you for your reply. Replacing each of the 16 OSDs one at a time would take too long, so that is not possible. I think we need to get rid of the existing Ceph setup and create a fresh one. How can we proceed with that?
  3. Node randomly reboots

    After many tests and no real solution to the problem, I have decided to replace the built-in Samsung SSDs with new enterprise SSDs. So my question is: What is the best approach? Should I create a backup and destroy the existing Ceph setup? Or should I replace all the drives at once? The new...
  4. Node randomly reboots

    Thank you for your reply. I was very busy over the weekend. This is what I have done so far: updated the firmware of every server to the most recent version; updated the Proxmox server to the most recent version; wiped the 3 crashed SSDs and added them back to Ceph as OSDs; deactivated NOCD for the SSDs in...
  5. Node randomly reboots

    Here is another update on this issue. During the backup I saw that some OSDs were reported as unavailable but came back within a few seconds. I am wondering what could cause this. 2025-01-31T21:26:37.709202+0100 mgr.node04 (mgr.11276218) 38817 : cluster [DBG] pgmap v38807: 193 pgs: 12...
  6. Node randomly reboots

    The problem with our Ceph storage is quite new. It came on top of everything else last night; when I opened this thread, Ceph was still okay. Since yesterday, one node has stopped and marked 3 of its 4 disks out. I tried to mark them in again and start them, but had no luck. Here is the status: pveceph status...
  7. Node randomly reboots

    Here are some of the logs and information you asked for: Cluster information ------------------- Name: cluster01 Config Version: 4 Transport: knet Secure auth: on Quorum information ------------------ Date: Wed Jan 29 13:17:03 2025 Quorum provider...
  8. Node randomly reboots

    Thanks for your reply. I am talking about the nodes (physical servers). They just reboot without any hint of the cause. Quote: Jan 22 22:49:14 node04 smartd[1264]: Device: /dev/sdd [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 63 to 60 Jan 22 23:00:05 node04...
  9. Node randomly reboots

    Thank you for your reply! The server hardware is: Supermicro AS1015-A MT - 192GB RAM ECC - 4 x Samsung 4TB SSD - AMD Ryzen 9 7900X - 2 x LAN on board - Dual 10GB LAN Intel X550-T2 & 2 x USB-C network adapter. Three VMs are running on this machine: Windows SQL Server agent: 1 bios: ovmf boot...
  10. Node randomly reboots

    The hardware is new, so defective hardware is rather unlikely but not impossible. What’s strange is that I’ve seen this issue on some of the other four servers as well.
  11. Node randomly reboots

    Hi everyone. First, some information about the setup we are running: • 4 x Proxmox nodes (version 8.3.2) with Ceph installed – cluster without HA • Separate networks for Ceph (2 x 10GB), Corosync (1GB), and Backup (1GB) - 2 switches (10GB & 1GB) • 1 x Proxmox Backup...