osd

  1. Ceph OSD woes after NVMe hotplug

    We're in the process of validating a PVE cluster setup that will be deployed to prod some time in 2026, and for that purpose, we've spun up the MVC (Minimum Viable Cluster) that mimics, except in node count, what we're planning to have by then. As a result, we have three modern Dell boxen with...
  2. Hyperconverged cluster logging seemingly random CRC errors

    We have 4 nodes (dual Xeon CPUs, 256GB RAM, 4 NVMe SSDs, 4 HDDs and dual Mellanox 25Gb/s SFPs) in a cluster. I have randomly started noticing CRC errors in the OSD logs. Node B, osd.6: 2025-10-23T10:32:59.808+0200 7f22a75bf700 0 bad crc in data 3330350463 != exp 677417498 from...
  3. Impact of Changing Ceph Pool hdd-pool size from 2/2 to 3/2

    Scenario: I have a Proxmox VE 8.3.1 cluster with 12 nodes, using Ceph as distributed storage. The cluster consists of 96 OSDs, distributed across 9 servers with SSDs and 3 with HDDs. Initially, my setup had only two servers with HDDs, and now I need to add a third node with HDDs so the pool can...
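
    A minimal sketch of the command sequence such a change usually involves, assuming the pool really is named hdd-pool as in the title (the post is truncated, so treat the name as illustrative):

      # check the current replication settings
      ceph osd pool get hdd-pool size
      ceph osd pool get hdd-pool min_size

      # go from 2/2 to 3/2: a third replica is created for every PG,
      # which triggers backfill onto the new HDD node
      ceph osd pool set hdd-pool size 3
      ceph osd pool set hdd-pool min_size 2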
  4. Ceph Expected Scrubbing, Remap and Backfilling Speeds Post Node Shutdown/Restart

    Good morning. While we were doing upgrades to our cluster (upgrading memory on each node from 256 to 512 GB - 3 identical nodes), doing one node at a time with all VMs removed from HA and switched off, we noticed that after a node comes back online it takes approximately 20-30 minutes for the Remap/Scrub/Clean...
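
    For planned node reboots like this, the usual way to avoid unnecessary remapping is to set the noout flag beforehand; a generic sketch, not taken from the thread:

      # before shutting the node down: keep its OSDs "in" while they are down
      ceph osd set noout

      # ...memory upgrade, reboot, node rejoins...

      # afterwards: restore normal behaviour and watch recovery finish
      ceph osd unset noout
      ceph -s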
  5. Ceph: Behavior when a node fails

    Good day, I have a conceptual question regarding Ceph's behavior when a node fails. Scenario: 3+ Ceph nodes in a 3/2 configuration; the Ceph storage, incl. CephFS, is 75+% full. When a node suddenly fails, Ceph begins to redistribute the PGs, or rather...
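
    For reference, these are the checks typically used to judge whether the remaining nodes can re-create the lost replicas at all (generic commands, not taken from the post); with the default "host" failure domain, a 3/2 pool on exactly three nodes cannot rebuild the third copy elsewhere and stays degraded until the failed node returns:

      ceph osd df tree          # per-OSD fill level and capacity
      ceph df                   # pool usage vs. available raw space
      ceph osd crush rule dump  # failure domain of the replicated rule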
  6. Ceph - Which is faster/preferred?

    I am in the process of ordering new servers for our company to set up a 5-node cluster with all NVMe. I have a choice of either going with (4) 15.3TB drives or (8) 7.68TB drives. The cost is about the same. Are there any advantages/disadvantages in relation to Proxmox/Ceph performance? I think I...
  7. Ceph OSD drives disconnect when I move LXC storage to it

    Hi, I’m running Plex in an LXC container with the root disk on my local-zfs storage. However, when I try to move the storage to my ceph pool, my local OSD drives disconnect during the process. I tried doing something similar with a larger VM disk (300GB) with no issues. Likewise, when I move...
  8. Removing OSD on failed hardware leaves the OSD service

    Hello all, here is the situation: we have a Ceph cluster on top of Proxmox on Dell hardware. One of the Dell virtual disks failed, hence the corresponding OSD failed. This is an HDD, not NVMe, and "thankfully" the BlueStore data was not split out onto local NVMe disks. Anyway, we followed the...
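
    The stale service unit normally goes away once the OSD is fully purged from the cluster and the unit is disabled on the host; a sketch with osd.7 standing in for whatever ID actually failed:

      # remove the dead OSD from the CRUSH map, auth database and OSD map
      ceph osd purge 7 --yes-i-really-mean-it

      # on the node that hosted it: stop and disable the leftover unit
      systemctl stop ceph-osd@7.service
      systemctl disable ceph-osd@7.service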
  9. Help moving Ceph network

    I tried to move my Ceph network to another subnet and now all OSDs are not picking up the new network and are staying down. I fear I may have hosed Ceph, but as a learning experience, I would like to see if I can recover them. This is what they look like in the cluster: this is the contents of...
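
    When the subnet changes, both network definitions in /etc/pve/ceph.conf must match the new addressing and every daemon needs a restart; a sketch with made-up subnets:

      # /etc/pve/ceph.conf (illustrative addresses)
      [global]
          public_network  = 10.10.10.0/24
          cluster_network = 10.10.20.0/24

    Note that the monitor addresses (mon_host and the per-mon sections) live in the same file; monitors bind to fixed IPs, so moving the public network usually means destroying and re-creating each monitor on the new subnet before the OSDs will come up.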
  10. [PLEASE HELP] I can't create a Ceph OSD.

    Hi, I am trying to create a Ceph OSD (Squid) on a hard drive from the Proxmox web GUI, but it failed. Please tell me how to fix it! Here's the log: https://pastebin.mozilla.org/5zwd5RVA Thank you in advance!
  11. [SOLVED] On node crash, OSD is down but stays "IN" and all VMs on all nodes stay in error and unusable.

    Hello, I work for multiple clients, and one of them wanted us to create a Proxmox cluster to give them fault tolerance and a good, cost-efficient hypervisor. It's the first time we've put a Proxmox cluster into a production environment for a client; we've only used single-node Proxmox. Client...
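
    In this situation I/O usually stalls only while the dead OSD is down but still "in"; by default the monitors mark it out after roughly 10 minutes. A sketch of the two settings worth checking (the pool name is a placeholder):

      # how long a down OSD stays "in" before being marked out (default 600 s)
      ceph config get mon mon_osd_down_out_interval

      # a pool whose min_size equals its size blocks writes as soon as one replica is lost
      ceph osd pool get <pool-name> min_size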
  12. Ceph Reweight

    Hello everyone! I have a question regarding Ceph on Proxmox. I have a Ceph cluster in production and would like to rebalance my OSDs, since some of them are reaching 90% usage. My pool was manually set to 512 PGs with the PG Autoscale option OFF, and now I've changed it to PG Autoscale ON. I...
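
    For the rebalancing itself, the usual approach is a dry run of reweight-by-utilization followed by the real command; the threshold of 120 below is the default, not a value from the post:

      # show which OSDs would be reweighted, without changing anything
      ceph osd test-reweight-by-utilization 120

      # apply it: reduce the weight of OSDs more than 20% above average utilization
      ceph osd reweight-by-utilization 120

      # check the resulting distribution
      ceph osd df tree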
  13. [SOLVED] 2 stuck OSDs in the Ceph database

    I tried to remove all OSDs from a cluster and recreate them, but 2 of them are still stuck in the Ceph configuration database. I have done all the standard commands to remove them, but the reference stays. # ceph osd crush remove osd.1 removed item id 1 name 'osd.1' from crush map # ceph osd...
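
    For context, the full removal sequence that normally clears every reference looks like this (shown for osd.1 as in the snippet; on recent releases purge replaces the last three steps):

      ceph osd out osd.1
      systemctl stop ceph-osd@1.service      # on the node that owns it
      ceph osd crush remove osd.1
      ceph auth del osd.1
      ceph osd rm osd.1
      # or, in one step:
      ceph osd purge 1 --yes-i-really-mean-it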
  14. [SOLVED] Removing Ceph DB disk

    Hello, I've added some more drives to our 3-node Ceph cluster, started creating OSDs and accidentally created an LVM Ceph (DB) disk instead of an OSD. I do not need a separate DB disk. How can I destroy it and re-create it as a regular OSD? Actually, I made the same mistake on two nodes. Here's the output...
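
    Assuming nothing else lives on that drive, the accidental DB volume can be wiped with ceph-volume and the disk re-created as a normal OSD (the device name is a placeholder; this destroys everything on it):

      # destroy the LVM metadata and signatures left by the DB volume
      ceph-volume lvm zap /dev/sdX --destroy

      # then create the OSD as usual
      pveceph osd create /dev/sdX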
  15. OSD with iSCSI

    Hi, could anyone help with how to configure an OSD with iSCSI?
  16. [SOLVED] Ceph Reef OSD still shuts down

    Hi everyone, I'm working with a 3-node cluster running Ceph 17 and I'm about to upgrade. I also added a new node to the cluster and installed Ceph 18.2. The first OSD I'm creating seems OK, yet after a few moments it is shut down. In the logs, here is what I can find: May 18 15:34:44 node4...
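
    The log excerpt is cut off, so the cause can't be pinned down here; the usual first diagnostics on a mixed 17/18 cluster would be along these lines (the OSD ID is a placeholder):

      journalctl -u ceph-osd@12.service -n 100   # why the daemon exited
      ceph versions                              # which daemons run 17.x vs 18.x
      ceph osd dump | grep require_osd_release   # minimum release the cluster enforces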
  17. Having trouble clearing some Ceph warnings: Reduced data availability & slow ops

    Hey all, I'm having trouble clearing some warnings from my Ceph cluster. 1.) HEALTH_WARN: Reduced data availability: 1 pg inactive; pg 1.0 is stuck inactive for 5m, current state unknown, last acting [] 2.) HEALTH_WARN: 2 slow ops, oldest one blocked for 299 sec, daemons [osd.0,osd.1] have...
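
    A pg that is "unknown" with an empty acting set usually means no OSD currently satisfies its CRUSH rule; the first checks typically look like this (the pg ID comes from the warning above, the rest is generic):

      ceph health detail
      ceph pg map 1.0     # which OSDs CRUSH wants for this pg
      ceph osd tree       # are those OSDs up and in?
      ceph pg 1.0 query   # may hang while the pg has no serving OSD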
  18. BlueStore error opening DB

    Hello, I'm managing a cluster of 4 nodes using Proxmox 7.4-17 with Ceph. After a messy shutdown caused by a long power outage, a couple of VM images were corrupted, but we could restore them from backup. However, two of the OSD services refuse to come back, showing this kind of error: Apr 18...
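
    When an OSD's internal database won't open after a power loss, the standard first step (with the OSD stopped) is a BlueStore consistency check; a sketch using osd.3 as a stand-in for the real ID:

      systemctl stop ceph-osd@3.service
      ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-3
      # only if fsck reports repairable errors:
      ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-3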
  19. [SOLVED] Ceph configuration - best practices

    Long story short... not possible. I'm planning to install and use Ceph and HA in my working cluster environment. I have 4 nodes with 256GB RAM each and 2x10G NICs dedicated to Ceph cluster network traffic. I already know this may not be enough for perfect performance, so I'm planning to swap NICs...
  20. [SOLVED] Ceph OSD adding issues

    Greetings, community! After a few months of using Ceph with Proxmox, I decided to add a new disk and got stuck with this issue. ceph version 17.2.7 (2dd3854d5b35a35486e86e2616727168e244f470) quincy (stable) Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /usr/bin/ceph --cluster...