osd

  1. M

    On node crash, OSD is down but stays "IN" and all vm's on all nodes keep in error and unusable.

    Hello, I work for multiple clients, and one of them wanted us to create a Proxmox cluster to give them fault tolerance and a good, cost-efficient hypervisor. This is the first time we have put a Proxmox cluster into a production environment for a client; so far we've only used single-node Proxmox. Client...
  2. M

    CEPH Reweight

    Hello everyone! I have a question regarding CEPH on PROXMOX. I have a CEPH cluster in production and would like to rebalance my OSDs since some of them are reaching 90% usage. My pool was manually set to 512 PGs with the PG Autoscale option OFF, and now I've changed it to PG Autoscale ON. I...
  3. L

    [SOLVED] 2 stuck OSD's in ceph database

    I tried to remove all OSDs from a cluster and recreate them, but 2 of them are still stuck in the ceph configuration database. I have run all the standard commands to remove them, but the reference stays. # ceph osd crush remove osd.1 removed item id 1 name 'osd.1' from crush map # ceph osd...
  4. H

    [SOLVED] Removing ceph DB disk

    Hello, I've added some more drives to our 3 node ceph cluster, started creating OSDs and accidentally created an LVM,CEPH (DB) disk instead of an OSD. I do not need a separate DB disk. How can I destroy it and re-create it as a regular OSD? Actually, I made the same mistake on two nodes. Here's output...
  5. M

    OSD with iSCSI

    Hi, could anyone help me configure an OSD with iSCSI?
  6. F

    [SOLVED] CEPH Reef osd still shutdown

    Hi everyone, I'm working with a 3 node cluster running ceph 17 and I'm about to upgrade. I also added a new node to the cluster and installed ceph 18.2. The first OSD I'm creating seems OK, yet after a few moments it shuts down. Here is what I can find in the logs: May 18 15:34:44 node4...
  7. M

    Having trouble clearing some ceph warnings.. Reduced data availability & Slow ops

    Hey all, I'm having trouble clearing some warnings from my ceph cluster. 1.) HEALTH_WARN: Reduced data availability: 1 pg inactive pg 1.0 is stuck inactive for 5m, current state unknown, last acting [] 2.) HEALTH_WARN: 2 slow ops, oldest one blocked for 299 sec, daemons [osd.0,osd.1] have...
  8. P

    Bluestore erroring opening db

    Hello, I'm managing a cluster of 4 nodes using proxmox 7.4-17 with CEPH. After a messy shutdown caused by a long power outage, a couple of VM images were corrupted, but we could restore them from backup. However, two of the OSD services refuse to come back, showing this kind of error: Apr 18...
  9. I

    [SOLVED] Ceph configuration - best practices

    Long story short... not possible. I'm planning to install and use ceph and HA in my working cluster environment. I have 4 nodes with 256GB RAM each and 2x10G NICs dedicated for ceph cluster network traffic. I already know this may not be enough for perfect performance so I'm planning to swap NICs...
  10. A

    [SOLVED] Ceph OSD adding issues

    Greetings community! After a few months of using ceph with Proxmox, I decided to add a new disk and got stuck on this issue. ceph version 17.2.7 (2dd3854d5b35a35486e86e2616727168e244f470) quincy (stable) Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /usr/bin/ceph --cluster...
  11. Y

    Ceph - 4 Node NVME Cluster Recommendations

    I need to build a 4 node Ceph cluster and I need an effective capacity of 30TB. Each node is configured like this: Supermicro 2U Storage Server 24 x NVME, X11DPU, Dual 1600W 2 x Intel Xeon Gold 6240 18 Core 2.6Ghz Processor 768G Ram 2x 100G Network 6 x Samsung PM9A3 3.84TB PCIe 4.0 2.5...
  12. N

    [SOLVED] What is Quay.io ?

    Hi, when I try to attach a new osd to my ceph cluster, I get an error regarding the link https://quay.io/v2/ I would like to know where this error comes from and why. And what is quay.io/v2/ actually used for? Does ceph retrieve information from the remote server? Thanks in advance Error...
  13. UdoB

    [SOLVED] FYI: do not extend Ceph with OSDs connected via USB

    Just written down for your amusement, on a lazy, dark and rainy Sunday afternoon: Someone (me) might try to extend a Ceph cluster by adding NVMe (or other SSDs) via USB3 Adapters. For a small homelab this should be feasible, shouldn't it? My three PVE nodes already run a single Ceph OSD on an...
  14. G

    OSD rebalance at 1Gb/s over 10Gb/s network?

    Hi, I'm trying to build a hyper-converged 3 node cluster with 4 OSDs each on proxmox but I'm having some issues with the OSDs... First one is the rebalance speed: I've noticed that, even over a 10Gbps network, ceph rebalances my pool at 1Gbps at most, but iperf3 confirms that the link is effectively...
  15. L

    Hyperconverged Proxmox + Ceph Cluster - how to reconnect the right disk to nodes

    Hi, I had created a 3-node Proxmox cluster with 3 Lenovo M720Q (for simplicity I call the nodes N1, N2 and N3). Then I added 4 disks (D1, D2, D3 and D4). All was working fine. Then I moved all the SFF PCs and the disks from my desk to the rack, but unfortunately I did not write down the...
  16. H

    Ceph failed OSD disk replace

    Hello, We have a failed OSD disk to replace on a production server. We have to hot replace it using remote hands in the data center. Are these steps correct? 1. Set global OSD flags: noout/norebalance/nobackfill 2. Out and destroy failed disk 3. Wipe failed disk under Disks /dev/sdx 4. Physically...
  17. C

    Replace all SSDs on a 3 nodes cluster

    Hi everyone, I have a 3 nodes cluster running on PVE 6.4 with a total of 24 SSDs with CEPH. Considering that: - the cluster can be brought to a total stop - each node is more than capable to host all the machines - new SSDs are bigger (from 960GB to 1.92TB) - I'd highly prefer to not stress the...
  18. M

    PVE 7 to 8: VM crashes after migrating, OSD not found

    I run a 3-node PVE with CEPH. I migrated all VMs away from node 3, upgraded to the latest CEPH (Quincy) and then started the PVE 7 to 8 upgrade on node 3. After rebooting node 3 (now PVE 8), everything seemed to work well. So I migrated two VMs, one each from node 1 (still on PVE 7) and node 2...
  19. L

    Virtual machines freezes with no console vnc output.

    Hi! I have a cluster of pve 7.2.7 and ceph 16.2.9. About a month ago, some virtual machines in my cluster with CIS rolled out on them (https://github.com/ansible-lockdown/UBUNTU20-CIS) started to hang for no reason. Only resetting the machine helps. There is nothing in the logs of the machines...
  20. J

    Kernel Crashes on IO error on dev dm-X

    Hey People, I have some devastating issues with my Backup Ceph Filesystem. It started with a failed disk some days ago, which set it into recovery/rebuild mode. It's based on two erasure coding pools, one being 8/7 and the other 12/11 with a total of 16 disks. I didn't worry about anything as...