osd

  1. P

    Bluestore erroring opening db

    Hello, I'm managing a cluster of 4 nodes using proxmox 7.4-17 with CEPH. After a messy shutdown caused by a long power outage, a couple of VM images were corrupted, but we could restore them from backup. However, two of the OSD services refuse to come back, showing this kind of error: Apr 18...
  2. I

    [SOLVED] Ceph configuration - best practices

    Long story short... not possible. I'm planning to install and use Ceph and HA in my working cluster environment. I have 4 nodes with 256GB RAM each and 2x10G NICs dedicated to Ceph cluster network traffic. I already know this may not be enough for perfect performance, so I'm planning to swap NICs...
  3. A

    [SOLVED] Ceph OSD adding issues

    Greetings community! After a few months of using Ceph from Proxmox, I decided to add a new disk and got stuck on this issue. ceph version 17.2.7 (2dd3854d5b35a35486e86e2616727168e244f470) quincy (stable) Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /usr/bin/ceph --cluster...
  4. Y

    Ceph - 4 Node NVME Cluster Recommendations

    I need to build a 4 node Ceph cluster and I need an effective capacity of 30TB. Each node is configured like this: Supermicro 2U Storage Server 24 x NVME, X11DPU, Dual 1600W 2 x Intel Xeon Gold 6240 18 Core 2.6Ghz Processor 768G Ram 2x 100G Network 6 x Samsung PM9A3 3.84TB PCIe 4.0 2.5...
  5. N

    [SOLVED] What is Quay.io ?

    Hi, when I try to attach a new OSD to my Ceph cluster, I get an error regarding the link https://quay.io/v2/ I would like to know where this error comes from and why. And what is the real use of quay.io/v2/? Does Ceph retrieve information from the remote server? Thanks in advance Error...
  6. UdoB

    [SOLVED] FYI: do not extend Ceph with OSDs connected via USB

    Just written down for your amusement, on a lazy, dark and rainy Sunday afternoon: Someone (me) might try to extend a Ceph cluster by adding NVMe (or other SSDs) via USB3 adapters. For a small homelab this should be feasible, shouldn't it? My three PVE nodes already run a single Ceph OSD on an...
  7. G

    OSD rebalance at 1Gb/s over 10Gb/s network?

    Hi, I'm trying to build a hyper-converged 3-node cluster with 4 OSDs each on Proxmox, but I'm having some issues with the OSDs... The first one is the rebalance speed: I've noticed that, even over a 10Gbps network, Ceph rebalances my pool at a maximum of 1Gbps, but iperf3 confirms that the link is effectively...
  8. L

    Hyperconverged Proxmox + Ceph Cluster - how to reconnect the right disk to nodes

    Hi, I created a 3-node Proxmox cluster with 3 Lenovo M720Q (for simplicity I call the nodes N1, N2 and N3). Then I added 4 disks (D1, D2, D3 and D4). All was working fine. Then I moved all the SFF PCs and the disks from my desk to the rack, but unfortunately I did not write down the...
  9. H

    Ceph failed OSD disk replace

    Hello, we have a failed OSD disk to replace on a production server. We have to hot-replace it using remote hands in the data center. Are these steps correct? 1. Set global OSD flags: noout/norebalance/nobackfill 2. Out and destroy failed disk 3. Wipe failed disk under Disks /dev/sdx 4. Physically...
  10. C

    Replace all SSDs on a 3 nodes cluster

    Hi everyone, I have a 3-node cluster running on PVE 6.4 with a total of 24 SSDs with CEPH. Considering that: - the cluster can be brought to a total stop - each node is more than capable of hosting all the machines - new SSDs are bigger (from 960GB to 1.92TB) - I'd highly prefer to not stress the...
  11. M

    PVE 7 to 8: VM crashes after migrating, OSD not found

    I run a 3-node PVE with CEPH. I migrated all VMs away from node 3, upgraded to the latest CEPH (Quincy) and then started the PVE 7 to 8 upgrade on node 3. After rebooting node 3 (now PVE 8), everything seemed to work well. So I migrated two VMs, one each from node 1 (still on PVE 7) and node 2...
  12. L

    Virtual machines freezes with no console vnc output.

    Hi! I have a cluster of PVE 7.2.7 and Ceph 16.2.9. About a month ago, some virtual machines in my cluster with CIS rolled out on them (https://github.com/ansible-lockdown/UBUNTU20-CIS) started to hang for no reason. Only resetting the machine helps. There is nothing in the logs of the machines...
  13. J

    Kernel Crashes on IO error on dev dm-X

    Hey people, I have some devastating issues with my backup Ceph filesystem. It started with a failed disk some days ago, which set it into recovery/rebuild mode. It's based on two erasure coding pools, one being 8/7 and the other 12/11, with a total of 16 disks. I didn't worry about anything as...
  14. C

    [SOLVED] Ceph health warning: unable to load:snappy

    Hello, after a server crash I was able to repair the cluster. The health check looks OK, but there's this warning for 68 OSDs: unable to load:snappy. All OSDs are located on the same cluster node. Therefore I checked the version of the related file libsnappy1v5; it was 1.1.9. Comparing this file...
  15. G

    Ceph: actual used space?

    Hi, I'm running a Proxmox 7.2-7 cluster with Ceph 16.2.9 "Pacific". I can't tell the difference between Ceph > Usage and Ceph > Pools > Used (see screenshots). Can someone please explain what the actual space used in my Ceph storage is? Do you think that a 90% used pool is potentially dangerous...
  16. H

    Adding smaller size OSDs to ceph cluster

    Hello, currently we have a Ceph cluster of 6 nodes; 3 of the nodes are dedicated Ceph nodes. Proxmox build 7.2-4. There are 8 x 3.84 TiB drives in each Ceph node (24 total in three nodes). We are running out of space in the Ceph pool, with 86%-87% usage. We currently do not have additional spare...
  17. B

    Is it possible to recover CEPH when proxmox is dead ?

    Hello, sorry, I'm French and I don't speak English fluently, so excuse my bad writing. I was wondering if I could rebuild a Ceph cluster without having the original monitors/manager, so I set up a test lab to try it. In this case, the OSD data is preserved. I made this test simple...
  18. I

    [SOLVED] ceph problem - Reduced data availability: 15 pgs inactive

    Proxmox 7.1-8. Yesterday I executed a large delete operation on the ceph-fs pool (around 2 TB of data). The operation ended successfully within a few seconds (without any noticeable errors), and then the following problem occurred: 7 out of 32 OSDs went down and out. Trying to set them in and...
  19. 0

    Ceph ghost OSDs

    Hi all, after an upgrade, Proxmox would not start and I had to reinstall it completely. I made a backup of the config but presumably missed something: ceph.mon keeps crashing and 4 OSDs appear as ghosts (out/down). Proxmox version: 7.2-3, Ceph version: 15.2.16. Any help appreciated!
  20. L

    Adding new Ceph OSD when using multiple Cluster LAN IPs

    We had problems adding disks as new Ceph OSDs with pveceph createosd /dev/sdX. The error was: command '/sbin/ip address show to '192.168.1.201/24 192.168.1.202/24' up' failed: exit code 1. The workaround was to temporarily deactivate the 2nd cluster IP in /etc/pve/ceph.conf: cluster_network =...