Search results

  1. Proxmox Ceph

    public_network is what the Ceph clients use to connect to the Ceph data. I believe you need to move your public_network to your 10Gb/s subnet
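
    A minimal sketch of what that change could look like in /etc/pve/ceph.conf, assuming the 10Gb/s subnet is 10.10.10.0/24 (the subnet is an assumption, not from the thread); monitors and OSDs have to be restarted or re-created afterwards to pick up the new network:

      [global]
          # assumed 10Gb/s subnet used by Ceph clients
          public_network = 10.10.10.0/24
          # optionally a separate replication network, otherwise leave it the same
          cluster_network = 10.10.10.0/24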
  2. Proxmox Ceph cluster - MLAG switches choice

    I also use MikroTiks for my network, but MLAG was unreliable when I tested it a few years ago, so I used the l3hw capabilities of the switches and created a redundant routed ECMP configuration. Works great for me...
  3. Ceph Cluster Won’t Finish Rebuild (Many PGs Stuck in active+cleaned+remapped) — Need Help!

    Hi @ralphte I am not a Ceph expert, but to me it looks as if you removed and re-added a host with the existing OSDs to the cluster. If you do that, the host might change its ID, which will make the CRUSH algorithm produce a different desired placement for a lot of PGs. I see you don't have any...
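
    A few read-only commands that could help confirm whether the re-added host got a new CRUSH bucket ID and which PGs are still remapped (illustrative only, not from the thread):

      ceph osd tree                            # host buckets and their (negative) IDs
      ceph osd crush dump | less               # full CRUSH map, including bucket IDs
      ceph pg dump pgs_brief | grep remapped   # PGs that still want to move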
  4. what is the point of CEPH, if there is no HA?

    Hi @jpv31 Not sure about the GUI, but you should be able to move all VMs from a bad pve1 server to a good pve2 server from the CLI using a command like 'mv /etc/pve/nodes/pve1/qemu-server/*.conf /etc/pve/nodes/pve2/qemu-server/'. Then you should be able to start them. The /etc/pve is a...
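
    A sketch of that recovery on the surviving node, using the node names from the quoted post (pve1 dead, pve2 good); this works because /etc/pve is a cluster-wide filesystem, and it must be writable, i.e. the cluster must be quorate or forced with 'pvecm expected 1':

      # on pve2, take over the VM configs registered to the dead node
      mv /etc/pve/nodes/pve1/qemu-server/*.conf /etc/pve/nodes/pve2/qemu-server/
      # then start a moved VM; VMID 100 is a placeholder
      qm start 100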
  5. How to disable quorum mechanism in a cluster ?

    Yes, that's when you would need the 'pvecm expected 1' for the recovery. Better IMHO than using 'pvecm expected 1' permanently...
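
    A minimal sketch of that recovery path, run on the single surviving node (nothing here beyond the quoted command itself comes from the thread):

      pvecm status        # confirm the cluster has lost quorum
      pvecm expected 1    # temporarily lower expected votes so this node becomes quorate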
  6. How to disable quorum mechanism in a cluster ?

    In theory, you should be able to modify the corosync configuration (/etc/pve/corosync.conf) and assign the node that you want to keep enough votes that it alone has quorum regardless of the other ones (e.g. set its quorum_votes to 4 and leave the other 3 nodes with quorum_votes of 1...
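
    A hedged sketch of what that could look like in the nodelist section of /etc/pve/corosync.conf; node names and addresses are placeholders, and config_version has to be bumped for the change to propagate:

      nodelist {
        node {
          name: pve1
          nodeid: 1
          quorum_votes: 4
          ring0_addr: 192.168.1.11
        }
        node {
          name: pve2
          nodeid: 2
          quorum_votes: 1
          ring0_addr: 192.168.1.12
        }
        # remaining nodes keep quorum_votes: 1
      }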
  7. CEPH REEF 16 pgs not deep-scrubbed in time and pgs not scrubbed in time

    Not a huge Ceph expert, just a long-time user of my home lab cluster; you pick up various small things as you go... I set up my cluster when there was no autoscale functionality; there used to be different calculators, which is how I know it is supposed to be a few dozen PGs per OSD...
  8. CEPH REEF 16 pgs not deep-scrubbed in time and pgs not scrubbed in time

    Maybe it is just not able to finish all the scrubbing operations for all your PGs in two weeks? I had that when scrubbing stopped due to some OSD issues, and it took maybe a week or more after the issue was fixed for all the PGs to get scrubbed... Also, to me 97 PGs for 64 OSDs seems too low... They say...
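
    Rough arithmetic behind "too low", assuming a replicated pool with size 3 (the pool size is an assumption, not stated in the snippet):

      97 PGs x 3 replicas / 64 OSDs ≈ 4.5 PG copies per OSD, versus the usual guideline of very roughly 100 per OSD
      a classic calculator would suggest 64 x 100 / 3 ≈ 2133, rounded to a power of two, i.e. pg_num = 2048
      ceph osd df    # the PGS column shows the actual per-OSD count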
  9. Ceph and TBW with consumer grade SSD

    Yep. I have run my cluster in this configuration for more than 7 years and am still waiting for something to happen... It's a lab anyway; I actually wanted to see what could go wrong as a learning exercise, but it just works...
  10. Ceph and TBW with consumer grade SSD

    I have a 3-node Proxmox/Ceph cluster with consumer-grade NVMe SSDs and it works fine, and I use a dozen or so different VMs. I just checked the 1TB Intel 660p and Crucial P1 that I started using in 2019; one of them has 108 TB written, the other 126 TB. Basically that is less than 2/3 of their...
  11. RE-IP Proxmox VE and Ceph

    @complexplaster27 For each OSD you configure the address to use, something like below (I use different cluster and public subnets, but you can just use the same address for both). Then you restart that OSD (osd.8) to start using the new IP address. Check that everything works, the cluster still...
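
    A sketch of the kind of per-OSD override the post describes, in /etc/pve/ceph.conf; the addresses are placeholders, not from the snippet, and osd.8 is the daemon named in the post:

      [osd.8]
          public_addr  = 10.20.0.18
          cluster_addr = 10.20.0.18

      # then restart just that daemon so it rebinds to the new address
      systemctl restart ceph-osd@8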
  12. Ceph Health Error & pgs inconsistent

    I think you should give it some time for the deep scrubbing to finish and fix the inconsistencies.
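
    If waiting is not enough, some commonly used commands for inspecting and repairing an inconsistent PG (the PG ID 2.1a is a placeholder, not from the thread):

      ceph health detail                  # lists the inconsistent PGs
      rados list-inconsistent-obj 2.1a    # shows which objects/shards disagree (needs a completed deep scrub)
      ceph pg repair 2.1a                 # asks the primary OSD to repair the PG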
  13. RE-IP Proxmox VE and Ceph

    @complexplaster27 No, it's not related to the Proxmox network. When you restart an OSD configured to use the new subnet, you will need it to be able to communicate with the OSDs that are still on the old subnet. Same with the MONs: when you re-create one, the new one still needs to communicate with the old ones...
  14. RE-IP Proxmox VE and Ceph

    Hi @complexplaster27 I responded to similar questions on the forum. I believe you will have to ensure there is routing between your old and new subnets for the duration of the transition. You are correct: you just modify those parameters appropriately as you go (I believe there is also...
  15. Guidance: Installing second network and re-configuring Ceph

    I would say the very first step would be to configure your new network and ensure the hosts can talk to each other on it. I believe that can all be done from the GUI and is not related to Ceph at all. You will also need to enable routing between the old subnet and the new subnet. During the...
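
    A hedged example of what that first step could look like on each Proxmox host, in /etc/network/interfaces; the interface name and subnet are placeholders, and the change can be applied with 'ifreload -a' or from the GUI:

      auto ens1f0
      iface ens1f0 inet static
          address 10.10.10.11/24
          # no gateway here; the default gateway stays on the existing network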
  16. Remove and re-add Ceph OSD

    OK, the GUI command should do the zap. And if you were able to re-create the OSD from the GUI, that means the disks were zapped (first sectors zeroed). The command is 'ceph-volume lvm zap ...' I can see that your OSDs are assigned some class 'os'. Not sure where it came from; maybe you were playing...
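
    For reference, hedged versions of the commands mentioned; the device path and OSD ID are placeholders, and 'ssd' is just an assumed target class:

      ceph-volume lvm zap /dev/sdX --destroy      # wipe LVM/partition data from the old OSD disk
      ceph osd crush rm-device-class osd.5        # clear the odd 'os' class
      ceph osd crush set-device-class ssd osd.5   # re-assign the expected class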
  17. Remove and re-add Ceph OSD

    Hi @vaschthestampede Did you zap the destroyed OSD? I believe the 'pveceph osd destroy' command does this, but if you use the regular ceph commands it needs to be done manually...
  18. 4 node cluster - very slow network/ceph speeds

    I know that in certain MikroTik misconfigurations the traffic needs to be processed by the CPU, resulting in very poor performance... You can check if this is the case by running 'system resource monitor' on the MikroTik, then your iperf3, and watching the CPU. In a properly configured MikroTik there should be no...
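
    A sketch of that check; the addresses are placeholders, and the point is simply to watch the switch CPU while test traffic flows:

      # on the MikroTik console
      /system resource monitor

      # on one Proxmox node
      iperf3 -s
      # on another node, towards the first one
      iperf3 -c 10.10.10.11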
  19. Ceph: Import OSD to a new node

    Here are my notes from moving an OSD between hosts, which I did a couple of years ago; you might try the lvm activate command and see if that helps...
    On the old host:
      - Stop the OSD
      - Mark it OUT
      - lvchange -a n $VG/$LV
      - vgexport $VG
    Move the disk, then:
      - lsblk
      - vgscan
      - vgimport $VG
      - vgchange -a y $VG...
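
    The activation step on the new host would presumably look something like this (the OSD ID and fsid are placeholders, readable from the list command):

      ceph-volume lvm list             # find the OSD ID and fsid on the imported VG
      ceph-volume lvm activate --all   # or: ceph-volume lvm activate <osd-id> <osd-fsid>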
  20. Cheap Homelab NUC Cluster with Ceph.

    I have run a 3-node Proxmox/Ceph cluster for several years now. I use cheap refurbished Dell desktops with consumer-grade NVMe and Solarflare 10Gb adapters (though now my preference is the Intel X520). It works fine regarding the cluster itself, live-migrating the VMs and doing host maintenance without...