huh. this may not only be a Proxmox thing. https://www.spinics.net/lists/ceph-users/msg73008.html
also, do it on all nodes in the cluster https://forum.proxmox.com/threads/ceph-dashboard-not-working-after-update-to-proxmox-7-from-6-4.104911/ and then restart the mgrs?
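for my own notes, the restart I have in mind is just bouncing each node's manager in turn, plus toggling the dashboard module, which is my guess at what that thread is describing - not sure it's the officially blessed sequence:

# run on each node in turn - restarts that node's ceph-mgr
systemctl restart ceph-mgr.target
# if the dashboard is still broken afterwards, toggling the module sometimes helps
ceph mgr module disable dashboard
ceph mgr module enable dashboard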
so it turned out that when I reinstalled Ceph on the two nodes that had had the "ghost" mons, each time the reinstall completed one of the ghost mons would disappear. I was able to create real mons and OSDs and everything seems fine right now, cross your fingers!
see above. Proxmox 7.3, Ceph Pacific. when trying to finish initializing the cluster, I get this:
no monitors are intentionally configured, but it still seems to "know" about two mons that were on two hosts which will not be part of the new Ceph cluster. those mons aren't there anymore and I...
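in case it helps anyone searching later: what I ended up poking at (not claiming this is the proper way) was checking which mons the cluster still believes exist and removing the stale entries by name, plus cleaning up leftovers in the Proxmox-managed config:

# see which mons the monmap still lists
ceph mon dump
# remove a stale/ghost mon by whatever name shows up in the dump (name here is a placeholder)
ceph mon remove <mon-id>
# also check /etc/pve/ceph.conf for leftover [mon.*] sections and stale mon_host entries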
So, I have since reformatted and reinstalled all three nodes and added a fourth. I am able to migrate a VM off of and back onto the problematic node (Orinoco), but if I try to create a new VM on it, I get all kinds of "timed out" errors. I don't know how to get it to be more specific...
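the only way I've found so far to get more detail out of those timeouts (may well be a better one) is to watch the PVE daemons' journal on Orinoco while reproducing it:

# watch the Proxmox daemons while reproducing the timeout
journalctl -f -u pvedaemon -u pvestatd -u pveproxy
# the failed task's full log also ends up under /var/log/pve/tasks/ on that node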
that benchmark tip is great, thank you - surprisingly, the Flash devices I thought were the fastest actually had the highest apply/commit latencies. now at least I know the culprits to replace.
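for anyone else reading, the check I ended up using (if I'm remembering the tip right) was just the built-in per-OSD latency counters, plus a single-OSD write bench - osd.3 below is just an example ID:

# per-OSD commit/apply latency - the high outliers are the slow devices
ceph osd perf
# optional: write benchmark against one OSD to confirm
ceph tell osd.3 bench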
not sure what you mean by non-standard. multiple OSDs per 2TB NVMe seems to be standard practice...
well, I "solved" my cephx problem (https://forum.proxmox.com/threads/new-install-cannot-create-ceph-osds-bc-of-keyring-error.119375/) by disabling cephx
now, however, the Ceph dashboard constantly complains about clock skew (ex: mon.ganges clock skew 0.131502s > max 0.05s (latency...
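my working assumption is that this is a time-sync problem rather than something to tune in Ceph, so I'm checking chrony on each node first; raising the warning threshold is what I'd treat as a last resort:

# check whether chrony is actually synced on each node
chronyc tracking
chronyc sources -v
# last resort: raise the warning threshold in ceph.conf, e.g.
# [global]
# mon_clock_drift_allowed = 0.1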
I think a lot of the time the desire for this is motivated by wanting a single pane of glass for management. I wonder if that's on the roadmap - managing multiple "datacenter" objects connected by one management system.
I tried copying the keyring file that does exist into the locations where it seems to be looking for it. Still getting this:
root@riogrande:~# ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1
--> DEPRECATION NOTICE
--> You are using the legacy automatic disk sorting behavior
--> The Pacific...
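what I was trying, in case anyone spots where I went wrong, was re-exporting the bootstrap-osd key into the stock path ceph-volume looks at (I'm assuming Proxmox doesn't use a different location):

# re-export the bootstrap-osd key to the location ceph-volume expects
mkdir -p /var/lib/ceph/bootstrap-osd
ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring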
hi - I have a new install of PVE with 4 nodes, all with Ceph installed. when I go to create OSDs, specifically two on one 2TB NVMe, I get this:
root@riogrande:/etc/pve/priv# ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1
--> DEPRECATION NOTICE
--> You are using the legacy automatic disk...
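one thing I've started doing since (may or may not be relevant to the actual error) is previewing what the batch command would create before letting it touch the disk:

# dry run: show the OSDs/LVs that would be created, without changing anything
ceph-volume lvm batch --osds-per-device 2 --report /dev/nvme0n1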
It is stable, but no VM or container started on the problematic node is able to use Ceph block device resources. It times out trying to access them, UNLESS the VM or container already existed when this problem started ~6 months ago. It is also not able to move containers or VMs off of Ceph...
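the quickest reproduction I have, in case it helps: running the same RBD commands directly on the problematic node vs. a healthy one - "vmpool" and "vm-100-disk-0" below are just stand-ins for whatever your pool and image are called:

# on the bad node these hang / time out, on the good nodes they return immediately
rbd -p vmpool ls
rbd -p vmpool info vm-100-disk-0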
root@ibnmajid:~# sudo systemctl status pve-cluster
● pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2022-12-01 04:55:58 CST; 1 day 1h ago
Main PID...
Hi, thanks so much for replying - I actually turned it down to 2/2 for more space. Haven't run into any issues with it yet, although of course I can always put it back up to 3/2 if that's a prerequisite for getting things migrated.
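for reference, putting it back to 3/2 should just be the pool's size/min_size settings - "vmpool" here is a stand-in for whatever the pool is actually called:

# check the current replication settings
ceph osd pool get vmpool size
ceph osd pool get vmpool min_size
# restore the defaults
ceph osd pool set vmpool size 3
ceph osd pool set vmpool min_size 2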
I can do that:
root@ibnmajid:~# pvecm status
Cluster...
hi,
I have had a problem with one node in my cluster for months now with no idea how to fix it. No one on this forum or the Ceph mailing list has ever replied, which makes me think no one has ever seen it before and no one has any idea how to troubleshoot it, so I am trying to figure out...
I don't have $900 a year to spend on my 3-node homelab cluster. I would actually be willing to pay for a ticket on an individual basis, but that isn't an option.