Thanks, that helped me for now.
I just learned that the SSDs are not meant for enterprise use. Since we just got a batch of new servers with enterprise SSDs and more NICs, I will migrate to them ASAP.
Thanks for your reply and help!
Losing data is not the same as losing write access.
In an erasure coded pool, if you lose more than m OSDs in an affected PG, you lose data.
If fewer than min_size OSDs are up, you lose write access to the placement group.
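To make that distinction concrete, here is a tiny sketch (a hypothetical helper of my own, not a Ceph API) that classifies a PG's state from k, m, min_size and the number of surviving OSDs:

```python
def pg_state(k, m, min_size, osds_up):
    """Classify a PG in a k+m erasure coded pool by surviving OSD count."""
    if osds_up < k:          # fewer than k chunks left: lost more than m OSDs
        return "data loss"
    if osds_up < min_size:   # data is intact, but writes are blocked
        return "no write access"
    return "active"

# Example: k=4, m=2, min_size = k+1 = 5, pool size = 6
print(pg_state(4, 2, 5, 6))  # active
print(pg_state(4, 2, 5, 4))  # no write access (2 OSDs down)
print(pg_state(4, 2, 5, 3))  # data loss (3 OSDs down > m)
```

Note that "no write access" is recoverable once OSDs come back; "data loss" is not.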
Although they were about replicated pools (so no EC), the following reads might serve as a hint why, outside of experiments/lab setups, it's not a good idea to go against the recommendations...
With size=min_size you cannot lose a single OSD without losing write access to the affected objects.
And it has nothing to do with number of nodes or number of OSDs.
Yes. In erasure coded pools with m=2 you can lose 2 OSDs of one PG at the same time without losing data.
The same can be achieved in replicated pools with size=3: you can lose 2 OSDs of a PG without losing its data.
This is not recommended and certainly not HA. With m=1 you cannot lose a single disk without losing write access.
An erasure coded pool should have size=k+m and min_size=k+1, which in your case would be size=3 and min_size=3.
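As a sketch of that rule of thumb (the helper name is mine; k=2, m=1 is assumed from the profile being discussed):

```python
def ec_pool_settings(k, m):
    # Recommended settings for a k+m erasure coded pool:
    # size must cover all chunks, min_size keeps one chunk of headroom.
    return {"size": k + m, "min_size": k + 1}

print(ec_pool_settings(2, 1))  # {'size': 3, 'min_size': 3}
print(ec_pool_settings(4, 2))  # {'size': 6, 'min_size': 5}
```

With k=2, m=1 the computed min_size equals size, which is exactly why a single OSD failure already blocks writes in that layout.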
No no no. You got your math wrong.
To achieve the same availability as EC with k=6 and m=2 you need triple replication (three copies) meaning a storage efficiency of 33%. It is rarely necessary to go beyond 4 copies.
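The storage-efficiency comparison can be checked with a couple of lines (helper names are mine, not a Ceph API):

```python
def ec_efficiency(k, m):
    # Usable fraction of raw capacity in a k+m erasure coded pool.
    return k / (k + m)

def replica_efficiency(size):
    # Usable fraction of raw capacity in a replicated pool.
    return 1 / size

# Both layouts tolerate 2 simultaneous OSD failures:
print(ec_efficiency(6, 2))              # 0.75 -> 75% usable
print(round(replica_efficiency(3), 2))  # 0.33 -> 33% usable
```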
"lower" and "higher" are subjective. Ceph achieves HA using raw capacity.
Suit yourself; this is not a recommended deployment. You would be far better served by just having two SEPARATE VMs, each serving all those functions, without any Ceph at all...
The number of OSDs isn't relevant to a pool as long as it is larger than the minimum required by the CRUSH rule. For example, with an EC profile of k=8, m=2 you need a minimum of 10 OSDs DISTRIBUTED ACROSS 10 NODES, so 1 OSD per node...
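A quick sanity check of that sizing (hypothetical helper; it assumes the usual default of one chunk per failure domain):

```python
def min_failure_domains(k, m):
    # With crush-failure-domain=host, each of the k+m chunks must land
    # on a distinct host, so you need at least k+m hosts (and OSDs).
    return k + m

print(min_failure_domains(8, 2))  # 10
```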
Hi,
most likely you installed PBS on top of a vanilla Debian installation? In that case the vanilla Debian kernel should be uninstalled, e.g. via apt remove linux-image-amd64 'linux-image-6.1*'
Hi Victor,
One other 'temporary' thing you may configure, if there is a critical need for all OSDs to be up, is to change the allocation size for each OSD from 64k to 4k using the 'bluestore_shared_alloc_size' parameter [0], which you can...
The failure domain must never be the OSD.
With failure domain = host you only have one copy or one chunk of the erasure coded object in one host. All the other copies or chunks live on other hosts.
That is why you need at least three hosts for...
You don't need PCI passthrough for LXC; you just need to install the proper NVIDIA driver for the hardware and kernel deployed. You are better off creating an installation script, especially if you intend to have multiple nodes with GPUs...
I think you need to carefully consider what your end goal is. PCIe passthrough is not a good citizen in a PVE cluster, since VMs with PCIe pins not only cannot move anywhere, but are also liable to hang the host. If you MUST use PCIe passthrough...
In a cluster you don't need, or even want, to back up a host. Everything important lives in /etc/pve, which exists on all nodes. If you DID back up hosts, you'd open the possibility of restoring a node that had been removed from the cluster and...
iSCSI is deprecated in the Ceph project and should not be used any more.
And there is no need to backup a single Proxmox node (if you have a cluster).
You may want to backup the VM config files but everything else is really not that important...
Ceph can deploy NVMe-oF gateways. You need to find hardware that is able to boot from that.
Or you use PXE network boot, where the initrd contains everything necessary to continue with a Ceph RBD as the root device.