then you have only one copy of your data left if problems occur.
min_size does not refer to monitors, but to data copies. Data is served by nodes via their OSDs. You must distinguish between the hypervisor cluster and the Ceph cluster / OSD nodes.
If you are planning to lose more than one node at a time, then...
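For illustration, the replication settings of a pool can be inspected and changed like this (the pool name `mypool` is just a placeholder):

```shell
# Show the current replication settings of a pool
ceph osd pool get mypool size
ceph osd pool get mypool min_size

# With size=3 and min_size=2, the pool keeps serving I/O as long
# as at least two copies of each object are still available
ceph osd pool set mypool min_size 2
```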
IIRC you had better set the CRUSH weight of the "leaving" OSD to zero, so that the weight of the host is also altered.
Otherwise a second rebalancing occurs due to the altered host weight after destroying the OSD. Setting the OSD to out on its own does not alter the host weight (IIRC).
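A sketch of that approach (the OSD ID `osd.7` is only an example; replace it with your own):

```shell
# Drain the OSD by setting its CRUSH weight to zero;
# this also reduces the weight of its host bucket,
# so there is only one rebalance instead of two
ceph osd crush reweight osd.7 0

# Wait for the rebalance to finish before destroying the OSD
ceph -s
```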
Maybe misconfigured offloading features hurt the performance. I have no details here, but I often read about TSO / LRO / checksum offloading. Maybe better turn it off in the VM.
IIRC ethtool -K or something.
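A hedged sketch (the interface name `ens18` is just an example; check your VM's NIC name first):

```shell
# Show which offloading features are currently enabled
ethtool -k ens18

# Disable the features most often blamed for such problems:
# TSO, LRO, GRO and rx/tx checksum offloading
ethtool -K ens18 tso off lro off gro off rx off tx off
```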
I think you have to "zap" your disk. But that's a guess only.
Maybe this is the solution for you:
But please double-check the device, because the command does what it says: it destroys!
ceph-volume lvm zap --destroy /dev/sdb
If you have enough time and spare capacity, this is a stress-free approach. Roughly as you described it.
1. Set the OSD to out, not to stop.
Then the Ceph cluster reorganizes (rebalances) itself. (Only works with enough free capacity.) If you...
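The first step might look like this (the OSD ID `osd.3` is a placeholder):

```shell
# Mark the OSD out so Ceph migrates its data elsewhere;
# the OSD daemon keeps running while the data drains
ceph osd out osd.3

# Watch the rebalance progress
ceph -s
```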
In the GUI, at cluster level in the storage configuration, you can set the block size of your ZFS pool. This size is chosen for new "disks", as in your example.
If the backup does not fit, it is often because the block size is too big.
But be warned, that may lead to bad performance.
I can't give you technical details, but maybe it has to do with fragmentation.
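For reference, the corresponding entry in /etc/pve/storage.cfg might look like this (the storage name `local-zfs`, the pool path, and the 8k value are just examples):

```
zfspool: local-zfs
        pool rpool/data
        blocksize 8k
        content images,rootdir
```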
As I understand it, not only the total amount of free RAM is important, but also free RAM in the right category and in the right size (i.e. contiguous blocks of the right order).
Jan 20 04:13:50 proxmox-1 kernel: [86074.425514] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB...
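You can inspect this yourself: /proc/buddyinfo lists, per memory zone, how many free blocks of each order are left, in the same format as the kernel message above:

```shell
# Each column is the number of free blocks of increasing order
# (4kB, 8kB, 16kB, ...); many small blocks but few or no large
# ones is a sign of memory fragmentation
cat /proc/buddyinfo
```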
I have not checked all details of this thread, so take this as a silly question/hint only: do the source and the target have the same timezone configured?
I saw a 1-hour difference between the timestamps.
And yesterday I had a weird problem with Ceph/OSD (a wholly different topic, I know) due to different times. (reason...
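A quick way to compare (a hint only; timedatectl should be available on systemd-based hosts like Proxmox VE):

```shell
# Run on both source and target and compare the output
timedatectl
date
```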
If you have major problems with your Proxmox VE host, e.g. hardware issues, it could be helpful to just copy the pmxcfs database file /var/lib/pve-cluster/config.db and move it to a new Proxmox VE host. On the new host (with nothing running), you need to stop the pve-cluster service...
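A sketch of how such a copy might look (the paths are from the post; the hostname `newhost` is an assumption, and the exact procedure should be double-checked against the official documentation):

```shell
# On the old host (or from a rescue system): stop the cluster
# filesystem first so the SQLite database is in a consistent state
systemctl stop pve-cluster
scp /var/lib/pve-cluster/config.db newhost:/root/config.db

# On the new host (with nothing running): stop pve-cluster,
# put the database in place, then start it again
systemctl stop pve-cluster
cp /root/config.db /var/lib/pve-cluster/config.db
systemctl start pve-cluster
```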
If you have enough storage, you can change the volblocksize parameter in the storage configuration of the cluster and move the disk away and back. Then the volblocksize of the new zvol reflects the changed value.
As for the question of the best values, I think you have to benchmark...
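A hedged sketch of that procedure (VM ID 100, disk `scsi0`, and the storage names are placeholders; check the `qm` and `pvesm` man pages for your version):

```shell
# Change the block size used for NEW zvols on this storage
pvesm set local-zfs --blocksize 16k

# Move the disk to another storage and back; each move creates
# a new zvol with the currently configured volblocksize
# (--delete removes the old copy after a successful move)
qm move-disk 100 scsi0 other-storage --delete
qm move-disk 100 scsi0 local-zfs --delete
```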
Today I used on the command line:
ceph crash ls
That gives a list of archived and new crashes.
Then you can archive your crash:
ceph crash archive 2020-10-29_03:47:12.641232Z_843e3d9d-bc56-46dc-8175-9026fa7f44a4
(you must replace the id with your values of course)
(Sorry, did not see that you...
As I understand it, you want the VMs to use the bridge, so you must ping from the VMs.
Proxmox itself doesn't "use" the bridge, in my opinion.
Regarding sysctl.conf: you must look yourself; it was only a hint. I do not know the right values for your situation.