Maybe useful for people coming back to this thread one day:
Take a look at the kernel.org thread which describes this bug including examples from Proxmox and QEMU. There is also an excellent writeup here.
TL;DR: This is an excellent walkthrough of the requirements for corosync tuning in larger clusters by @fweber (which is in fact part of the thread linked by @bbgeek17)
A status update on this:
Two corosync parameters that are especially relevant for larger clusters are the "token timeout" and the "consensus timeout". When a node goes offline, corosync (or rather the totem protocol it implements) will need to...
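For anyone who wants to play with the numbers, here is a rough sketch of how those two timeouts scale with the node count. It assumes the formula from the corosync.conf man page (effective token = token + (nodes - 2) * token_coefficient once the nodelist has at least 3 nodes, and consensus defaulting to 1.2 times the effective token when it is not set explicitly). The 3000 ms and 650 ms values are assumed defaults, and the function names are made up for illustration, so check your own corosync.conf and version before relying on it.

```python
def effective_token_ms(token_ms: int, token_coefficient_ms: int, nodes: int) -> int:
    """Effective token timeout in milliseconds.

    Assumption: the coefficient is only applied when the nodelist
    contains at least 3 nodes, as described in corosync.conf(5).
    """
    if nodes >= 3:
        return token_ms + (nodes - 2) * token_coefficient_ms
    return token_ms


def default_consensus_ms(effective_token: int) -> float:
    """Consensus timeout when not configured explicitly (assumed 1.2 x token)."""
    return 1.2 * effective_token


if __name__ == "__main__":
    # Assumed defaults: token = 3000 ms, token_coefficient = 650 ms.
    for nodes in (3, 8, 16, 32):
        token = effective_token_ms(3000, 650, nodes)
        consensus = default_consensus_ms(token)
        print(f"{nodes:>2} nodes: token={token} ms, consensus={consensus:.0f} ms, "
              f"sum={token + consensus:.0f} ms")
```

The sum in the last column is only meant as a ballpark for how long the cluster may need to notice a lost node and agree on a new membership; it is not a measured value.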
@fstrankowski I'm fully aware of the risks of an OSD being full and know how to deal with that, but in no case should an OSD break because of that ;)
Fragmentation definitely has an impact on this, and I will watch it more closely from now on...
Best of luck to you * fingers crossed *. I had to rebuild the whole cluster in my client's case and fix Ceph by manually restoring placement groups, which was a pain.
This will only fix your problem in the short term. Fragmentation will come back relatively quickly. You'd better add more OSDs or wipe some data off your pools :-)
First of all, I'd like to raise concerns about the amount of available storage already being in use. By default, Ceph doesn't allow more than 80%, so you'd have to take precautions really soon while taking these concerns into consideration.
I'd highly...
It doesn't work by means of ZFS replication alone. My statement was completely correct, but poorly worded in this case, because, as mentioned, HA would then be required for that. Your ivory tower of didactic arrogance...
@bbgeek17 wrote two pieces which should cover your questions:
https://kb.blockbridge.com/technote/proxmox-lvm-shared-storage/
https://kb.blockbridge.com/technote/proxmox-qcow-snapshots-on-lvm/
Basically, snapshots on LVM/thick weren't supported...
Would you be so kind as to not post untested AI-generated answers?
Parts like this
are simply not correct. This would potentially increase the used memory for ARC.
If you want real HA, nothing works without shared storage. With ZFS replication you always have the delta between syncs, and in the worst case you have to move the VM/LXC config over by hand.
Welcome to the Forum!
You're lucky, Proxmox recently released an archive mirror which comes in handy for your "use case":
http(s)://archive.proxmox.com/debian/pve
Add that repository to your sources and you should be able to finish what you're...