The log lines you posted from the Proxmox host contain dhclient entries, which is unusual.
Can you show more details about the "crash"?
My suspicion is that dhclient on the host messes up the network config, which makes the host unreachable but does not actually crash it.
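If you want to verify that, a quick check on the host (assuming a standard Proxmox setup with ifupdown and a vmbr0 bridge):

  # is a dhclient process running on the host, and on which interface?
  ps aux | grep [d]hclient
  # the bridge is normally configured with a static address here
  grep -A 4 "iface vmbr0" /etc/network/interfaces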
OK, but then they need to run their own DHCP client and not the Proxmox host.
You have not posted log lines from the actual crash. Please set up remote syslogging or take a screen capture of the crash message.
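One way to do the remote logging, assuming rsyslog is installed on the host (the IP is a placeholder, and the receiving machine has to accept remote messages via imudp/imtcp):

  # /etc/rsyslog.d/90-remote.conf on the Proxmox host
  *.* @192.0.2.10:514
  # use @@ instead of @ for TCP, then restart the service
  systemctl restart rsyslog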
1. There is an option to create new MAC addresses when cloning a VM template.
2. Remove /etc/machine-id as a last step before creating the VM template. A new machine ID will be created at first boot.
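A rough sketch of that last step, run inside the VM just before shutdown (truncating also works, since systemd treats an empty /etc/machine-id as uninitialized; the D-Bus symlink is how most Debian/Ubuntu images handle it):

  # reset the machine ID so every clone generates its own at first boot
  truncate -s 0 /etc/machine-id
  # many images keep the D-Bus machine ID as a symlink to it
  rm -f /var/lib/dbus/machine-id
  ln -s /etc/machine-id /var/lib/dbus/machine-id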
cloud-init configures an account and the network settings inside a newly cloned VM.
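For reference, a minimal example of how that is wired up on the Proxmox side (VM ID, storage name and credentials are placeholders):

  # attach a cloud-init drive to the template/clone and set user, password and network
  qm set 9000 --ide2 local-lvm:cloudinit
  qm set 9000 --ciuser admin --cipassword 'changeme' --ipconfig0 ip=dhcp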
What is the connection to CephFS or RBD that is not working for you?
Yes, because the assumed failure domain is the host. If just a single OSD fails, it should simply be replaced.
In small clusters the time to replace a failed disk matters more than in larger clusters, where the data can be re-replicated more easily to the remaining OSDs on the other nodes.
There is a neat calculator at https://florian.ca/ceph-calculator/ that will show you how to set the nearfull ratio for a specific number of disks and nodes.
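The idea behind it: after a node failure its data has to be re-replicated onto the remaining nodes, so with e.g. 4 equal nodes you should not fill the raw capacity much beyond 3/4. If the calculator suggests something around 0.66, setting it looks like this (the numbers are placeholders, not a recommendation):

  ceph osd set-nearfull-ratio 0.66
  # the full ratio has to stay above the nearfull ratio
  ceph osd set-full-ratio 0.75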
All these applications already replicate their data at the application level. They do not need a storage system that replicates it again.
Run these VMs on local storage and you will get way better performance than with Ceph at size=1.
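Moving an existing disk off Ceph can be done live; a sketch with placeholder IDs and a local-lvm storage (the subcommand is spelled move_disk, move-disk or disk move depending on the Proxmox version):

  # move scsi0 of VM 101 to local storage and delete the source afterwards
  qm move-disk 101 scsi0 local-lvm --delete 1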
If you keep the smaller disks, the larger ones will get 8 times the IOPS, because disk size is one of the factors Ceph's CRUSH algorithm uses to distribute the data.
The NVMe disks may be able to handle that.
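For a feel for the numbers: CRUSH weights roughly track size in TiB, so a 1 TB OSD sits at about 0.9 and an 8 TB OSD at about 7.3, which means roughly eight times the data, and therefore roughly eight times the I/O, lands on the bigger disk. You can check the weights and the resulting fill with:

  ceph osd df tree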
Usually Ceph will move the data for you.
Before you bring down a host, drain and remove its OSDs. After swapping the disks, create a new OSD on the new NVMe and Ceph will happily use it.
After doing this on all nodes, all your data will be on the new disks.
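A sketch of the per-node cycle (OSD ID and device name are placeholders; wait for HEALTH_OK before each destructive step):

  # mark the OSD out and let Ceph move its data away
  ceph osd out 3
  ceph -s
  # once the cluster is healthy again, stop and remove the OSD
  systemctl stop ceph-osd@3
  pveceph osd destroy 3
  # after swapping the disk, create the new OSD on the NVMe
  pveceph osd create /dev/nvme0n1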