Hi,
Since we've been migrating more and more stuff to Ceph under Proxmox, we've found a quirky behavior, and I've built a test case for it on my test cluster.
Create a small cluster with a minimum of 4 nodes.
create one Ceph pool using one disk per node, with 4-way replication (size 4) and min_size 2; let's call it pool_1 (see the CLI sketch after these steps)
create a VM using this pool as storage; let's call it vm_A
create a second pool using a single (but separate) disk per node, with 2-way replication (size 2) and min_size 1; let's call it pool_2 (it's best to use SATA/USB disks here so they're easy to unplug for failure simulation)
create a VM using the second pool as storage; let's call it vm_B
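For reference, a rough sketch of the two pools with the stock ceph CLI (the pg counts are placeholders, and the CRUSH rule that pins each pool to its own set of disks is omitted):

```
# pool_1: 4-way replicated, stays writable down to 2 copies
ceph osd pool create pool_1 32 replicated
ceph osd pool set pool_1 size 4
ceph osd pool set pool_1 min_size 2
ceph osd pool application enable pool_1 rbd

# pool_2: 2-way replicated, stays writable down to 1 copy
ceph osd pool create pool_2 32 replicated
ceph osd pool set pool_2 size 2
ceph osd pool set pool_2 min_size 1
ceph osd pool application enable pool_2 rbd
```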
Now, you can switch off two nodes, OR simply unplug two of the disks backing pool_2.
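After pulling the disks/nodes, something like this should confirm that only pool_2 is actually below min_size:

```
# overall cluster view: pool_2's PGs should go down/incomplete,
# while pool_1's should only show as degraded (still >= min_size 2)
ceph -s
ceph health detail

# confirm which OSDs dropped out
ceph osd tree

# per-pool I/O: pool_1 should still be serving requests
ceph osd pool stats
```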
What I would expect to happen is for Proxmox to kill/suspend vm_B, and that happens (well, it gets killed rather than suspended, but hey).
What I would NOT expect is for Proxmox to kill vm_A as well.
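A quick way to see the discrepancy from the Proxmox side (the VMIDs and the image name here are hypothetical, substitute your own):

```
qm status 101            # vm_B: its storage is gone, expected to be dead/stuck
qm status 100            # vm_A: lives entirely on pool_1, should still be running

# pool_1 itself is still above min_size, so its images should stay accessible:
rbd -p pool_1 ls
rbd status pool_1/vm-100-disk-0   # hypothetical image name
```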
I know that some will come after me with pitchforks for using the blasphemous mode of fewer than 3 replicas, or will later try to dismiss the problem by blaming that mode, but ¯\_(ツ)_/¯ sorry, this is the easiest way I can show how to replicate the problem. In our case, we had joined a few unreliable nodes to the cluster and put one pool on them; those nodes went down for RAM replacement and the whole cluster just imploded.