2+1 CEPH cluster - long downtime before VM starts after 1 node goes down

zinop

Dec 26, 2022
Hi, everyone!
I have a 2+1 node CEPH cluster on PVE with the following configuration: two identical nodes (node-1 and node-2), each with one OSD and a monitor, plus a third node that runs only a monitor for quorum. (I know a cluster should have at least three full nodes, but buying a third server to deploy a full-fledged 3-node cluster is not an option.)
The pool settings are: size = 2, min_size = 1.
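To make clear what I expect from this setup, here is a minimal sketch of how I understand the two rules involved: monitors need a strict majority for quorum, and a replicated pool keeps serving I/O while at least min_size replicas are available. The function names are made up for illustration and are not part of any CEPH API, and the model ignores failure detection and recovery time entirely.

```python
# Rough model of my expectations, not actual CEPH code.
# Both function names are hypothetical and exist only for this post.

def mon_has_quorum(total_mons: int, mons_up: int) -> bool:
    """Monitors form a quorum only with a strict majority of all monitors."""
    return mons_up > total_mons // 2


def pool_accepts_io(min_size: int, replicas_up: int) -> bool:
    """A replicated pool serves I/O while at least min_size replicas are up."""
    return replicas_up >= min_size


if __name__ == "__main__":
    # My cluster after one OSD node fails: 2 of 3 monitors and 1 of 2 replicas left.
    print("quorum:", mon_has_quorum(total_mons=3, mons_up=2))
    print("pool I/O:", pool_accepts_io(min_size=1, replicas_up=1))
```

By this reasoning the cluster should stay usable with one OSD node down, which is why the long freeze surprises me.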
It works pretty well while node-1 and node-2 are both up, but when the node running the VM (node-1 or node-2) goes down, it takes a long time (about 30 minutes) for the VM to start after it has been migrated to the surviving node. During this time the VM is frozen and it is impossible to view the CEPH pools via the GUI (the page doesn't respond). The migration itself works without problems after one of the two nodes goes down; the only issue is the long start time afterwards.
I'm new to PVE and CEPH and don't know whether the problem lies in the cluster design (only one node with an OSD is still working) or whether I need to change something in the configs.

Many thanks for any suggestions!
 
