Proxmox HA - my concept, is it possible

tytanick

Member
Feb 25, 2013
Hello guys,
There is a lot of info on Google, but I can't find a suitable solution.
Tell me, is this possible with Proxmox 3.3:

I have two identical Dell PowerEdge R620 servers.

Now I want to use High Availability like this:
- there is no shared storage, so there is only the local disk space on node1 and node2.
Normally the VMs run on node1 and sync to node2 all the time, via a 1 Gbit (or maybe better?) interface.
Now when node2 sees that node1 isn't answering, it will immediately start the VMs that were running on node1 (with disk and RAM synced?).

Is it possible ?

 
Hi,

High Availability needs fencing.
Live migration needs shared storage.

You have 2 nodes and no shared storage? DRBD is for you! With DRBD, your 2 nodes effectively do have shared storage.
You need disk space on each node, and one or two extra NICs for the DRBD link.
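For reference, a minimal DRBD resource definition for a two-node setup like this could look as follows (the resource name, hostnames, backing disk, and replication-link IPs are only examples; adapt them to your hardware):

```
# /etc/drbd.d/r0.res -- sketch only; hostnames, disks and IPs are examples
resource r0 {
    protocol C;                      # fully synchronous replication
    startup {
        wfc-timeout 15;
        degr-wfc-timeout 60;
    }
    net {
        cram-hmac-alg sha1;
        shared-secret "my-secret";   # pick your own secret
        allow-two-primaries;         # needed for live migration with LVM on DRBD
    }
    on node1 {
        device /dev/drbd0;
        disk /dev/sdb1;              # local partition dedicated to DRBD
        address 10.0.0.1:7788;       # IP on the dedicated replication NIC
        meta-disk internal;
    }
    on node2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.0.0.2:7788;
        meta-disk internal;
    }
}
```

With the dedicated NICs, it is worth putting the replication addresses on their own subnet so DRBD traffic never competes with VM traffic.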

Then: http://pve.proxmox.com/wiki/DRBD

Christophe.
 
IMHO, such a setup is not suitable if you want "High Availability". You need at least 3 nodes, and some kind of
shared/distributed storage (maybe Ceph).

Yeah, but external storage will also have a bigger chance of failure than two servers :)
I don't want to use an external NAS.

Tell me if I am correct: can Ceph be used as shared storage on those two servers,
so they will each have a copy of the same VMs, synced in real time?

RAM doesn't have to be synced; only the VM storage has to be available all the time and synced on those two nodes.

Is Ceph the way to go if I don't want external storage and want to use only the drives in node1 and node2?
 
You can create shared storage with GFS on the two nodes, which makes live migration possible between them.

Yeah, that is what I need, and I also need to add HA to this :)

Has anyone succeeded in doing that? :D
 
You mean it doesn't work with containers?

No problem, I will be using only VMs with Debian 7 "Wheezy" on them.

But I need to configure Proxmox VE to survive the failure of one machine, without any external NAS :)
Wish me luck :p
 
IMHO, such a setup is not suitable if you want "High Availability". You need at least 3 nodes, and some kind of
shared/distributed storage (maybe Ceph).

I agree with Dietmar

If you want to use HA then you want shared or distributed storage.
Ceph works well for this if you have three or more nodes, but you may be disappointed by the performance.
I have three Ceph nodes with 4 SATA disks each; performance is just acceptable, but HA works fine with Ceph.

DRBD will likely give you the best performance on your two nodes, but I would not trust HA on top of DRBD.
HA is unaware of DRBD status, and triggering an HA event while DRBD is split-brain would have the opposite effect of HA.
If you are satisfied with manually recovering from a node failure, DRBD is likely your best option given your resources.
 
Ok,
so I will go with my 2 nodes with DRBD and no HA at all.
But I have a question.

Let's say we have the situation described below:
node1: running 3 VMs
node2: just waiting as a backup in case node1 crashes, so no VMs running
All the data is synced between both nodes with DRBD,
and on node1 all VMs have "VM autostart" enabled.

Let's say I put this into production and one day something goes wrong with node1, so my VMs are no longer running and I start recovery.
How should I proceed? Can I manually restart all VMs on the DRBD-replicated data on node2 (by copying the configs from the node1 directory to node2)?
And maybe send some kind of command from node2 to node1: "if node1 comes back up, don't accept any DRBD changes from it, shut down all its VMs, and halt it"?

And tell me: if all VMs are running on node1 and node2 is empty with only DRBD syncing, and multiple power losses occur over time, could anything bad happen to DRBD, or not?

Any other suggestions for my setup?
2x node: Dell R620 with internal storage only, and I want to build some kind of HA because I have to show it working :p
 
Hello,

for such a config you have two options: DRBD, or GlusterFS in dual-replica mode.

This article describes what you need if you follow the DRBD road: https://pve.proxmox.com/wiki/Two-Node_High_Availability_Cluster

For Gluster, search the internet; there are several guides.

Try not to use HA in a two-node setup.
If nodeA is down, you can simply move the VM config files from /etc/pve/nodes/proxmoxA/qemu-server to /etc/pve/nodes/proxmoxB/qemu-server and then start the VMs on the second node.
Just be sure to have formed quorum before doing that, by issuing: pvecm expected 1
Take some time to read up and get familiar with DRBD or GlusterFS and with how Proxmox clustering works, and all your questions will be answered.
The Proxmox wiki and the DRBD documentation are a good start.
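The manual recovery described above can be sketched as a short checklist of commands on the surviving node (the node names proxmoxA/proxmoxB and the VMID 100 follow the example and are placeholders; always verify DRBD health before starting anything):

```shell
# On the surviving node (here: proxmoxB), after proxmoxA is confirmed dead.

# 1. Restore quorum so /etc/pve becomes writable again
pvecm expected 1

# 2. Check that DRBD is healthy (UpToDate, no split-brain) before touching anything
cat /proc/drbd

# 3. Move the VM config files over; the VMs then "belong" to proxmoxB
mv /etc/pve/nodes/proxmoxA/qemu-server/*.conf \
   /etc/pve/nodes/proxmoxB/qemu-server/

# 4. Start each recovered VM (repeat per VMID)
qm start 100
```

Note that /etc/pve is the cluster filesystem, which is why step 1 is required: without quorum it is read-only and the `mv` in step 3 would fail.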
 
Hi tytanick, I believe I have the best solution for your goal.

With this strategy you will get HA without losing data, but there are some precautions you must take if you don't want to lose information when HA has to be applied.

The strategy is this:
1) You should use DRBD with only two (2) PVE nodes (no more than that, for this setup).

2) You should use HA (using rgmanager).

3) If you want to avoid losing the data written up to the last moment before "High Availability" has to be applied, you should enable the "Write through" or "Direct sync" cache option on the virtual disks of the VMs. For better performance I prefer "Write through", because it keeps disk blocks cached in the available RAM of the PVE host (and with that cache enabled on the host, reads from the physical disk platters will be less frequent, since the data will already be in RAM).
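For example, on the PVE command line the cache mode of an existing virtual disk can be set with `qm set` (the VMID, bus, and volume name here are placeholders for your own):

```shell
# Set write-through caching on an existing virtio disk of VM 100
qm set 100 --virtio0 local:vm-100-disk-1,cache=writethrough

# Or, for the most conservative setting:
# qm set 100 --virtio0 local:vm-100-disk-1,cache=directsync
```

The same option is available in the web GUI under the disk's Edit dialog.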

4) To avoid quorum problems with only two servers, you should add a setting to your cluster.conf file.
Your case is the same as in my test lab:
<cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1"/>
Note: this configuration strategy is not the one recommended by the PVE team, but for me, with some extra precautions, it works very well in a production environment.

5) I know that this cluster.conf configuration (explained above) isn't the best (but it works), and if you keep control of all the really important aspects, and operate in manual mode, you will not have problems.
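In context, the top of a minimal two-node cluster.conf might then look roughly like this (the cluster name, node names, and version number are placeholders; remember to increase config_version whenever you edit the file):

```
<?xml version="1.0"?>
<cluster name="mycluster" config_version="2">
  <!-- two_node="1" with expected_votes="1" lets a single surviving node keep quorum -->
  <cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1"/>
  <clusternodes>
    <clusternode name="proxmoxA" votes="1" nodeid="1"/>
    <clusternode name="proxmoxB" votes="1" nodeid="2"/>
  </clusternodes>
</cluster>
```

The trade-off of two_node mode is exactly why manual fencing is needed: either node alone can claim quorum, so nothing automatic stops both from acting at once.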

6) In your "/etc/drbd.d/global_common.conf" file, you should have literally these lines enabled:
split-brain "/usr/lib/drbd/notify-split-brain.sh <some address@of mail account>";
out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh <some address@of mail account>";

Notes about this DRBD setup:

- The purpose of these settings is to have DRBD notify you by email immediately if it has any problem with the synchronous replication (obviously, you need to be sure these directives work as expected, so my recommendation is to test them beforehand).

- If DRBD has not reported a problem by email up to the moment a server breaks down, then obviously you will not have problems with data loss when you need to apply HA.
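Put together, the relevant fragment of global_common.conf would look like this (the mail address is a placeholder for your own; the notify scripts ship with the DRBD utilities package):

```
# /etc/drbd.d/global_common.conf (fragment)
common {
    handlers {
        # mail a warning the moment DRBD detects split-brain or out-of-sync blocks
        split-brain "/usr/lib/drbd/notify-split-brain.sh root@example.com";
        out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root@example.com";
    }
}
```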

7) When a PVE host breaks down, the VMs on that host will no longer run, but you can analyze the problem with complete calm and patience; the purpose of analyzing it quietly is to be sure that DRBD had no replication problems before the PVE host broke.

8) Once you are sure that DRBD had no problems, and that the only problem was the broken PVE host, you can obviously apply HA with complete peace of mind.

These are the steps:
A) Manually disconnect the electric power of the physical server that is presumed dead (this is the best practice to be sure that this node cannot come back, and it avoids serious problems with the data on the virtual disks of the affected VMs - this step is critical).

B) To apply "HA" (and restart your VMs on the PVE host that is still alive), run this command in a shell on the surviving PVE host:
surviving-PVE-host# /usr/sbin/fence_ack_manual <name of the PVE host that is dead>

9) Enjoy your VMs again on the other PVE host, without data loss.

Extra notes:

A) PVE and Red Hat have other methods for applying "HA" safely and without human interaction, but those kinds of setups require more equipment, at minimum a PDU (for power fencing), so I have tried to explain only the method that I believe is the best solution for your small infrastructure.

B) On the other hand, Ceph (the star storage solution of PVE) still has, frankly speaking, some development problems (though not in its concept or its architecture), and as I understand it, Ceph is not yet suitable for production use when your data is critical and you need all of Ceph's tools to work perfectly. (I am sure Ceph's current problems are only temporary, and that in a short time it will be truly ready for real large-scale production environments without a single point of failure.)

C) DRBD, in contrast, has existed for many years (a great difference from Ceph), so it passed its early development phase a long time ago and is nowadays considered very stable. Moreover, I am currently using the latest version of DRBD (8.4.5) in production environments without any problem.

D) A point very much in DRBD's favour is that disk-block reads are local (and therefore run at the speed of the disk controller bus installed in the same PC or server). With GlusterFS or Ceph, which are installed on other computers on a network, read and write speed is determined by the slowest hardware in the path to the data, or by the efficiency of the software involved: your network speed, the speed of the hard disks in the storage layer, or the software doing these tasks, which in the majority of cases is very slow (and maybe unbearable; for example, for backups Ceph is very slow compared with DRBD). For these several reasons (including the money needed to purchase a good infrastructure), I prefer to use DRBD.

To everyone on this forum: if I am wrong about anything I have said, please feel free to correct me.

To tytanick: I hope my recommendations are of great help to you; see you soon.

Best regards,
Cesar Peschiera
 
