New 2 node cluster with ZFS replication

geertv

New Member
Jan 27, 2020
Belgium
Hi,
I'm new to Proxmox and am trying to set up a 2-node cluster (no HA) with ZFS replication.
Both systems have 2 small disks in RAID 1 for the Proxmox hypervisor and 6x 1.8 TB disks in RAID 6, which were meant for the VMs.
Both systems have meanwhile been deployed with the latest Proxmox version.
If I am correct, the first thing to do is set up the cluster. Is it OK to use a back-to-back Ethernet cable between the machines for the cluster traffic?
Next would be the ZFS storage, as I would like to make use of the replication feature.
When I select Create ZFS on a node, I see my /dev/sdb disk, but I also see a warning that ZFS is not compatible with hardware RAID, so I think this is my first issue...
Is replication only possible with ZFS? Should I remove the whole hardware RAID setup?
Current config:
Disk /dev/sdb: 6.6 TiB, 7199170494464 bytes, 14060879872 sectors
Disk model: PRAID EP400i
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt

Thanks for your support!
 
Hi,
If I am correct, the first thing to do is set up the cluster. Is it OK to use a back-to-back Ethernet cable between the machines for the cluster traffic?
Yes, this is completely OK.
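For reference, creating the cluster over the direct cable might look roughly like the sketch below. The cluster name and the 10.10.10.x addresses are made up for illustration, and the `--link0` option assumes Proxmox VE 6 or later. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: cluster name and addresses are hypothetical.
CLUSTER=testcluster
NODE1_LINK=10.10.10.1   # node 1 address on the direct cable
NODE2_LINK=10.10.10.2   # node 2 address on the direct cable

# Print commands by default; set DRY_RUN=0 to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on node 1: create the cluster with corosync on the direct link.
create_cluster() {
    run pvecm create "$CLUSTER" --link0 "$NODE1_LINK"
}

# Run on node 2: join via node 1, then show membership and quorum.
join_cluster() {
    run pvecm add "$NODE1_LINK" --link0 "$NODE2_LINK"
    run pvecm status
}
```

Call create_cluster on the first node and join_cluster on the second. Both direct-link interfaces need their addresses configured before either step.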

Hardware RAID and ZFS are not compatible.
Maybe you can check whether your hardware RAID card can be flashed to IT mode.
Passthrough or JBOD is not an option.
 
I was able to modify the RAID controller config (standard JBOD); the disks are now individually available to Proxmox, after which I created the ZFS pool with RAIDZ2. Is there anything else to do for the replication to start? For now I have a test machine running on node 1, but no replication to node 2...
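On the command line, the pool creation described above could be sketched as follows. The pool name, storage ID and disk names are hypothetical, and `ashift=12` is an assumption for disks with 4096-byte physical sectors. As far as I know, replication expects a storage with the same ID on both nodes, so the pool would be created under the same name on each node. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: pool name, storage ID and disk names are hypothetical.
POOL=tank
STORAGE_ID=vmdata
DISKS="/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg"

# Print commands by default; set DRY_RUN=0 to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on each node: build the RAIDZ2 pool from the six individual disks.
create_pool() {
    # $DISKS is left unquoted on purpose so it splits into six arguments.
    run zpool create -o ashift=12 "$POOL" raidz2 $DISKS
    run zpool status "$POOL"
}

# Run once: register the pool as storage for VM disks and containers.
add_storage() {
    run pvesm add zfspool "$STORAGE_ID" --pool "$POOL" --content images,rootdir
}
```

In practice the stable names under /dev/disk/by-id are usually a safer choice than /dev/sdX, since the latter can change between boots.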
 
You have to set up the replication, either at the Datacenter level or on the individual VM. It's a separate panel.
 
This is a feature that is built on top of ZFS snapshots and the send/receive mechanism of ZFS. It is not enabled automagically. You have to enable it per VM and define to which node it should be replicated and the interval.

Luckily so, because I only replicate the important VMs in my 2-node cluster. The ones that are not so important (not email/network related) can be offline for a bit without a problem. I also want different intervals. The email server is replicated every minute, while the DHCP and DNS servers can be replicated at longer intervals. It comes down to how much data loss you can live with should a node go down during the interval between replications.
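The same per-VM jobs can also be created from the shell with pvesr. The sketch below assumes a target node called pve2 and guest IDs 100 and 101, all of which are made up. The schedule strings follow the calendar-event format, where `*/15` is the default. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: target node name and guest IDs are hypothetical.
TARGET=pve2

# Print commands by default; set DRY_RUN=0 to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# add_job VMID SCHEDULE: create replication job VMID-0 towards $TARGET.
add_job() {
    run pvesr create-local-job "$1-0" "$TARGET" --schedule "$2"
}

# Run on the node that currently hosts the guests.
replicate_important_guests() {
    add_job 100 '*/1'    # hypothetical mail server: every minute
    add_job 101 '*/30'   # hypothetical DHCP/DNS server: every 30 minutes
    run pvesr status     # show the state of all local jobs
}
```

A shorter interval means less data lost if a node fails between two runs, at the cost of more frequent snapshot and send/receive traffic.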
 
I have just set up this very same scenario and created a 2-node cluster just for replication. Now I wanted to perform some failure checks and wanted to know what steps would be necessary to re-create, in this case, the 2nd replication node in case its system got corrupted or broken.

So I shut down the replication node and reinstalled PVE. However, I cannot rejoin the reinstalled node, since the 1st node claims that a node with the same name already exists. Is there any way to achieve this?

EDIT: Ahh… found it anyway: just remove the old node name, even though it is not listed when running pvecm nodes on the remaining node…
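For anyone landing here later, the removal and rejoin might look roughly like this. The node name and the address of the first node are hypothetical. In a 2-node cluster the surviving node may have lost quorum while the other one is down, in which case `pvecm expected 1` may be needed first. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: node name and address are hypothetical.
OLD_NODE=pve2
NODE1_ADDR=192.0.2.10   # address of the surviving first node

# Print commands by default; set DRY_RUN=0 to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the surviving node: drop the stale entry for the old node.
remove_stale_node() {
    run pvecm nodes
    run pvecm delnode "$OLD_NODE"
}

# Run on the freshly reinstalled node: join the existing cluster again.
rejoin_cluster() {
    run pvecm add "$NODE1_ADDR"
}
```

Call remove_stale_node on the remaining node first, then rejoin_cluster on the reinstalled one.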
 