Best practices for 2-server+DAS-setup with Proxmox?

r4dh4l

Well-Known Member
Feb 5, 2018
Hi all,

Given that I have two well-equipped identical servers, each with 4 SSDs and 192 GB RAM (let's call them the "primary" and "secondary" server). Each of these servers is connected to an identical Direct Attached Storage unit with 12 × 16 TB SAS4 HDDs.

The secondary server should work as a backup system for the primary server. The primary server shall provide a high double-digit-TB amount of data via a file server.

Are there any best-practice experiences you can share (e.g. which RAIDZ layout is recommended for the Direct Attached Storage)?

Best regards!
 
Hello.

HDDs are big. Thus you should use error detection and correction (at the RAID or FS level).

HDDs have low IOPS.
Thus you should use a data cache if you want to speed things up: either a RAID card cache (+ BBU) or ZFS + zlog.
But NOT hardware RAID + ZFS!

As disks tend to fail at the same time when they were produced in the same batch and used the same way, I would recommend at least RAIDZ2 or RAID6.
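For illustration, such a RAIDZ2 layout for one 12-disk DAS could look roughly like this (only a sketch; the pool name "tank" and the device paths are placeholders, adjust to your hardware):

```sh
# Sketch: one pool from the 12 DAS disks as two 6-disk RAIDZ2 vdevs
# ("tank" and all /dev/disk/by-id paths are placeholders).
zpool create tank \
  raidz2 /dev/disk/by-id/disk1 /dev/disk/by-id/disk2 /dev/disk/by-id/disk3 \
         /dev/disk/by-id/disk4 /dev/disk/by-id/disk5 /dev/disk/by-id/disk6 \
  raidz2 /dev/disk/by-id/disk7 /dev/disk/by-id/disk8 /dev/disk/by-id/disk9 \
         /dev/disk/by-id/disk10 /dev/disk/by-id/disk11 /dev/disk/by-id/disk12
```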

I am curious to read other analysis and recommendations.
 
Dear @lucavornheder

thank you very much for your reply. To answer your question:
What do you expect from the "backup node"? Do you want to run replication, or a shard scenario where you do HA?

Well, initially there was one server with two Direct Attached Storage units, but I want to avoid a single point of (hardware) failure. Unfortunately, a cluster setup with 3 identical nodes (buying another two servers and a third DAS) was out of budget. So the alternative was a 2-node setup (buying just a second server to get the 2 identical servers with 2 identical DAS units mentioned above), maybe combined with a QDevice instead of a full third node. No sharding.
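For reference, the QDevice route would roughly look like this on Proxmox (assuming a small third machine is available to host the qnetd daemon; `<QDEVICE-IP>` is a placeholder):

```sh
# On the external QDevice host (any small Debian machine):
apt install corosync-qnetd

# On every cluster node:
apt install corosync-qdevice

# Then, on one cluster node, register the QDevice:
pvecm qdevice setup <QDEVICE-IP>
```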
 
Dear @ghusson

thank you very much for your reply as well. To answer your comments:

HDDs are big. Thus you should use error detection and correction (at the RAID or FS level).

Yes. They were bought to get maximum storage capacity for the available budget, without yet having a good backup concept.

HDDs have low IOPS.
Thus you should use a data cache if you want to speed things up: either a RAID card cache (+ BBU) or ZFS + zlog.
But NOT hardware RAID + ZFS!

I would prefer to go with ZFS because of good experience with ZFS on a (much smaller) server. By mentioning "zlog", do you suggest using one of the four SSDs as a ZFS write cache (ZFS SLOG/Separate intent LOG)?

As disks tend to fail at the same time when they were produced in the same batch and used the same way, I would recommend at least RAIDZ2 or RAID6.

Playing with https://jro.io/capacity/: do you have any recommendations for the number of vdevs and spares?
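To make the trade-off concrete, here is a rough sketch I used to compare layouts (assumptions: 16 TB raw per disk, and it ignores ZFS metadata overhead and the usual keep-pools-below-~80%-full advice):

```python
# Rough usable-capacity comparison for RAIDZ layouts of the 12-disk DAS.
# Assumption: 16 TB raw per disk; ZFS overhead is ignored.
DISK_TB = 16

def usable_tb(disks_total, vdevs, parity, spares=0):
    """Raw usable TB for `vdevs` equal-width RAIDZ vdevs plus hot spares."""
    data_disks = disks_total - spares
    per_vdev = data_disks // vdevs
    return vdevs * (per_vdev - parity) * DISK_TB

# 12 disks as one wide RAIDZ2 vdev: most space, least IOPS
print(usable_tb(12, vdevs=1, parity=2))            # 160
# 12 disks as two 6-disk RAIDZ2 vdevs: more IOPS, less space
print(usable_tb(12, vdevs=2, parity=2))            # 128
# two 5-disk RAIDZ2 vdevs plus 2 hot spares
print(usable_tb(12, vdevs=2, parity=2, spares=2))  # 96
```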

I am curious to read other analysis and recommendations.

Thank you for making the start.
 
I would prefer to go with ZFS because of good experience with ZFS on a (much smaller) server. By mentioning "zlog", do you suggest using one of the four SSDs as a ZFS write cache (ZFS SLOG/Separate intent LOG)?
Yes, SLOG (see https://www.truenas.com/blog/o-slog-not-slog-best-configure-zfs-intent-log/). If you do so, use 2 SSDs in a mirror to mitigate hardware failure. Warning: not every SSD can handle the SLOG load. Maybe it is not a good idea in your case.
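Attaching such a mirrored SLOG to an existing pool would look roughly like this (a sketch; "tank" and the device paths are placeholders):

```sh
# Sketch: add two SSDs as a mirrored SLOG to an existing pool
# ("tank" and the /dev/disk/by-id paths are placeholders).
zpool add tank log mirror /dev/disk/by-id/ssd-a /dev/disk/by-id/ssd-b
zpool status tank   # the "logs" section should now list the mirror
```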

Playing with https://jro.io/capacity/: Do you have any recommendations for the amount of vdevs and spares?
OK, I am not a ZFS expert, so I cannot tell.

For optimization, do not forget to reserve RAM for ZFS (the ARC).
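One way to do this is to cap the ARC explicitly, e.g. at 64 GiB so that most of the 192 GB stays free for VMs (the 64 GiB figure is only an assumption; tune it to your workload):

```
# /etc/modprobe.d/zfs.conf -- cap the ARC at 64 GiB (64 * 2^30 bytes).
# Example value only; adjust to how much RAM your VMs need.
options zfs zfs_arc_max=68719476736
```

After editing, run `update-initramfs -u` and reboot, or change it live via `/sys/module/zfs/parameters/zfs_arc_max`.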