Advice for a cluster with SAS devices shared between 2 servers per chassis

Zulgrib

New Member
Dec 22, 2022
Hello,

In a scenario with 3 chassis, each chassis housing two servers and all of a chassis's SAS disks shared between its two servers, what would be the appropriate methodology to avoid data corruption between the servers within a shared chassis, while keeping the VM images in sync between the chassis?

Based on my understanding of the wiki, I should use thick LVM volumes to share the SAS disks between the two servers, but syncing local VM images to other chassis seems to be limited to ZFS and doesn't support FQDNs ("Name resolution not taken into account"). AFAIK ZFS shouldn't sit on top of LVM. ZFS over iSCSI is documented, but nothing covers shared SAS.
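
For reference, what I have in mind from the wiki is roughly the following, with the VG name and device paths as placeholders for my setup: create one volume group on the shared SAS disks, then add it as a shared thick-LVM storage so Proxmox knows both nodes in the chassis see the same VG:

    # on one node of the chassis, create the VG on the shared SAS disks
    vgcreate vg_sas_shared /dev/sdb /dev/sdc
    # register it as shared thick LVM
    pvesm add lvm sas-shared --vgname vg_sas_shared --shared 1 --content images,rootdir

As far as I can tell this only gives me shared block storage inside one chassis, with no replication to the other chassis.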

The documentation mentions CephFS, but I don't know if this scenario of shared SAS drives would be supported at all. "Hyper-Converged Ceph Cluster" didn't help me there.

Some "plug and play" vendors use an active / passive scenario here, in my case it would be wasted CPU and memory to let it sit unused, in addition to not finding documentation to achieve that with proxmox.

Any suggestion appreciated.
 
Forgot to mention that OCFS2 was considered too, but it lacks replication and requires IPs in the config file instead of FQDNs.
 
Looks like you did your research, which is great.

You are correct, you shouldn't use ZFS on top of thick LVM, and perhaps "you can't". I am not a ZFS expert.

ZFS/iSCSI requires an iSCSI target on the storage side accessible via SSH. You will need to front-end and proxy all storage requests through, in your case, a VM. For a business production environment it would be far from ideal from a support/performance point of view. The HA functionality in such a setup would need to be thoroughly tested.
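
For clarity, the ZFS over iSCSI storage type expects something along these lines, where the portal, target, and pool are placeholders, and the host behind the portal must run a supported iSCSI target and be reachable over SSH from every PVE node:

    pvesm add zfs zfs-over-iscsi \
        --portal 192.0.2.10 \
        --target iqn.2003-01.org.example:storage \
        --pool tank \
        --iscsiprovider LIO \
        --content images

In your case that portal would have to be a VM fronting the SAS disks, which is the proxying I am referring to.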

Ceph is not designed to be used with shared storage; it was designed for local hard disks. You could try to defeat Ceph's purpose by passing through disks directly to each node and dedicating them to that node only, i.e., no disk failover on chassis failure. You will probably not find much supporting documentation for this setup, as this is not how people use Ceph or shared SAS storage.
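
If you wanted to experiment with it anyway, it would boil down to dedicating specific SAS disks to one node each and creating OSDs on them locally, roughly like this (device names are placeholders, and losing a node means losing its OSDs until recovery):

    # on each node, only on the disks dedicated to that node
    pveceph osd create /dev/sdb
    pveceph osd create /dev/sdc

Again, this is not the deployment model Ceph or shared SAS enclosures were designed for.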

OCFS2 is another solution, as you noted. I am not sure why using IPs is a blocking issue: storage is complex enough without introducing a dependency on DNS (which, as we know, is always at fault). If you are hard-coding IPs in the hosts file anyway, then why not just use IPs? It's your environment and your decision. In addition, you need replication, and that is not provided by OCFS2.
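
To illustrate the IP requirement: the O2CB cluster stack is configured in /etc/ocfs2/cluster.conf, which takes node addresses as plain IPs, roughly like this (names and addresses are placeholders):

    cluster:
            node_count = 2
            name = ocfs2demo

    node:
            ip_port = 7777
            ip_address = 10.0.0.11
            number = 0
            name = nodeA
            cluster = ocfs2demo

    node:
            ip_port = 7777
            ip_address = 10.0.0.12
            number = 1
            name = nodeB
            cluster = ocfs2demo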

Given your environmentally imposed limits (shared SAS storage) and your need for cross-chassis replication, your final option is to jerry-rig a combination of SAS/Corosync/Pacemaker/ZFS where a given disk is only ever accessed by one of the two nodes in the chassis. I've seen articles on such DIY systems and know of people running them in production. This requires deep technical knowledge and a serious internal support commitment.
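
Conceptually, the failover piece is a Pacemaker resource that imports a ZFS pool built on the shared SAS disks on exactly one node at a time, for example with the ZFS agent shipped in the resource-agents package (the pool name is a placeholder, and whether that agent is available depends on your resource-agents version):

    # the pool must never be imported on more than one node at once
    pcs resource create sas-pool ocf:heartbeat:ZFS pool=tank \
        op start timeout=90s op stop timeout=90s

Cross-chassis replication would then be layered on top with zfs send/receive or similar, which is exactly the DIY part.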

Good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Hello bbgeek17, thanks for your answer.

I could technically use local ZFS with the sync, get 3 more chassis later and move the servers there, keeping only one per chassis, but having empty slots feels suboptimal in terms of physical space usage. It won't prevent me from sleeping at night, but it will always feel incomplete when I'm facing that rack.

Or abandon the inter-chassis sync and stick to the natively supported LVM method, reducing the redundancy level... which isn't optimal either.

I opened the thread mostly hoping I overlooked something and won't need to take the same route I had to take for HA firewalls.

For the DNS part, it seems I am either very lucky or resourceful enough; I haven't had any issues there for 10 years now.
 
I opened the thread mostly hoping I overlooked something
Regrettably, there is no easy out-of-the-box Proxmox-compatible solution that will give you shared SAS support + hyperconverged + HA + replication.

We, at Blockbridge, do support legacy shared SAS, but not in a hyper-converged setup.

Good luck in your search


