Shared Remote ZFS Storage

ok I'll bite. How do you make a compute cluster without it?

It depends on your definition of "compute cluster".

You can cluster PVE hosts without a problem without shared storage.

Live migration of VMs between hosts in that cluster is also possible without shared storage, though it takes much more effort to migrate the VMs, as the virtual disks need to be shuffled around.
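
For what it's worth, here is a rough sketch of what that shuffling looks like in practice; the VM ID 100, the node name "pve2" and the cluster-join step are placeholders I made up, not anything from this thread:

Code:
#!/usr/bin/env python3
"""Rough sketch only: clustering PVE nodes and live-migrating a VM between
them with no shared storage at all. VM ID 100 and the node name "pve2" are
made-up placeholders."""
import subprocess


def run(*cmd: str) -> None:
    # Run a PVE CLI command and raise if it fails.
    subprocess.run(cmd, check=True)


def join_cluster(existing_node_ip: str) -> None:
    # Joining a PVE cluster needs corosync networking, not shared storage.
    run("pvecm", "add", existing_node_ip)


def migrate_with_local_disks(vmid: int, target_node: str) -> None:
    # --online keeps the guest running; --with-local-disks makes PVE copy the
    # VM's local disk images over to the target node -- the extra "shuffling
    # around" that shared storage would avoid.
    run("qm", "migrate", str(vmid), target_node, "--online", "--with-local-disks")


if __name__ == "__main__":
    migrate_with_local_disks(100, "pve2")

Without --with-local-disks, PVE refuses to migrate a guest whose disks are local, which is exactly the convenience shared storage buys you.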
 
ok I'll bite. How do you make a compute cluster without it?
What @RolandK is talking about is the industry definition of "shared storage": a central set of disks connected over a fabric (NAS/SAN). Ceph etc. are shared-nothing architectures instead (ALL of your "storage" can fail in one area and things will still work). For all intents and purposes, shared storage is largely dead except for very small setups. Even the big players are starting to move to RAIN over RAID.
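
To make the shared-nothing point concrete, here is a rough sketch (the pool name and numbers are made up, not from this thread) of a replicated Ceph pool whose copies land on different hosts, so every disk in any single node can die and the cluster keeps serving data:

Code:
#!/usr/bin/env python3
"""Rough sketch only: a 3-way replicated Ceph pool. The default replicated
CRUSH rule uses "host" as the failure domain, so the three copies end up on
three different nodes. Pool name and PG count are illustrative."""
import subprocess


def ceph(*args: str) -> None:
    subprocess.run(["ceph", *args], check=True)


ceph("osd", "pool", "create", "vmpool", "128")          # 128 placement groups
ceph("osd", "pool", "set", "vmpool", "size", "3")       # keep 3 copies
ceph("osd", "pool", "set", "vmpool", "min_size", "2")   # keep serving I/O with 2 left
ceph("osd", "pool", "application", "enable", "vmpool", "rbd")  # back VM disks with it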

 
What @RolandK is talking about is the industry definition of "shared storage": a central set of disks connected over a fabric (NAS/SAN). Ceph etc. are shared-nothing architectures instead (ALL of your "storage" can fail in one area and things will still work). For all intents and purposes, shared storage is largely dead except for very small setups. Even the big players are starting to move to RAIN over RAID.
Shared storage is shared storage. The HOW is less relevant than the WHAT. Both the above and the following list are all examples of multi-initiator-capable solutions, either block or file; logically it's all the same thing. As for GlusterFS being "current" and dual-controller storage being "dead," I have a bridge to sell you.

live migration of VMs between hosts in that cluster is also possible without shared storage
True, but that's not what shared storage is supposed to mitigate (at least not primarily). It's meant to retain payload file system coherency in case an active host drops. Perhaps you should rephrase your original comment by appending "TO ME."
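
As a sketch of that host-drop case (the VM ID 100 is a placeholder, not from this thread): once the guest's disks sit on storage every node can reach, PVE HA can simply restart it on a surviving node, because the payload file system is still there and coherent:

Code:
#!/usr/bin/env python3
"""Rough sketch only: put a guest under PVE HA control so it gets restarted
on a surviving node when its current host drops. This only helps if the VM's
disks sit on storage the other nodes can reach. VM ID 100 is a placeholder."""
import subprocess


def ha_protect(vmid: int) -> None:
    # --state started tells the cluster resource manager to keep the guest
    # running somewhere in the cluster.
    subprocess.run(["ha-manager", "add", f"vm:{vmid}", "--state", "started"],
                   check=True)


if __name__ == "__main__":
    ha_protect(100)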
 
Red Hat abandoned Gluster, but the community and other Linux vendors still support it. That's the beauty of this: a vendor can drop support or products and you are not stuck.

Shared storage was supposed to mitigate the "expensive RAID controllers / storage" in every machine and coherency over slow and unreliable networks. Ethernet has since passed both FibreChannel and SAS in all of those aspects, and even NVIDIA is now moving people from InfiniBand to Ethernet: you can get lossless Ethernet, and both SCSI and RDMA also go faster over Ethernet. The ASICs in FibreChannel and InfiniBand switches have been developed for Ethernet for a few years now, and many can handle both protocols on a port-by-port basis (so if you want to run your FibreChannel to another datacenter, that is now possible without special fiber, provided you have a lossless Ethernet fabric).

Make me a case for buying FC 'today' that isn't CHEAPER when fronted by ANY Ethernet system. Make me a case for developing features like that in a Proxmox solution that's geared towards the open source community. Hell, ask Red Hat if they will support it (not through a hardware partner) and you'll get pretty much the same answer: why?
 
Shared storage was supposed to mitigate the "expensive RAID controllers / storage" in every machine and coherency over slow and unreliable networks
I've been selling storage for 25 years; that was never the intent. It's intended to facilitate multiple initiators so you're no longer tied to a single-server SPOF.

Ethernet has since passed both FibreChannel and SAS in all of those aspects and even NVIDIA is now moving people from InfiniBand to Ethernet
Kinda true, but not because it "surpassed" the others. There are simply larger economies of scale for Ethernet, making incremental improvements far cheaper and more accessible than for the alternatives. What made the shift away from InfiniBand possible was RoCE, which made it possible to use RDMA over cheaper Ethernet. There are actually arguments for why both FC and IB are superior to Ethernet, but in the end it doesn't matter, because you can buy equivalently performing Ethernet (or one speed grade up) without lead times and at lower cost. At the end of the day, port cost trumps everything else.

Make me a case for buying FC 'today' that isn't CHEAPER when fronted by ANY Ethernet system. Make me a case for developing features like that in a Proxmox solution that's geared towards the open source community. Hell, ask Red Hat if they will support it (not through a hardware partner) and you'll get pretty much the same answer: why?
Wholeheartedly agree, which is why it's incumbent on the buyer to be educated and aware. Caveat emptor is a universal principle.
 
I've been selling storage for 25 years; that was never the intent. It's intended to facilitate multiple initiators so you're no longer tied to a single-server SPOF.
The holy grail has always been to spread your load over many machines to avoid the SPOF that a single set of disks with just two controllers is. The storage pod is still a SPOF, just perhaps slightly more stable than your average server running Windows or VMware. You also get the benefit of aggregating 20+ spindles, which is relevant if all you have is spindles. But once SSDs became more affordable, even VMware got into the game with their own SDS (where before you needed tens of 15k RPM spindles, not for the capacity but for the throughput, you can now do the same with a pair of SSDs).

But clustered systems always came with a price tag, from the likes of Solaris and IBM through systems like GPFS with InfiniBand fabrics. There were some open source systems, but they were considered slow, unstable and archaic because of the 100M-1G limit common even on "enterprise" datacenter Ethernet; and if you were spending cash on "10G" IB (which is really 4x 2.5Gb), you might as well spend a bit more on GPFS.

Lustre and Gluster, and later Ceph, are rather modern and only became viable with the rise of cheap 10G+ Ethernet, and Ceph specifically only with SSDs.