VPS hosting providers - why no ZFS?

Saahib
May 2, 2021
Hi,
I have seen that most hosting providers that sell VPS on Proxmox use software RAID with LVM-thin instead of ZFS. On a few hosting forums I have seen discussions where hosting providers prefer software RAID over ZFS RAID.
I want to ask here: why do providers selling VMs on Proxmox not prefer ZFS despite it being good? One reason given was that ZFS occupies a lot of RAM and they need that RAM to sell VMs. They could lower the RAM usage of ZFS, but would rather stick with mdraid to keep things simple and predictable.
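(For reference, capping the ARC is a small change; a minimal sketch, assuming a 4 GiB cap, where the value is purely illustrative and not a recommendation:

# /etc/modprobe.d/zfs.conf -- cap the ZFS ARC at 4 GiB (4 * 1024^3 bytes)
options zfs zfs_arc_max=4294967296
# rebuild the initramfs so the option is applied at boot, then reboot:
# update-initramfs -u
)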

Also, as a VPS provider, which RAID do you prefer?
1. Traditional software RAID (mdadm)?
2. ZFS RAID?
3. HW RAID?
 
Also, as a VPS provider, which RAID do you prefer?
1. Traditional software RAID (mdadm)?
2. ZFS RAID?
3. HW RAID?
None of that at all; everything is far too inflexible.

We treat ourselves to the luxury of putting Ceph with replica 3 underneath. This lets us scale the storage almost without limit and access it from every node at the same time. In addition, we free ourselves completely from hardware dependency and can simply move the disks physically to another server.

With the conventional variants you often have the problem that you still have plenty of storage but no CPU/RAM left, or vice versa: no storage left but plenty of RAM and CPU free. With Ceph this simply no longer matters, because the storage is available on every node. We only have to pay attention to CPU and RAM, which usually go hand in hand anyway.
It also lets us carry out maintenance without having to notify customers; this can happen during normal operations, in the middle of the day. Thanks to shared storage, the VMs can be migrated to another node with no major impact. If a server goes down, that's no problem either: just move the VMs to another node and start them again. I can then take a few days to fix the server without customers on my back, instead of having to drive to the data center at 3 a.m.
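(To illustrate how painless that is: with shared storage a live migration is a one-liner; a minimal sketch, where VMID 100 and target node pve2 are made-up examples:

# live-migrate VM 100 to node pve2; with shared storage only the RAM
# state is transferred, the disks stay where they are
qm migrate 100 pve2 --online
)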

Of course, Ceph also brings new challenges: it has to be monitored closely so that even the smallest disruption is noticed immediately. You are also forced to build a bigger network for it, but customers usually benefit directly from the extra bandwidth. And you have to set appropriate I/O limits for the VMs so that no single VM can pull down the entire storage.
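(As a sketch of what such a limit looks like in PVE, where the VMID, storage/volume name, and the numbers are all assumptions for illustration:

# cap VM 100's first disk at 200 MB/s and 1000 IOPS per direction
qm set 100 --scsi0 ceph-pool:vm-100-disk-0,mbps_rd=200,mbps_wr=200,iops_rd=1000,iops_wr=1000
)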
Of course, it is also true that if there is a malfunction in Ceph, everything stops working; you don't have that problem with local storage. But in our opinion the advantages of Ceph outweigh the disadvantages.

All systems we serve customers with are redundant; we no longer have anything online that doesn't have a mirrored OS, two power supplies, and two network connections. There is no SPoF left in our infrastructure.
 
Ceph is good, but it's a different kind of storage solution.
So what would be the local FS of choice for a VM provider?
 
Hi,
I have seen that most hosting providers that sell VPS on Proxmox use software RAID with LVM-thin instead of ZFS. On a few hosting forums I have seen discussions where hosting providers prefer software RAID over ZFS RAID.
I want to ask here: why do providers selling VMs on Proxmox not prefer ZFS despite it being good? One reason given was that ZFS occupies a lot of RAM and they need that RAM to sell VMs. They could lower the RAM usage of ZFS, but would rather stick with mdraid to keep things simple and predictable.

Cost/benefit math? And it's unnecessary in terms of feature set for that use. Ideally you have many nodes and shared storage. Why would you choose a CoW filesystem if you don't intend to do any snapshotting or such? Lots of writes, lots of RAM, integrity features not needed for that use case, etc.
 
You are correct, but what about small providers with just 1 or 2 large nodes? Ceph is definitely good, but it needs its own dedicated 10 Gbit (or faster) link plus granular monitoring (as @sb-jw mentioned), because if it breaks, everything breaks.
So if there is only a single node or two nodes, snapshots help to take quick backups. Ceph is not an option here, so which one should I go for?
 
You are correct, but what about small providers with just 1 or 2 large nodes?

Then it depends on what sort of SLA (if anything other than best-effort) you have with the customers. If you have just 2 nodes, it still comes down to cost-benefit considerations. ZFS replication is not real-time, and High Availability is risky to run; it's better to have manual disaster recovery implemented with the downtime accounted for. With that, ZFS is just extra overhead when you resell those resources.
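(For context, this is what that asynchronous replication looks like with PVE's built-in pvesr; a minimal sketch assuming ZFS on both nodes, where VMID 100 and target node pve2 are made up:

# replicate VM 100's disks to node pve2 every 15 minutes; anything written
# after the last completed run is lost if the source node dies
pvesr create-local-job 100-0 pve2 --schedule "*/15"
)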

Ceph is definitely good, but it needs its own dedicated 10 Gbit (or faster) link plus granular monitoring (as @sb-jw mentioned), because if it breaks, everything breaks.

Ceph is not the only shared storage possible.

So if there is only a single node or two nodes, snapshots help to take quick backups. Ceph is not an option here, so which one should I go for?

This is not helpful to say now, but more "smaller" nodes would be better than two "large" ones in this scenario. It's definitely good to have a best-effort or no SLA in place, in which case having backups is just fine. Consider that e.g. a 15-minute-old replica used to restart a database-heavy VPS may have an impact your customer did not expect. Having it on ZFS is not "just as good" for those cases.
 
@temaccount392742, thanks for the input.

For colos, bigger but fewer nodes make more sense.
Also, I am referring to small budget providers here; the SLA is not an issue as long as there is no complete loss and downtimes are handled promptly.

For your second comment, what other centralized storage would you suggest (one that doesn't cost a bomb) other than Ceph, with good I/O?

Also, what is a suitable approach for a single node / independent nodes used to sell resources?
 
Thanks for the link, I went through the whole discussion and it was informative. However, the inconsistency with soft RAID mentioned there is not that common; I can say that from my experience looking after several soft-RAID PVE nodes. Besides, in my use case the SLA is not an issue as long as there are backups.

Still, the discussion there had mixed feedback on soft RAID vs ZFS. My experience is also that ZFS slows things down when you have many users; it is good when a node is used by a single user or a few users, even if there is heavy usage in a few VMs.
 
@temaccount392742, thanks for the input.

For colos, bigger but fewer nodes make more sense.
Also, I am referring to small budget providers here; the SLA is not an issue as long as there is no complete loss and downtimes are handled promptly.

Where are you backing it up to?

For your second comment, what other centralized storage would you suggest (one that doesn't cost a bomb) other than Ceph, with good I/O?

This might not be helpful given your other note on the need for good networking, but basically NFS, iSCSI, and some others [1].
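(As a sketch, attaching an NFS export as shared VM storage takes one command; the storage name, server IP, and export path here are placeholders:

# register an NFS share as storage for disk images on all nodes
pvesm add nfs shared-nfs --server 192.168.1.50 --export /srv/vmstore --content images
)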

Also, what is a suitable approach for a single node / independent nodes used to sell resources?

Make it clear to the buyer (if it's the general public) that it's a best-effort service with no SLA (and they benefit by getting a low-cost offering, hopefully) and that they are responsible for, e.g., their own backup strategy. If I am running a DB on your VPS and feel like I can consider it fully managed, then discovering otherwise will give you a bad reputation in the end.

[1] https://pve.proxmox.com/wiki/Storage#_storage_types
 

Roland, thanks for bringing these two into the discussion. Can I ask whether there's anything open in PVE's Bugzilla on this? If so, I hope it does not just say it's an upstream problem and NOTOURBUG.

The reason I ask is that in comment 8 [1] of the kernel.org BZ you dropped a reference to the report itself; I suspect you meant to refer to another bug report?

Just to make it clear, I am getting at the fact that with mdadm, a check could be made not to default QEMU to cache=none; that's the least PVE could do, other than stating it's "not supported". It's clear the "bug" will never be fixed, judging from the references you already found yourself.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=99171#c8
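(For readers landing here: the per-disk cache mode can be changed in PVE, so a hedged sketch of the kind of workaround being discussed could look like this; the VMID, volume name, and the choice of writeback are assumptions for illustration, not an official recommendation, and writethrough would be the more conservative pick:

# avoid cache=none for a disk that sits on mdadm-backed storage
qm set 100 --scsi0 local-lvm:vm-100-disk-0,cache=writeback
)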
 
Also, as a VPS provider, which RAID do you prefer?
Local RAID, be it controller, mdadm, or ZFS, isn't really compatible with a VPS environment since it's necessarily single-host. What you're describing is a host rental, in which case you (as the customer) should be able to configure it however you want, up to and including the OS.
 
Local RAID, be it controller, mdadm, or ZFS, isn't really compatible with a VPS environment since it's necessarily single-host. What you're describing is a host rental, in which case you (as the customer) should be able to configure it however you want, up to and including the OS.

He is clearly renting out individual guests, not the host itself.
 
VPS hosting, especially in the low-budget market, is a race to the bottom. It's often a loss-leader for CSPs. The providers in question (offering no HA, at best once-a-day backup recovery, and no protection against host failure) have to squeeze every penny out of their infrastructure. Every gigabyte of RAM spent on ZFS is a gigabyte not sold. Every gigabyte spent on checkpoints, if not billed properly, reduces profit, if there is any.
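(To put rough, purely illustrative numbers on that: OpenZFS's ARC has traditionally defaulted to up to 50% of host RAM, so a 128 GB node could hand as much as 64 GB to the cache. Capped at, say, 8 GB, that frees roughly 56 GB, i.e. about fourteen additional 4 GB plans per node. All figures are hypothetical.)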

Good for the OP if he can find a niche and make money there. I think it's safe to say that in that market segment best practices are often not followed and the best technologies are not used.

PS: Credit card fraud is much higher in the low-budget VPS segment than in the B2B one.


 
I have seen that most hosting providers that sell VPS on Proxmox use software RAID with LVM-thin instead of ZFS. On a few hosting forums I have seen discussions where hosting providers prefer software RAID over ZFS RAID.

VPS hosting, especially in the low-budget market, is a race to the bottom. It's often a loss-leader for CSPs. The providers in question (offering no HA, at best once-a-day backup recovery, and no protection against host failure) have to squeeze every penny out of their infrastructure.

Could it be that's why PVE is popular with them in the first place?
 
Could it be that's why PVE is popular with them in the first place?
Maybe; I don't have stats on what is being used as the hypervisor there. PVE is free, and I don't see providers solely focused on low-budget VPS buying a subscription either.
Why would that be? It's not like AWS/GCP/Azure run somehow more extensive checks on their base.
That's what we heard from our customers. I don't have first-hand experience.


 
