Hi,
I'm looking for suggestions on a system we want to improve.
We want to run PVE on a dedicated server hosting the services described below. Up until now we have used Hetzner and are fairly happy with the service and costs.
This is what we use for our data analytics stack:
Base system (Debian+Proxmox) running:
* Router VM (firewall/routing services): connects the file/web server to the world and offers SSH/(Open)VPN access for development.
* ETL: some daily/weekly loads performed over night (spread out), usually idle during the day.
* Fileserver: the ETL server pulls its files from here, as do various clients (all sorts of access methods).
* BI host: some BI tool like Tableau, MicroStrategy, Jaspersoft, etc.; no direct outside connection, but linked to the web server and database.
* Webserver: main connection between the BI tool and the world.
* 0-3 hosts offering additional services, like various web services, etc. Because some of the BI tools need fancier testing features, we don't "taint" the BI host/main webserver with these. Usually low load/requirements.
DB-system:
DWH running PostgreSQL on (minimal) CentOS/Debian. Connections are only allowed from the base system via SSH/Postgres. A typical installation runs 4 databases with ~400GB total; the largest is ~160GB, with data stored mostly in 1-3 large tables (200M++ rows).
We do have a host with 256GB RAM on which one huge query uses up to 30% of the RAM and 6 cores and still runs ~1 hour. Usual load: typical DWH queries summarizing long time spans across a large number of attributes.
NOTE: co-locate in the same rack (&connect directly) if possible.
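Locking the DB down to the base system, as described above, could look roughly like this in pg_hba.conf. This is only a sketch: the 10.0.0.2 address is a made-up placeholder for the base system's end of the direct link, and scram-sha-256 assumes PostgreSQL 10+.

```shell
# Hypothetical pg_hba.conf fragment: only the base system (assumed to be
# 10.0.0.2 on the direct link) may connect; everything else is rejected.
pg_hba=$(cat <<'EOF'
# TYPE  DATABASE  USER  ADDRESS        METHOD
host    all       all   10.0.0.2/32    scram-sha-256
host    all       all   0.0.0.0/0      reject
EOF
)
printf '%s\n' "$pg_hba"
```

SSH access on top of that would be handled by the usual firewall rules on the DB host, independent of pg_hba.conf.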
General usage pattern is as follows:
* Users push their files to the fileserver.
* Mixed ETL/ELT jobs grab these files and load them into the DB.
* The BI host runs scheduled updates to refresh data for dashboards, reports, etc. after ETL finishes. There are still a couple of ad-hoc queries, but most of the time queries hit data cached in the BI application. Updates are daily/weekly, depending on the client project.
* The webserver (plus other tooling servers) connects to the BI host and is the primary source of information for customers.
Requirements:
Users: usually 5-20, max concurrent usage <=50 users.
Downtime: we can live with 1 day of downtime; up to 3 days is acceptable if ultra-rare (less than once in 3 years). This mostly affects the backup strategy: the 3 days would be the worst-case setup time if the host fails completely, since it would take some time to "acknowledge" that, and a new host would need 1-2 days before it's ready. Gladly not speaking from experience.
Backups: our current setup does not include a hot standby; we back up all databases (pg_dump) at a frequency matching the load frequency (to different locations, including off-site). Similar strategy for the VMs, though the router VM/tooling/webserver get low(er) frequency as there are no "changing" parts. We also back up all relevant configs for the DB/base system, plus a clear install script that uses them.
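The pg_dump routine could be sketched like this (paths and database names are hypothetical placeholders; the actual dump/sync loop is commented out since it needs a running cluster and valid credentials):

```shell
# Sketch of a date-stamped pg_dump routine. BACKUP_DIR and the database
# names are assumptions, not our real values.
BACKUP_DIR=/var/backups/pg

dump_file() {
    # Build the date-stamped file name for one database.
    printf '%s/%s_%s.dump' "$BACKUP_DIR" "$1" "$(date +%F)"
}

# Example invocation, one custom-format dump per database:
# for db in dwh_main dwh_client_a; do
#     pg_dump -Fc "$db" > "$(dump_file "$db")"
# done
# rsync -a "$BACKUP_DIR"/ offsite:/backups/pg/   # off-site copy
```

Custom format (-Fc) keeps the dumps compressed and lets pg_restore pick individual tables later.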
Storage: DB roughly between 300-450GB, while the base system needs 300-600GB.
CPU: single-thread performance is not negligible, as some of the tasks won't run in parallel (on both systems).
* Base system: 4-8 cores for the BI host, the more the merrier; 4+ for web, router, and base. Overprovisioning for ETL/fileserver/additional services is useful, as their load times do not overlap.
* DB system: as much as possible.
RAM: as much as possible for both (going for 128GB each / 256GB total seems to align with Hetzner prices, at least).
* Base system: BI needs some 60GB plus; ETL has been observed to eat up to 10GB, and the rest probably takes some 20GB as well. 128GB should suffice.
* DB system: we would need to start fine-grained measurements, but I'd say 128GB should be enough, probably less.
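For the DB-side RAM question, a hedged postgresql.conf starting point for a host with ~128GB dedicated to PostgreSQL might look like this. These are standard parameter names but rule-of-thumb values, not measurements; they assume few concurrent sessions doing big DWH scans/sorts and should be tuned against the real workload.

```shell
# Rule-of-thumb memory settings for a ~128GB DWH host (assumptions, not
# measured values).
pg_mem_conf=$(cat <<'EOF'
shared_buffers = 32GB            # ~25% of RAM is the common starting point
effective_cache_size = 96GB      # what the planner may assume the OS caches
work_mem = 256MB                 # per sort/hash node; safe with few sessions
maintenance_work_mem = 2GB       # bulk loads and index rebuilds after ETL
max_parallel_workers_per_gather = 4
EOF
)
printf '%s\n' "$pg_mem_conf"
```

Since work_mem applies per sort/hash operation, a value this high only makes sense with the low concurrency described above.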
Issue at hand:
Since setup costs are relatively small, we usually swap hosts from time to time to leverage the monthly fees into a newer hardware generation and to minimize the risk of worn-out hardware.
Until now, Hetzner's PX series was relatively cheap and allowed us to add 4x480GB SSDs along with a BBU-backed hardware RAID. We used two identical hosts for the two systems above.
With pricing shifted and the advent of NVMe, I wonder if this kind of setup is still worth it. We might also fold the DB host into the base system, so we'd only have to deal with one host, with everything else running inside containers.
There are a couple of choices to be made, and I wondered if somebody already has experience running a similar setup and cares to share some info. Most pressing questions (based on Hetzner's services; if anyone can suggest similarly priced services, I'd gladly consider them too. My google-fu didn't turn up anything vastly superior, and Hetzner has never failed us so far):
* Storage:
** All NVMe with ZFS (included in PVE) and raidz?
*** If so, a separate SSD for the PVE system?
** Two NVMe drives as ZIL/L2ARC alongside spinning HDDs or even SSDs? The two NVMe drives would be costly and too large (min 980GB) for this kind of usage; should the PVE system also go there?
* RAM:
** DB/Base will use up at least 128GB, more would most likely be best.
** With ~1TB of storage, how much RAM would I need for ZFS? Found a few rough guidelines, but nothing conclusive.
* CPU: the only question would be whether some configuration (Threadripper, Epyc?) is unsupported, but I doubt that, as the hardware is not exactly fresh.
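On the ZFS questions above: for DB-style random I/O on all-NVMe, mirrors are often preferred over raidz (better IOPS), and the ARC is usually capped so it cannot fight the VMs for RAM. A hedged sketch; the device names are placeholders and the 16GiB ARC figure is an assumption to adjust, not a recommendation:

```shell
# Pool layout sketch (placeholder device names; do NOT run blindly):
# zpool create -o ashift=12 tank mirror /dev/nvme0n1 /dev/nvme1n1
# zfs set compression=lz4 tank
#
# ARC sizing: reserve a fixed slice of RAM for ZFS and leave the rest to
# the guests. Assuming 16GiB for the ARC:
arc_max_gib=16
arc_max_bytes=$((arc_max_gib * 1024 * 1024 * 1024))
printf 'options zfs zfs_arc_max=%s\n' "$arc_max_bytes"
# Append that line to /etc/modprobe.d/zfs.conf and update the initramfs.
```

With the ARC capped like this, the "how much RAM for ZFS" question becomes a deliberate budget rather than a guess.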
As said, I'm thankful for any thoughts on this.
Thanks, Thomas.