Community Help Request, Purchasing Suggestions and Options (All Opinions Welcome)

Austin H · New Member · Apr 10, 2019 · Ontario, Canada
Available Hardware

AMD RIG for Proxmox

INTEL RIG for Gaming <- don't care, take what you need to upgrade/improve AMD RIG

Cheap Video Cards x2 <- that I can use with Mac OS VMs

I have 3 older HDDs (listed below):
  • Seagate SkyHawk Surveillance 12 TB 3.5" 7200RPM Internal Hard Drive ST12000VE0008-2KW101
  • Western Digital Red Pro 10 TB 3.5" 7200RPM Internal Hard Drive WDC_WD101KFBX-68R56N0
  • Seagate Barracuda 3 TB 3.5" 7200RPM Internal Hard Drive ST3000DM008-2DM166

APC SMT2200C <- for Battery Back-Up in case of power outage

Hardware I Am Considering

Server Plan <- Will buy HDDs as they are needed.

Previous Attempts

My first attempts with Proxmox were rather awful, except that I managed to serve some Dockerized services from my home server for use by friends and family.
It worked for a while, until I decided to attempt a reinstall.

Goals (sorry for the long-winded post)

I have read through the forums, watched videos, and realized I am now trying to accomplish several clear goals.

0. Use my available hardware and maybe buy a few 18TB Seagate Exos drives to build a home server covering pretty much every computing need I have.
  1. Install Proxmox and use SSDs with ZFS to optimize the speed of the VMs/LXCs and operating system while having some redundancy/backup.
    1. I will likely use both 2TB NVMEs to accomplish this but will probably need advice on best practices and where to start.
  2. Use ZFS to create growable storage in some kind of raidz storage pool so more can be added later.
    1. I will likely buy new and use two or more Seagate EXOS Enterprise X18 18 TB 3.5" 7200RPM Internal Hard Drive
    2. I really like the idea of ZFS because I don't want to spend all my money on HDDs until I actually need them.
  3. Decide where the ideal spot for the server is.
    1. I could put my server and battery backup in the basement, but I'm worried about moderate flooding and HDMI access for VMs/GPU passthrough.
      1. Pros: potentially serve data in front of the router; a faster/safer network if I take the time to create and harden an OPNsense or pfSense VM with added WiFi peripherals; out of the way (frees my room/office for other tasks).
      2. Cons: the family network could go down if the server or RIG gets powered off or VMs stop.
      3. Alternative pro: I could create a small office in the laundry room if my family permits.
    2. I could keep my server in my room/office where it is currently.
      1. Pros: spacious, nearby, and my networking largely goes unnoticed by the rest of the household.
      2. Cons: using a 5 GHz WiFi repeater bridge slows down internet connections a bit.
  4. Network based on my decision
    1. If the server ends up in the basement
      1. I'll build a storage unit high up so it does not have water damage if the basement floods.
      2. I'll likely improve our family network security and safety.
      3. I'll have the server directly facing the internet without any routers or subnets in between.
      4. I'll probably pipe the network data through my machine's three gigabit (or more) Ethernet ports.
        1. One is going to be directly from our home fiber optic router in DMZ mode to my server.
        2. One is going to be output from my device to our home network.
        3. One is going to be output to my own router for my own WiFi network (with superior password strength and hopefully security) and access Proxmox's web GUI.
      5. I'll set up Proxmox's built-in firewalls and my own OPNsense or pfSense firewall (because, well... why not?).
    2. If the server ends up staying in my room/office
      1. I mostly know what to do to set it up for networking.
  5. Ideally this build would be easily reconfigured if I moved to a new location where I could just hook it up to the internet and go.
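To make the basement networking plan concrete, here's a rough sketch of how the three-port layout could look in Proxmox's /etc/network/interfaces. The interface names, addresses, and bridge roles below are placeholders I'm assuming for illustration, not a tested config:

```
auto vmbr0
iface vmbr0 inet manual
    bridge-ports enp1s0   # WAN: fiber router in DMZ mode, attached only to the OPNsense/pfSense VM
    bridge-stp off
    bridge-fd 0

auto vmbr1
iface vmbr1 inet static
    address 192.168.1.2/24
    gateway 192.168.1.1
    bridge-ports enp2s0   # LAN: output to the existing home network
    bridge-stp off
    bridge-fd 0

auto vmbr2
iface vmbr2 inet manual
    bridge-ports enp3s0   # my own router/WiFi network; Proxmox web GUI reachable here
    bridge-stp off
    bridge-fd 0
```

One bridge per role keeps the firewall VM as the only thing touching the WAN side.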
Questions
  1. I would like redundancy and performance where possible
    1. After reading the above is ZFS my best option?
  2. I would like security and I am not extremely worried about privacy.
    1. Does anybody have any good tutorials on hardening Proxmox after the initial install and setup of disks, networks/VMs and base software?
  3. I have watched and read pros/cons of using RAM without ECC for ZFS and people discussing several nightmare scenarios, is there any truth to that?
    1. Should I pay for a secondary backup to a data-center or can I get away with a secondary NAS running Proxmox to clone important data to?
    2. What are the chances my 2TB NVME raidz fails?
    3. Should I buy a third 2TB NVME for better raidz failure protection?
    4. What are the chances my 18TB+ HDD raidz fails?
    5. What can I do to monitor and mitigate disk failures in raidz?
    6. Can I really add redundancy and space to ZFS every time I add a new disk to the pool?
    7. Does adding disks require downtime or any kind of system slowdown?
    8. Does ZFS require any software to monitor failures?
    9. Can I use Proxmox's email system to notify me of drive failures? (I know how to add email from my own custom email server (like gmail) etc.)
  4. How many unique HDDs should I eventually have to make raidz safe enough for 1 or 2 HDD failures?
  5. How many unique SSDs should I eventually have to make raidz safe enough for 1 SSD failure?
  6. From experience, does anybody have a good idea where the best place to locate a home server is so I can still use GPU passthrough? I think the basement office is my best bet.
    1. How can I use a VM firewall and still give Proxmox an accessible IP Address?
    2. If I use Docker on the host, how easy is it to route traffic through OPNsense to the host?
  7. Docker on the host or in a VM like Alpine?
    1. I have done both but I wanted to hear professional opinions on which is smartest/best.
  8. What can I do to make my Proxmox build easily reconfigured if I moved to a new home? DHCP settings and more etc?
Things I Will Need Help With (advice, links, videos or tutorials are good too!)
  1. Deciding from a professional standpoint (or DIY experienced people) where I should put the server
  2. Installing Proxmox with ZFS support on 2-3 2TB Samsung NVMEs. (will buy a 3rd if it's recommended/feasible)
  3. Deciding how many 18TB hard drives I will need at minimum to start the ZFS drive pool.
  4. Adding a ZFS HDD storage pool for 2-4 18TB Seagate Exos HDDs
  5. Deciding on best networking plan for the server's location.
  6. Installing the server in its location (discussing with family first)
  7. Adding the best networking plan for the server's location.
  8. Adding firewalls and hardening Proxmox's network. (using OPNsense if it's recommended)
  9. Deciding where to put Docker such as the host or in a small VM like Alpine.
  10. Adding docker to host or VM.
  11. Deciding how to backup and restore Docker data/configs if needed (especially if they're on the host machine)
  12. Preparing automated backup (cronjob) of Docker data/configs
  13. Deciding if I should build another Proxmox server for a secondary backup with 18-36TB mirrored storage.
  14. Deciding if I should pay for some 18-36TB truly remote storage (or use the google account script).
  15. Choosing remote storage or secondary backup (if needed)
  16. Building the project and assessing for changes.

Things I Have Decent Knowledge About
  1. Basic knowledge of installing Proxmox.
  2. Basic knowledge of enabling IOMMU and GPU passthrough.
  3. Setting up a Windows VM.
  4. Setting up a Mac OS VM.
  5. Setting up docker in VM or on the Host.
  6. Basic knowledge of networking.
  7. Basic knowledge of firewalls and security.

Summary TL;DR

I am just trying to find good advice on setting up my custom home server plan (will buy HDDs as they are needed).
I have read through some of the documentation and run Proxmox installs a bunch of times, but I need some experienced advice.

I would really like to have the community or a professional opinion on what I can do with my hardware and Proxmox.
Any advice, ideas, cost cutters, limitations you see and best practices would really help!
 
You can't add single drives to an existing raidz vdev later to grow a ZFS pool. For best performance you should try striped mirrors (like raid10). You can stripe as many mirrors as you like: 2, 4, 6, 8, ... drives, but you will always lose 50% of capacity. Striped mirrors are expandable.
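To make the expandability concrete, here's a minimal sketch of growing a striped-mirror pool (pool and device names are placeholders):

```shell
# start with one mirrored pair (raid1)
zpool create tank mirror /dev/sda /dev/sdb

# later: stripe in a second mirrored pair (now effectively raid10),
# done online with no rebuild of the existing data
zpool add tank mirror /dev/sdc /dev/sdd
```

Each `zpool add` of a mirror pair grows both capacity and performance.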

Keep in mind that your consumer SSDs may not last long. I also bought two "Samsung 970 Evo 500 GB M.2" drives and removed them after some weeks because they wouldn't have survived a year. Enterprise SATA/U.2 SSDs and ECC RAM are recommended.

ZFS creates checksums for every data block to verify data integrity. From time to time it recalculates the checksums of every data block (called scrubbing); if a data block no longer matches its checksum, the data is corrupted. If you have parity data (raidz or mirroring), ZFS will repair the damaged data block. This way you get a self-healing filesystem. So everything relies on checksums. But those checksums are calculated by the CPU, and the data blocks sit in RAM while the checksums are being created; if there is a RAM error, a data block gets damaged before its checksum is calculated. So you get a valid checksum for an already damaged data block. To ZFS everything looks fine, but that data block got corrupted. That's called bit rot, and without ECC it can't be detected or fixed.
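For reference, the scrub described here is triggered and inspected with the standard zpool commands (the pool name is an example):

```shell
# re-verify every block's checksum in the background
zpool scrub tank

# watch progress; the output lists scanned/repaired bytes and
# per-device read/write/checksum error counters
zpool status tank
```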

It will run without ECC RAM, and the chances that you lose a complete pool are minimal, but without ECC you will never know if ZFS's checksums are valid, because errors in RAM can't be detected or fixed. If you don't know whether the checksums are right, you don't know whether a file is damaged, and if you don't know whether a file is damaged, ZFS can't fix that file. 99.9999% of files will have valid checksums and it's not a big problem, but something will get corrupted, and without ECC you will never know it and ZFS won't be able to fix it. And backups won't help against bit rot: if you don't know a file got corrupted, you are backing up the damaged file, and if you overwrite your old backups with new ones you will replace a healthy file with a damaged one.
 
So you're saying raidz10 is the same as raid10, and I will need to rebuild the storage pool/array every time I want to expand my storage limit?
Also are there any brands of ECC RAM and server grade SSD you recommend?
 
You have written the longest and most detailed post I have ever read, I think, @Austin H :D

Tons of questions to answer but not from mobile.
As a first step in better understanding I'd like to point you to this URL:
https://louwrentius.com/the-hidden-cost-of-using-zfs-for-your-home-nas.html

It helped me with my decisions.

I really think ZFS is awesome, but it has its downsides. I experienced some heavy trouble with it a year ago when one disk failed and the other started producing read errors on rebuild (called resilver in ZFS terms).
As I have no high performance requirements, I decided to go for raid-z3, so any 3 disks in my array can fail. But that configuration is kind of a dead end. With mirrors you can easily add one mirror at a time (as I did); with raidz you should always add the same type of vdev to the pool.

BTW, you only use the "z" for the RAID 5/6/7 equivalents:
Raid-z1 -> RAID 5 (single parity)
Raid-z2 -> RAID 6 (double parity)
Raid-z3 -> "RAID 7" (triple parity)

I would also highly recommend some decent gear with ECC memory. I am currently wearing down some consumer-grade SSDs (because I have them, they are old, and I wouldn't get any money for them anymore), and boy, they wear fast.

In the end everything depends on your requirements...
 
I guess I should maybe narrow the focus of my threads from now on.

I really misunderstood several things about how servers work and the essential differences between server grade hardware and consumer grade. It seems I have blown most of my loose dollars on unusable consumer hardware.

At this stage I think my best bet is earning some loose cash in the meantime and searching ebay for used server hardware or buying a used/prebuilt NAS maybe.

There seems to be tons of used server hardware I can tinker with on ebay.
 
Your Threadripper is still a great CPU for a Proxmox host. But you might want to pair it with a server/workstation motherboard with lots of RAM slots, PCIe slots that are fully electrically connected (later you may want to add more U.2 SSDs, GPUs, 10G/1G NICs, SATA HBAs and so on for PCI passthrough), IPMI, and ECC RAM support.

VMs can't share passed-through hardware (with some exceptions, like multiport NICs where you can pass through single ports), so if you want to use GPU hardware acceleration in multiple VMs you will need to buy multiple GPUs. I always wish I had more PCIe slots to add more hardware.
Second-hand server hardware is fine, but keep in mind that rack servers are as loud as a jet and you don't want them near you. Put them in the basement, or choose a quieter tower server/workstation with 120mm fans.

Consumer-grade SSDs are problematic because ZFS and virtualization in general cause a lot of write amplification. For every 1GB my VMs write, 30GB are written to the NAND flash of the SSDs, so they wear quite fast without writing much data. My home server writes 700GB per day to the SSDs while idling, and 90% of those writes are just metrics/logs stored in MySQL and Elasticsearch databases.
Another problem with consumer-grade SSDs is that they don't have power-loss protection built in. Without it they can't use the RAM cache for sync writes, and without RAM caching your write amplification might explode: if a DB writes 100x 4kb per minute, it might end with the SSDs writing 100x 2MB per minute to store that. So if you are running databases or similar workloads with a lot of small sync writes, you really want an SSD with power-loss protection.

I would use two small (32GB is enough) SATA SSDs as system drives in a ZFS mirror (raid1).
Also two big SSDs (U.2, SAS, or SATA) as storage for all your VMs and LXCs, again as a ZFS mirror (raid1).
And if you want some kind of NAS with slow but big and secure storage, I would buy a PCIe HBA, connect the HDDs to that HBA, and pass the HBA through to a NAS VM running TrueNAS or something like that. That way you avoid the virtualization overhead because the VM can physically access the HDDs.
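Passing a whole HBA through to a NAS VM is a one-liner once IOMMU is enabled; the VM ID and PCI address below are examples:

```shell
# find the HBA's PCI address, e.g. 01:00.0
lspci | grep -i -e sas -e hba

# hand the whole controller (and every disk on it) to VM 100
qm set 100 -hostpci0 01:00.0
```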
If you want performance and don't care about efficiency, you could start with 2 or 4 HDDs as a mirror (raid1) or striped mirror (raid10). You would lose 50% of capacity and only 1 drive may fail (depending on the size of your striped mirror, more drives may fail without damage, but if the wrong 2 drives fail, everything is lost). Besides the high performance, it is nice because you can later add new drives in pairs of two to increase capacity and performance.
If you want to save some capacity, you could use raidz1/raidz2/raidz3. I wouldn't use raidz1 (raid5) for a very big pool unless you back everything up. It would take days of extreme disk torture to resilver that pool if an HDD fails, and if one of the remaining HDDs encounters errors during that time, you lose data.
It's more secure to use raidz2 (raid6), because any 2 drives may fail. But with raidz2 you lose 2 drives to parity. So just buying four 18TB HDDs would be pointless, because 2 of the 4 drives would be used for parity and you would only get 50% of the capacity: the same as a striped mirror (raid10), but slower. The more drives you add, the better the capacity efficiency. With raidz2 and 6 drives you would only lose 33% of capacity; with 8 drives, only 25%. If you really want to use raidz, I would buy at least 3 HDDs for raidz1 or 6 HDDs for raidz2. If you don't need that much capacity, it would be a better choice to buy more but smaller HDDs.
But because you can't add drives to a raidz vdev later, you should plan accordingly so the pool is big and efficient enough from the start.
If you regularly back everything up to a second NAS (ZFS replication), it would be fine to just use 3 drives (33% capacity lost) or, better, 5 drives (only 20% lost) with raidz1. Your pool would be more likely to fail, but if it fails it isn't a problem, because you have an identical copy of everything on the backup NAS.
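The ZFS replication mentioned here boils down to snapshot plus send/recv. A hand-rolled sketch, where the dataset, pool, and host names are placeholders (Proxmox and TrueNAS can also schedule this for you):

```shell
# first run: full copy of a snapshot to the backup NAS
zfs snapshot tank/data@monday
zfs send tank/data@monday | ssh backup-nas zfs recv -F backup/data

# later runs: incremental, sending only blocks changed since last time
zfs snapshot tank/data@tuesday
zfs send -i @monday tank/data@tuesday | ssh backup-nas zfs recv backup/data
```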

Also keep in mind that ZFS needs about 20% of the pool free at all times or it will get slow. And snapshots need space too.
For example with three 18TB HDDs as raidz1:

54TB total raw capacity
- 18TB parity data
--------------------------
36TB
- 20% free space for ZFS operations
--------------------------
28.8TB

So you get 28.8TB of usable space. Depending on how long you want to keep your snapshots, it is a good idea to reserve 33 to 66% of that capacity for them. With 28.8TB of usable capacity, you would ideally use only 14TB of it, so 14.8TB remain available for snapshotting.

And RAID never replaces a backup, so you should buy everything twice to be able to replicate the pool to a backup NAS.
So you would need six 18TB HDDs, with 108TB of raw capacity, to be able to store 10 to 28.8TB of data.
 
I really misunderstood several things about how servers work and the essential differences between server grade hardware and consumer grade.
Sometimes there are reasons why stuff is more expensive ;)

At this stage I think my best bet is earning some loose cash in the meantime and searching ebay for used server hardware or buying a used/prebuilt NAS maybe.
There seems to be tons of used server hardware I can tinker with on ebay.
Make yourself an investment plan. As @Dunuin mentioned, the CPU might be fine, and one or another disk/SSD will get you at least some time, but you need to be aware of what could be coming.
There is a ton of stuff out there on eBay, but it might not be money well spent either. The gear you get at a decent price is usually too old to be compatible (or fully compatible) with the stuff you have: not enough PCIe lanes, DDR3 instead of DDR4, etc.
So think things really through before you jump into something that sounds cheap but will paint you into a corner...


So its the same as a stripped mirror (raid10) but slower.
... except that any 2 of the disks could fail. In the end the question is: what is your data worth to you?
I had exactly that situation: 1 disk went boom and the other created read errors. Whole pool containing 6 mirrored devices lost. Damn you, mirror!

Aside from that, I do agree with you.
 
Yep. The point was more that for raidz2 it is quite pointless to buy 4x 18TB and lose 36 of 72TB to parity when you could buy 6x 12TB and only lose 24TB of 72TB, or 8x 9TB and lose only 18TB of 72TB. Fewer drives are good because of the lower power consumption and fewer drives that could fail, but at the cost of capacity efficiency.
 
That's quite the post. Wow... lots of thoughts coming to mind...

If you want to build a hackintosh, do that as a separate matter. I would not advise running your various server/services/firewall ambitions from the same box, as you're likely to have lots of outages impacting your environment by doing this as you tinker your way through fixing issues after updates, etc.

Run server services from a stable platform configured and used within its intended purpose. Do all the PCIe passthrough experimentation and hackery on a separate box. Also... consider Unraid for that; it might be a better starting point.

Consider building a hyperconverged cluster from used server gear. For what you're planning to spend on HDDs alone, I bet you could be 60-70% of the way to a decent 4-node cluster with eBay server stuff. I built my home-brew 4-node cluster back in late 2019 from SuperMicro Ivy Bridge generation (X9) Fat Twins, used quad-port Gb Ethernet cards, old 3Com managed switches, some PCIe-to-SSD adapters, 2.5"-to-3.5" trays, a mix of various HDDs and SSDs, and a StarTech adjustable rack. I think in total I have ~$3500 into it, give or take (can't recall exactly).

The cluster has a total of 48 cores, 320GB RAM, 12X 500GB SSDs, 12 X 4TB HDDs. I have ceph configured with some "pool rules" to separate SSD from HDD, so I can allocate virtual disks to the appropriate media, but you don't even have to do that, ceph can actually manage a pool of mixed storage plenty well for home use.

Once you have a working cluster with Ceph, you'll never want to host any service/VM from a single-point-of-failure box ever again. Ceph is absolutely amazing, as is Proxmox's clustering implementation. Need to take a node down for maintenance/upgrades? Need to reboot a node for updates? Need to add more storage to the cluster? You can do all that with no interruption to the VMs running on the cluster by live-migrating VMs from node to node and cycling maintenance through the nodes.

One of the reasons I built this cluster at home was so that I could go through all the learning pains here, and then deploy a cluster at work for an actual production environment serving ~100+ users. I intentionally used old hardware likely to fail in my home cluster to learn how to work through failures. I initially installed a mix of 1TB and 4TB drives, with many of the 1TB drives being very old and likely to fail, and many of the 4TB drives being "consumer" drives that aren't meant for this, to see how Ceph would handle a bad situation. I have had several HDD failures and several major overhauls of the drive configuration (adding drives, replacing small drives with larger ones, replacing failed drives) with no service interruption or data loss. (Just make sure to perform your drive changes methodically, only making changes to 1 node at a time and allowing the cluster to completely rebalance/recover before touching the next node.) With Ceph, the VMs are backed by a truly resilient, self-recovering, and scalable storage solution. My cluster has been running for over a year and currently hosts over a dozen VMs, including pfSense, FreeNAS, Blue Iris, Proxmox Backup Server, a Debian UniFi WiFi controller, Nextcloud, Security Onion manager/search/sensors, BOINC VMs, etc. I would never want to run all this on a single box, but on the cluster I have confidence that a hardware failure, even an entire node failure, won't bring down services (for long) or cause data loss.

You mentioned maybe setting up the server in the basement. Full height 4 post server racks are often cheap/free on local classifieds. keep your eyes peeled. Great way to mount everything up off the ground and make it look nice.

More nodes and more HDD bays means you can buy smaller more cost effective HDDs and have more replicas and better failure tolerance.

Say you set up 4x 18TB drives in some sort of RAID10 thing: you have 36TB of space with 2 copies of the data. If either copy goes bad, do you have any way to prove the remaining copy is good? If data becomes corrupted, how does the system know which copy is right and which is wrong? Will the remaining drive survive the replication of that data to a replacement drive? There's not much in that sort of setup to ensure data integrity.

Ceph brings a lot of peace of mind here because it maintains 3 copies of the data and is always scrubbing, looking for inconsistencies. When a drive starts failing, you'll start getting easy-to-repair inconsistencies: Ceph sees 2 copies that are the same and 1 copy that is different, and therefore has a way to know which data is correct. If the errors keep coming from the same OSD, SMART data usually reveals read errors on the drive causing problems. At that point, you just mark the OSD "out" and the cluster rebalances and recovers to 3 copies of all data elsewhere. Remove/replace the drive, add it as a new OSD, and Ceph rebalances into the newly available space. It's so elegant compared to all the "raid" solutions. I will literally never advise any sort of "raid" solution for server data management. I would rather have a cluster of cobbled-together old hardware with the confidence of quorum-based data integrity than a shiny new single box of enthusiast fancy-pants hardware. Also, since a cluster of 4+ nodes can easily survive a node failure, the need to install each node's boot drive on a RAID1 pair of drives is significantly reduced. I mean, you can if you want, but it's not a requirement.
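The drive-replacement dance described above maps to a handful of commands; the OSD id and device name are examples:

```shell
# mark the failing OSD out; Ceph re-replicates its data elsewhere
ceph osd out 7

# once recovery finishes (HEALTH_OK), remove it from the cluster
ceph osd purge 7 --yes-i-really-mean-it

# after physically swapping the disk, create a new OSD on the new device
ceph-volume lvm create --data /dev/sdX
```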

Admittedly, the cost of used servers is higher now than when I built my cluster. I think covid has had an impact on both supply chains for new servers and demand for more cost-effective solutions, but there are still options out there, and you don't necessarily have to use server gear for this. I see a lot of folks talking about needing server-grade SSDs with full data-path power protection, etc. For home use, the performance penalty of sync writes and the endurance of consumer SSDs have not yet given me any issues, but YMMV. My SSDs say they're at about 3% wearout after a year in the cluster. I would just say avoid QLC NAND for SSDs and SMR recording technology for HDDs. If you find a buying opportunity to pick up a bunch of used enterprise-grade SSDs, go for it; if not, eh, you'll probably be fine without.
 
Great ideas guys.
I agree: I need ECC RAM, ample safe raid10-or-better storage, a full understanding of ZFS, more appropriate server hardware, and a more professional plan of action.

I'll stick to using Proxmox for desktop computing for now.
I have a Mac OS vm and Windows vm I can put to good use for now.

I feel there has to be a good balance between desktop and server computing out there, but maybe I should use a simple NAS for backing my current desktop consumer configuration.

My idea was to have everything in Proxmox with pass-throughs except for one bare metal machine and laptop on the side.

What do you guys think of me buying a pre-built NAS for now just because after reading the literature I don't feel safe using my consumer PCs for data storage?

Any recommendations or suggestions for a cheap NAS?

Plan A

Buy or build a cheap NAS and use regular consumer grade hardware with backups to my NAS.

Plan B (These are two example builds)

First I would sell my AMD RIG and maybe my i9 RIG too. Obviously I'll try to get a fair price.
Then I'd use some cash plus the profits from selling all my gaming/consumer gear to build a proper home server setup.
I still have a laptop that dual-boots Linux and Windows, so I should be OK for computing during that sell/buy intermission.
The problem is that I think I will only fetch 2-4 grand maximum on resale of all my gear; maybe 5 grand tops if somebody buys it all outright.
That means I'm still out about $5.6k-6.5k CAD, which would take me 3-4 months to earn, and I won't touch my savings if possible.

Build comparison (PCPartPicker: https://ca.pcpartpicker.com/user/a93/saved/wgr7RB)

Searched build:
  MOBO     MBD-X11SPA-TF         $685.76    PCIe x16 x7
  CPU FAN  NH-D15                $109.50    cooling
  CPU      BX806956238           $3,432.99  22c, 2.1 GHz to 3.7 GHz turbo
  RAM      M393A2K40CB2-CVF      $1,068.00  14.32 ns, 192GB
  M.2      KXG60ZNV1T02          $413.98    ~3000 MBps r/w
  HDD      ST10000NM001G         $2,640.00  ~600 MBps r/w
  Case     PH-ES620PTG_DBK01     $199.99    space
  PSU      PS-TPD-1200MPCGUS-1   $0.00      1200 watts
  Total: $8,550.22 ($9,661.75 with tax)

PCPartPicker build:
  MOBO      $435.00    PCIe x16 x7
  CPU FANS  $219.00
  CPUs      $4,766.04  44c, 2.1 GHz to 3.6 GHz turbo, 88 threads
  RAM       $1,104.00  14.06 ns, 256GB
  M.2       n/a ($0.00)
  HDD       $2,640.00  ~600 MBps r/w
  Case      $199.99    space
  PSU       $0.00      1200 watts
  Total: $9,364.03 ($10,581.36 with tax)

Comparison:
  Searched build: 44 threads (v-cores); 2 cores per machine -> 22 machines; 192GB/22 = ~8.7GB, so ~8GB RAM per machine
  PCPartPicker build: 88 virtual cores; 4 cores per machine -> 22 machines; 256GB/22 = ~11.6GB, so ~10GB RAM per machine

  Searched build pros: cheaper; M.2 disk transfer; I don't really need 22 machines; modern board
  Searched build cons: fewer cores; no CPU redundancy
  PCPartPicker build pros: cost-effective CPU; faster individual machines; more RAM
  PCPartPicker build cons: no M.2; slow base OS (Proxmox); more money

Summary

I'm leaning towards Plan B because I ran into a roadblock today with my AMD machine and nested virtualization for Mac OS VMs.
It turns out nested virtualization just doesn't work for Mac OS with the AMD svm flag; you need the Intel vmx flag.
My i9 could solve the problem, but I realize now it's just not the right setup.
Plus I'm getting older, I don't game very much, and I mostly spend my time working, at school, or trying to code up new ideas.
I guess I finally know what I want for computing, and it's an 18-wheeler or a train, but I'm stuck with two regular sports cars.
If I could go back in time, I would NOT have bought gaming RIGs; I would have bought a nice new server.

Now I just have to think through the problem...
 
That's quite the post. Wow... lots of thoughts coming to mind...

If you want to build a hackintosh, do that as a separate matter. I would not advise running your various server/services/firewall ambitions from the same box, as you're likely to have lots of outages impacting your environment by doing this as you tinker your way through fixing issues after updates, etc.

Run server services from a stable platform configured and used within its intended purposes. Do all the PCIe passthrough experimentation and hackery on a separate box. Also, consider Unraid for that; it might be a better starting point.

Consider building a hyperconverged cluster from used server gear. For what you're planning to spend on HDDs alone, I bet you could be 60-70% of the way to a decent 4-node cluster with eBay server stuff. I built my home-brew 4-node cluster back in late 2019 from SuperMicro Ivy Bridge generation (X9) Fat Twins, used quad-port Gb ethernet cards, old 3Com managed switches, some PCIe-to-SSD adapters, 2.5"-to-3.5" trays, a mix of various HDDs and SSDs, and a StarTech adjustable rack. I think in total I have ~$3500 into it, give or take (can't recall exactly).

The cluster has a total of 48 cores, 320GB RAM, 12x 500GB SSDs, and 12x 4TB HDDs. I have ceph configured with some "pool rules" to separate SSD from HDD so I can allocate virtual disks to the appropriate media, but you don't even have to do that; ceph can manage a pool of mixed storage plenty well for home use.

Once you have a working cluster with ceph, you'll never want to host any service/VM from a single-point-of-failure box ever again. Ceph is absolutely amazing, as is Proxmox's clustering implementation. Need to take a node down for maintenance/upgrades? Need to reboot the node for updates? Need to add more storage to the cluster? You can do all that with no interruption to the VMs running on the cluster by moving the VMs from node to node live and cycling maintenance from node to node.

One of the reasons I built this cluster at home was so that I could go through all the learning pains at home and then deploy a cluster at work for an actual production environment serving ~100+ users. I intentionally used old hardware likely to fail in my home cluster to learn how to work through failures. I initially installed a mix of 1TB and 4TB drives, with many of the 1TB drives being very old and likely to fail, and many of the 4TB drives being "consumer" drives that aren't meant for this, to see how ceph would handle a bad situation. I have had several HDD failures and several major overhauls of the drive configuration (adding drives, replacing small with larger, replacing failed drives), with no service interruption or data loss. (Just make sure to perform your drive changes methodically, only making changes to 1 node at a time and allowing the cluster to completely rebalance/recover before making changes to the next node.) With ceph, the VMs are backed by a truly resilient, self-recovering, and scalable storage solution. My cluster has been running for over a year and I currently host over a dozen VMs from it, including pfSense, FreeNAS, Blue Iris, Proxmox Backup Server, a Debian UniFi Wi-Fi controller, Nextcloud, Security Onion manager/search/sensors, BOINC VMs, etc. I would never want to run all this on a single box, but on the cluster I have confidence that a hardware failure, even an entire node failure, won't bring down services (for long) or cause data loss.

You mentioned maybe setting up the server in the basement. Full-height 4-post server racks are often cheap/free on local classifieds. Keep your eyes peeled. It's a great way to mount everything up off the ground and make it look nice.

More nodes and more HDD bays mean you can buy smaller, more cost-effective HDDs and have more replicas and better failure tolerance.

Say you set up 4 x 18TB drives in some sort of RAID10 arrangement: you have 36TB of space with 2 copies of the data. If either copy goes bad, do you have any way to prove that the remaining copy is good? If data becomes corrupted, how does the system know which copy is right and which is wrong? Will the remaining drive survive the replication of that data to a replacement drive? There's not much in that sort of setup to ensure data integrity.

Ceph brings a lot of peace of mind to the table here because it maintains 3 copies of data and is always scrubbing, looking for inconsistencies. When a drive starts failing, you'll start getting easy-to-repair inconsistencies: ceph sees 2 copies that are the same and 1 copy that is different, and therefore has a way to know which data is correct. If the errors keep coming from the same OSD, SMART data usually reveals read errors on the drive causing the problems. At this point, you just tell the OSD that it's "out", and the cluster rebalances and recovers to 3 copies of all data elsewhere. Remove/replace the drive, add it as a new OSD, and ceph rebalances into the newly available space. It's so elegant compared to all the "raid" solutions. I will literally never advise any sort of "raid" solution for server data management. I would rather have a cluster of cobbled-together old hardware with the confidence of quorum-based data integrity than a shiny new single box of enthusiast fancy-pants hardware. Also, since a cluster of 4+ nodes can easily survive a node failure, the need to install the node's boot drive on a RAID1 pair of drives is significantly reduced. I mean, you can if you want, but it's not a requirement.
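The majority-vote idea described above can be sketched in a few lines of Python. This is a toy illustration of why 3 replicas can self-repair while a 2-copy mirror can only detect a mismatch; it is not ceph's actual scrub implementation (which compares checksums across OSDs):

```python
from collections import Counter

def repair_replicas(replicas):
    """Given replica byte strings, return (correct_value, bad_indices).

    With 2 matching copies and 1 mismatch, the majority wins. A 2-copy
    mirror has no majority: it can see the copies differ, but cannot
    tell which side is the good one.
    """
    counts = Counter(replicas)
    correct, votes = counts.most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no quorum: all copies differ")
    bad = [i for i, r in enumerate(replicas) if r != correct]
    return correct, bad

# One replica returns corrupted data for an object:
good, bad_osds = repair_replicas([b"data", b"data", b"dat\x00"])
# good == b"data"; bad_osds == [2] -> rewrite replica 2 from a good copy
```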

Admittedly, the cost of used servers is higher now than when I built my cluster. I think COVID has had an impact on both supply chains for new servers and demand for more cost-effective solutions, but there are still some options out there, and you don't necessarily have to use server gear for this. I see a lot of folks talking about needing server-grade SSDs with full data-path power protection, etc. For home use, the performance penalty of sync writes and the endurance of consumer SSDs have not yet given me any issues, but YMMV. My SSDs say they're at like 3% wearout after a year in the cluster. I would just say avoid QLC NAND in SSDs and avoid SMR recording technology in HDDs. If you find a buying opportunity to pick up a bunch of used enterprise-grade SSDs, go for it; if not, eh, you'll probably be fine without.

Plan C

Put together a battle-ready eBay rig with Ceph (instead of ZFS), if I understand you correctly?
As long as I can back up my VM hackintosh in a virtual environment, I'm happy, and downtime/uptime doesn't matter. I plan to use my server rig(s) to do a bit of regular computing too.

I have a few questions for Plan C though: how many VMs/LXCs have you lost, and how much data have you lost to this project, especially after you thought you had everything nailed down in terms of Ceph and redundancy?

I ask because a member of my family runs a small business and wants to run Windows Server for two clients on the node/cluster once I get it up and running, or maybe just two remote desktop clients with Windows 10 and shared storage?

Also, can GPU passthrough assist with remote rendering of graphics? I may want to sell one of my screens, and this way I wouldn't have to migrate my office to the basement with the server rig(s).

All this service redundancy you speak of: is it difficult to set up? Any chance you could point me in the right direction for documentation?

I'm a bit new to this idea, so I would really need the forum's/community's backing if I went eBay fishing.

I can sell my gaming gear for this!!! $3500-$4000 sounds perfect to me :)

I'm looking for cheap server racks right now.... :)
 
The problem with ceph is that you need to run multiple hosts all the time, and older second-hand servers are power hungry. The upkeep is quite high, and you will pay an additional 100-200 bucks each month on the energy bill to keep them running. Because of that, this was never an option for me.
 
> The problem with ceph is that you need to run multiple hosts all the time, and older second-hand servers are power hungry. The upkeep is quite high, and you will pay an additional 100-200 bucks each month on the energy bill to keep them running. Because of that, this was never an option for me.
My cluster hums along at about 400W most of the time (includes the switches). That's about $30/mo where I live. In Hawaii it would be closer to $100/mo....
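The power-cost arithmetic behind those figures is simple enough to sketch. The $0.10/kWh and $0.27/kWh rates below are illustrative assumptions chosen to roughly reproduce the $30/mo and ~$100/mo figures, not rates quoted in the posts:

```python
def monthly_cost(watts, rate_per_kwh, hours=24 * 30):
    """Monthly electricity cost for a load drawing `watts` continuously."""
    kwh = watts / 1000 * hours  # 400 W for 720 h = 288 kWh
    return kwh * rate_per_kwh

print(round(monthly_cost(400, 0.10), 2))  # ~$28.80/mo at $0.10/kWh
print(round(monthly_cost(400, 0.27), 2))  # ~$77.76/mo at $0.27/kWh
```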
 
> Plan C
>
> Put together a battle-ready eBay rig with Ceph (instead of ZFS), if I understand you correctly?
> As long as I can back up my VM hackintosh in a virtual environment, I'm happy, and downtime/uptime doesn't matter. I plan to use my server rig(s) to do a bit of regular computing too.

Ceph requires a cluster of computers; 4 nodes is a good minimum cluster size.

> I have a few questions for Plan C though: how many VMs/LXCs have you lost, and how much data have you lost to this project, especially after you thought you had everything nailed down in terms of Ceph and redundancy?

As I said, I have had many HDD failures, added drives, removed drives, and performed several major configuration changes (both physical and logical) of the ceph pools over the last year, but I have not lost any data or VMs through all of that. The VMs are not impacted by it. That's why I'm such a huge fan of ceph. Even when I go out of my way to try to give it the worst possible hand (old dying drives, non-server-grade drives, mixed sizes/speeds, everything against the advised approach), it STILL manages to sort it out and always protect the data.

> I ask because a member of my family runs a small business and wants to run Windows Server for two clients on the node/cluster once I get it up and running, or maybe just two remote desktop clients with Windows 10 and shared storage?

I would not advise offering to host services for a business from your house unless you:
A: protect yourself from liability and have lots of practice/confidence maintaining a cluster;
B: define who is responsible for the maintenance of the VMs themselves;
C: deploy an offsite backup of the VMs;
D: have enough internet upstream bandwidth available to host server services;
E: have very good networking skills and the ability to define separate subnets/VLANs and appropriate routing/firewall rules between them.

> Also, can GPU passthrough assist with remote rendering of graphics? I may want to sell one of my screens, and this way I wouldn't have to migrate my office to the basement with the server rig(s).

I would not advise messing with any PCIe passthrough on a cluster configured for hosting server services. The point of hosting VMs on a hyperconverged cluster is to abstract the hardware away from the physical even further than traditional VMs on a single server. In a properly configured cluster with software-defined storage across all the nodes, the VM is no longer "tied" to any specific node; it can float from node to node as needed to stay up and running through underlying maintenance to the nodes. As soon as you start doing unique PCIe passthrough, you're sort of defeating the point of a cluster and introducing variables from node to node that will make setup/management/maintenance of the nodes in the cluster more difficult.

Do your various PCIe passthrough experimentation on a separate project box. I don't know if Proxmox is the best hypervisor to do that experimentation with. Unraid is very popular for doing hackintosh VMs, and I believe many people have got it working on AMD platforms with a kernel mod.

> All this service redundancy you speak of: is it difficult to set up? Any chance you could point me in the right direction for documentation?
>
> I'm a bit new to this idea, so I would really need the forum's/community's backing if I went eBay fishing.
>
> I can sell my gaming gear for this!!! $3500-$4000 sounds perfect to me :)
>
> I'm looking for cheap server racks right now.... :)

With the Proxmox web interface, ceph is very easy to set up and manage. Just read the Proxmox ceph documentation.

Look for server hardware that comes with HDD controllers that operate in "direct attach" IT/HBA mode, with no RAID. Some controllers have been flashed into IT mode by eBay resellers, others may require you to do this yourself, some don't support it, and some are this way out of the box. My Fat Twin servers just have the chipset's SATA controllers connected directly to the backplane, so there's no hardware RAID in the way of anything, but many servers from yesteryear were designed on the assumption of a local hardware RAID solution.
 
> My cluster hums along at about 400W most of the time (includes the switches). That's about $30/mo where I live. In Hawaii it would be closer to $100/mo....
I want those prices too o_O. Here the 400W would be $107 per month. I'm paying €50 for just 2 servers.
 
> Any recommendations or suggestions for a cheap NAS?
Grab a decent server (meaning 12+ drive bays), install Linux and ZoL (ZFS on Linux), and do it on your own ;)
It requires some learning but will educate you and provide the best flexibility. I ended up in this place myself, and I was really annoyed to have spent a lot of time and money beforehand trying to get the "right NAS".
 
IMO the best "NAS" is a virtualized instance of TrueNAS or a similar NAS OS, running on a Proxmox cluster with ceph-backed storage.

Case in point: let's say you're thinking about an update from FreeNAS version 11 to TrueNAS version 12. You've read that this doesn't always go smoothly. On bare hardware, if it doesn't work, you might be rebuilding the NAS. With a VM on a ceph-backed cluster, with Proxmox Backup Server, you can easily have incremental backup snapshots of the NAS waiting to go back to if the update doesn't work.
 
> IMO the best "NAS" is a virtualized instance of TrueNAS or a similar NAS OS, running on a Proxmox cluster with ceph-backed storage.
>
> Case in point: let's say you're thinking about an update from FreeNAS version 11 to TrueNAS version 12. You've read that this doesn't always go smoothly. On bare hardware, if it doesn't work, you might be rebuilding the NAS. With a VM on a ceph-backed cluster, with Proxmox Backup Server, you can easily have incremental backup snapshots of the NAS waiting to go back to if the update doesn't work.
That is in general not the biggest problem. It works with FreeNAS on bare hardware too, because you can switch between OS images, and every configuration is stored in a single file you can export/import. The only problem would be if you upgraded your ZFS pool too, because that is a one-way road.
Doesn't ZFS on top of ceph cause a lot of overhead? If ceph is keeping 3 copies of everything and ZFS is keeping 2 copies of everything, shouldn't there be 6 copies of everything stored?
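The overhead concern is easy to quantify: replication factors stack multiplicatively, so usable capacity is raw capacity divided by their product. A quick sketch with toy numbers, assuming ceph 3x replication underneath and a 2-way ZFS mirror inside the guest:

```python
def usable_tb(raw_tb, *replication_factors):
    """Usable capacity when replication layers stack multiplicatively."""
    total = 1
    for f in replication_factors:
        total *= f
    return raw_tb / total

print(usable_tb(36, 3))     # ceph 3x alone: 12.0 TB usable
print(usable_tb(36, 3, 2))  # ceph 3x + ZFS mirror on top: 6.0 TB usable
```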
 
> That is in general not the biggest problem. It works with FreeNAS on bare hardware too, because you can switch between OS images, and every configuration is stored in a single file you can export/import. The only problem would be if you upgraded your ZFS pool too, because that is a one-way road.
> Doesn't ZFS on top of ceph cause a lot of overhead? If ceph is keeping 3 copies of everything and ZFS is keeping 2 copies of everything, shouldn't there be 6 copies of everything stored?
There's no requirement in TrueNAS to configure a storage pool over multiple disks. When running TrueNAS as a VM, I just give it a boot disk and a bulk storage disk for each pool I want to define within it. I'll have to check, but I don't recall if I'm using ZFS within TrueNAS... likely not.
 
I thought FreeNAS always uses ZFS. At least I never saw a GUI option to create something else, and many of the FreeNAS features wouldn't work without it, like snapshots, replication, and so on. Is it still possible for ceph to detect and repair errors if ZFS is running on top? At least without any RAID, ZFS itself can't repair errors.
 
