Threadripper Proxmox build, need suggestions

jena

Member
Jul 9, 2020
[cross posted on STH]
I work in a research lab that recently purchased a Threadripper 3970X workstation from a system integrator.
It was a much better deal than a Dell Intel box, which would have cost us twice as much.
Its role is to run Proxmox as the base hypervisor, hosting multiple Windows and Linux VMs for scientific computation.

Build:
Threadripper 3970x
MSI TRX40 (two M.2 slots on the motherboard; came with a dual-M.2 PCIe add-in card)
256GB non-ECC 3200MHz CL16 RAM (no ECC option from this S.I.)
Single RTX 2080 Ti (we will add an Ampere RTX 3080-class card later)

The case is a Corsair 110Q, which only has two 3.5in bays and one 5.25in bay.
I can use an adapter to convert the 5.25in bay to 3.5in.
Not sure how to jury-rig a fourth 3.5in HDD into it.

Plan to:
  1. ASUS Hyper M.2 x16 Gen 4 card for four more NVMe drives, six in total (we won't use MSI's dual-NVMe card).
  2. [VM-pool] 6 x 2TB WD SN750 NVMe SSD at about $300 each (RAID10 or RAIDZ2, for VMs)
  3. [BootDrive] 2 x 1TB Samsung 860 Evo as Proxmox boot drives (RAID1 mirror at install) plus some ISOs (I know it's way too big, but I have heard that Proxmox wears out boot drives fast?)
  4. [BulkDataPool] 2 x 12TB HGST UltraStar as bulk data storage (RAID1 mirror as one vdev, with more vdevs added later)
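For reference, here is a minimal sketch of that pool layout. The device names are placeholders, not my real disks; in practice you'd use the stable /dev/disk/by-id paths:

```shell
# VM pool as three striped mirrors (RAID10-style) over the six NVMe drives.
# ashift=12 assumes 4K-sector devices.
zpool create -o ashift=12 vmpool \
  mirror nvme0 nvme1 \
  mirror nvme2 nvme3 \
  mirror nvme4 nvme5

# Bulk pool as a single mirror vdev; more mirror vdevs can be added later.
zpool create -o ashift=12 bulkpool mirror hdd0 hdd1
# later expansion:
#   zpool add bulkpool mirror hdd2 hdd3
```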
My question:
Based on my use case: roughly 20 VMs running simultaneously (not all at full load), with around 2-3 VMs under heavy CPU and GPU load at a time.
  1. Can I skip a ZIL (SLOG) on the all-SSD VM pool?
    I don't see any performance gain from a separate ZIL there, and it either doubles the write wear on the SSDs or forces me to buy separate Optane drives for it.
  2. Should I use RAID10 or RAIDZ2 for the all-SSD VM pool? I think RAIDZ2 has better redundancy, but I have heard it has performance gotchas with small-block workloads (not sure whether VM disks count as small-block). With RAID10, if two drives fail within the same vdev (especially during a rebuild), the pool falls apart.
  3. Should I use the huge 1TB RAID1 boot drive as L2ARC for the BulkDataPool?
  4. Does Proxmox wear out boot drives fast?
  5. Does ZFS on Proxmox support TRIM for SSDs?
  6. Since I only have 256GB of RAM for the entire system, mostly reserved for computations: is it sufficient to give the VM pool an 8GB ARC and the BulkDataPool a 16GB ARC?
Any other suggestions?
 
1) Yes. There are cases where you would want one, but those involve pairing a PCIe/NVMe Intel Optane with slower SSDs (SATA, for example). A separate ZIL mainly pays off on spinning disks, where sync writes would otherwise be written twice to the same slow vdevs.
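And a SLOG isn't a day-one decision anyway: it can be attached and detached later without data loss. A sketch, with a hypothetical Optane partition name:

```shell
# Attach an Optane partition as a SLOG to the bulk pool later, if needed:
zpool add bulkpool log /dev/disk/by-id/nvme-optane-part1

# It can also be removed again cleanly:
zpool remove bulkpool /dev/disk/by-id/nvme-optane-part1
```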

2) The reason you'd consider RAIDZ2/3 is when the expected error rate is high relative to the time it takes to replace and resilver a new device. This is especially relevant with >2TB spindle drives. The performance penalty you pay for RAIDZ2 is the insurance premium for surviving a second disk failure, which in a RAID1+0 setup would be fatal if it hit the same mirror group.
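The RAIDZ2 variant of the same six-drive VM pool would look like this (placeholder device names again):

```shell
# One RAIDZ2 vdev over all six NVMe drives: any two drives can fail,
# regardless of which ones, at the cost of worse small-block/IOPS behavior
# than striped mirrors.
zpool create -o ashift=12 vmpool raidz2 \
  nvme0 nvme1 nvme2 nvme3 nvme4 nvme5
```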

3) I would split that [BootDrive] into my "usual" SSD+HDD layout:
- 30GB boot disk + /
- 4GB ZIL
- 16GB left empty (never used, as over-provisioning headroom to help with ZIL wear; those sizes were for a 240GB/480GB SSD setup)
- 64-128GB L2ARC (remember, the L2ARC consumes system RAM, so the bigger you make it here, the more system RAM is consumed too)
- Remainder: [ExtraSSD-pool]. I typically used this for VM root disks, but depending on the actual VMs, I saw a case *once* where the VMs were hammering the consumer-grade SSDs and I had to move those off to NVMe.
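That layout could be cut with sgdisk roughly like this; /dev/sdX and the sizes are illustrative, scaled to a 1TB drive rather than the 240/480GB drives the sizes above came from:

```shell
# Hypothetical partitioning of one 1TB boot SSD (/dev/sdX):
sgdisk -n 1:0:+30G   -t 1:8300 /dev/sdX   # boot + /
sgdisk -n 2:0:+4G    -t 2:bf01 /dev/sdX   # ZIL/SLOG partition
# '+16G' start offset leaves a 16G unpartitioned gap as OP headroom:
sgdisk -n 3:+16G:+128G -t 3:bf01 /dev/sdX # L2ARC
sgdisk -n 4:0:0      -t 4:bf01 /dev/sdX   # remainder: extra SSD pool
```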

4) Well... Proxmox does write its RRD logs at a steady rate, so there will be continuous IO, and there is usually (or could be) a VM doing similarly bad things. That said, I've run servers for >3 years without SSD boot-drive problems on Proxmox. Lately I'm preferring ZFS roots and haven't seen any problems there either. I have a ~5-year-old system hammering ZFS on SSDs in a setup like (3), but with 400GB SSDs, and they are still going along. It will also depend on the quality of the SSDs installed.

5) TRIM: yes, ZFS does support it, BUT the biggest issue is when a hardware RAID controller sits in front of those SSDs; then TRIM will most probably NOT work. I know that for a fact with Samsung EVOs behind a HW RAID controller: even in a "JBOD" setup, the TRIM commands never reach the drives.
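With the drives attached directly (no HW RAID in the way), TRIM on ZFS is just a pool property or an on-demand command:

```shell
# Continuous TRIM as blocks are freed:
zpool set autotrim=on vmpool

# Or run a manual/scheduled trim pass:
zpool trim vmpool

# Check per-device trim progress/state:
zpool status -t vmpool
```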

6) Yes, RAM is a problem ;(
You'll have to experiment with the ARC sizes to see the performance impact. But remember: the L2ARC is also hurt by a smaller ARC.
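One thing worth noting: on ZFS-on-Linux the ARC cap is a single system-wide module parameter (zfs_arc_max), not a per-pool setting, so the 8GB/16GB split above would really be one shared budget. A sketch of capping it, assuming a hypothetical 24GiB total:

```shell
# Persist a 24 GiB ARC cap (value is in bytes) across reboots:
echo "options zfs zfs_arc_max=$((24 * 1024**3))" > /etc/modprobe.d/zfs.conf
update-initramfs -u   # needed on Proxmox so the setting applies at boot

# Or change it immediately at runtime:
echo $((24 * 1024**3)) > /sys/module/zfs/parameters/zfs_arc_max
```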
 
