[SOLVED] ECC or not to ECC

diverseprox

New Member
Apr 25, 2025
The title is the question: to ECC, or not to ECC? This is about the RAM.

I'm new to Proxmox and while I have set some things up to my liking, I have documented anything that I actually care about and I'm happy to start over sooner rather than later if need be.

It's important to note that some of the computers being backed up, including the Windows VMs, are used for a small business. Some of the data is somewhat valuable (to us) as we are a software development company.

The use cases:
- local Windows VMs that a few users remotely connect to (security is already in place)
- local Docker containers (various workloads), demo servers (non-production), etc.
- backup local Mac computers (utilizing borg -- just files)
- backup local Windows computers
- backup said VMs and containers

I finally received my SAS drives and I'm ready to implement ZFS. But, as I went to verify my setup, I received feedback that I likely should have bought ECC RAM. To add insult to injury, my mobo does not support ECC, so it cannot be enabled in the BIOS.

It will cost about $300 more to go with the other mobo and ECC RAM after returning the current mobo and RAM. Is this the right option? Or will I be fine without the ECC RAM? The $300 isn't the end of the world. It's just a PITA and I was hoping to be done today.

Thanks in advance.

P.S. Not sure if this is really needed, but adding it anyways. My plan for ZFS backups:

[Attached image: 1745782795082.png]
 
In my opinion, from what I have read on ZFS from experts on the subject, the devs, etc., the stance on ECC is not as serious as it is made out to be. Odds are you will be absolutely fine without it, but there is a small chance that not having it leads to a corrupt file somewhere along the way. Depending on how critical your data is and how you have ZFS set up, this could be either a huge problem or nothing at all for you. If it is set up properly with redundancy (a RAID layout), ZFS is also self-healing and stores multiple copies of metadata, so it will usually just fix the problem when it finds a bad file by restoring it from another copy. No harm done. The way ZFS writes data also prevents problems in almost all cases. It is built to be solid as a rock for the most part.

ECC is better, though. In my experience it is slightly more stable than non-ECC and the risk of corruption is lower. The hardware supporting it is usually better as well, which also leads to a better experience and fewer issues in the long run.

Many would stick to the stance that ECC is a must with ZFS and that you shouldn't be running without it. I would be inclined to lean towards that stance in a production environment or business setting, despite it not actually being quite that serious. As I said, you will probably be fine without it; the question comes down to whether you want to risk it.
 
From a pure server operation perspective, having ECC memory could prevent a system crash. Having good backups will help protect against that. Same for HA setups. But when it comes to storing data, whether on your server or a separate NAS, ECC memory becomes more important, in my mind, to make sure you aren't storing corrupted data. Yes, the odds of either event are infinitesimally small, but still, the cost of ECC RAM ($300?) seems to me to be cheap enough insurance. I would definitely go with the ECC RAM.
 
Thanks for your response @zenowl77. Just for clarification, the ZFS self healing doesn't apply to corrupted data from RAM before the write -- isn't that the problem? Are you saying though that if there was a previous write related to a file, that if it is later corrupted in RAM, that it can pick that up and restore from a previous copy? You're doing daily (or every couple of hours) updates and it would just restore to that previous version?

Relating to RAID, I was thinking to just go with a ZFS RAID1 (mirror) for now.
 
Okay, I'm dreading it a little because I have to return the RAM and mobo now lol I guess it's better now than later though.

Since I went with the Ryzen 7 5700X, I am kind of limited on mobo options.. I'm leaning towards going with an ASRock Rack X570D4U, but it's a micro-ATX. Going from an Asus ROG Strix B550-F, which is a full ATX, but it doesn't support ECC..
 
I've suffered data corruption because of drives and memory. The latter also corrupted files (that I edited the most/recently) and backups over quite some time before I noticed the source of the weird instabilities.
Now that I can afford it, I want stuff to work and not worry about these things. I use ZFS because of the checksums to warn me about corruption and that also requires redundancy so I can buy a replacement drive while the system still works and/or ZFS can repair the damage using additional copies.
I also have more RAM than ever before and a silent bit flip therefore becomes more likely. I get ECC for free on consumer Ryzen CPUs (just avoid the motherboard brand that prevents it from working) and decided to buy more expensive (and slower) DIMMs to make sure that I provide ZFS with uncorrupted data.
I still use very regular backups to repair manual "corruption" like accidental deletions and other mistakes. PBS allows for quick off-site replication to mitigate a PC break-down or fire at home.

EDIT: If you want to go Ryzen and do PCI(e) passthrough, go for AM4 and a X570S chipset/motherboard (but not MSI as it does not do ECC) as it has the best IOMMU groups (all other chipsets combine most slots and devices in one group).
 
Thanks @leesteken.

I went with DDR4 to save a bit and bought the Ryzen 7 5700X -- it does support ECC. But the mobo and RAM I have do not support it. I'm thinking of switching to the ASRock Rack X570D4U and Crucial DDR4 ECC UDIMM 32GB 2Rx8 3200. I would be "downgrading" from a full ATX board to a micro-ATX, but switching to a better board overall. The RAM I have now, CORSAIR VENGEANCE LPX DDR4 RAM 64GB (2x32GB) 3200MHz, was half the cost, but it isn't ECC RAM.

From what I'm gathering with folks here, I probably should just bite the bullet now, as opposed to worrying about it later.
 
You're welcome. Self-healing only applies once the data is already written: for example, in RAID 1, ZFS sees that the file on drive 1 is corrupt, so it copies the file from drive 2 to drive 1.

But I believe, from what I have read on ZFS, that it does a kind of verification from file > RAM > ZFS to prevent writing the data wrong to ZFS in the first place, which is also why ECC isn't quite as serious as it is made out to be.

As for restoring a previous version of a file, you would have to read up more on exactly how ZFS handles data. I don't think it does anything like that from what I have read, but I could be wrong.

RAID 1 should be good enough for the time being. As long as you're keeping a copy, the odds of both drives failing aren't too high, especially if they are new.
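The mirror self-healing described above, and the limit asked about earlier in the thread, can be sketched with a toy model. This is not ZFS code: `ToyMirror`, `flip_bit`, and the use of SHA-256 are assumptions made purely for illustration. It shows a checksum catching on-disk corruption and repairing it from the other copy, and then shows that data corrupted in RAM before the write gets a perfectly valid checksum.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum used by the toy pool (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with a single bit flipped, simulating a memory error."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


class ToyMirror:
    """Hypothetical two-disk mirror that keeps one checksum per block."""

    def __init__(self) -> None:
        self.disks = [{}, {}]   # two copies of every block
        self.checksums = {}     # checksum recorded at write time

    def write(self, key: str, data: bytes) -> None:
        # The checksum is computed from whatever is in memory right now.
        self.checksums[key] = checksum(data)
        for disk in self.disks:
            disk[key] = data

    def read(self, key: str) -> bytes:
        expected = self.checksums[key]
        for disk in self.disks:
            data = disk[key]
            if checksum(data) == expected:
                # Found a good copy: overwrite any copy that fails its checksum.
                for other in self.disks:
                    if checksum(other[key]) != expected:
                        other[key] = data
                return data
        raise IOError("all copies failed their checksum")


pool = ToyMirror()

# Case 1: corruption on one disk after the write is detected and repaired.
pool.write("report", b"quarterly numbers")
pool.disks[0]["report"] = b"quarterly numbars"
print(pool.read("report"), pool.disks[0]["report"])

# Case 2: a bit flips in RAM before the write. The pool checksums the
# already-corrupted bytes, so both copies are "valid" and nothing is repaired.
pool.write("report2", flip_bit(b"quarterly numbers", 0))
print(pool.read("report2"))
```

In the second case only an older, uncorrupted copy elsewhere (a snapshot or a backup taken before the flip) would help, which is the gap ECC RAM is meant to close.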
Oof, choosing between those two boards is a hard choice. It sounds like the Asus board will be better in the long run, but ECC should be more stable and secure for your data. I have personally always favored Asus for motherboards, though; they are usually far better than ASRock.
 
I also use the same Crucial memory (4*32GB), currently on my Gigabyte X570S AERO G (chosen because you can choose the boot GPU in the BIOS) with a 5950X. I bought it directly from the Crucial shop, which was the cheapest (search for a 10% discount code). I use(d) older/slower Crucial ECC 4*16 2600 on my ASRock X470 Master SLI (with a 2700X). AM4 is getting old but was getting big discounts when AM5 was released.
 
Yes, any way you look at it, going to ECC is the best choice.
 
Although it's a bit late (since OP already decided how to proceed) for future reference:
Matthew Ahrens (ZFS developer) on ZFS and ECC:
There's nothing special about ZFS that requires/encourages the use of ECC RAM more so than any other filesystem. If you use UFS, EXT, NTFS, btrfs, etc without ECC RAM, you are just as much at risk as if you used ZFS without ECC RAM. Actually, ZFS can mitigate this risk to some degree if you enable the unsupported ZFS_DEBUG_MODIFY flag (zfs_flags=0x10). This will checksum the data while at rest in memory, and verify it before writing to disk, thus reducing the window of vulnerability from a memory error.

I would simply say: if you love your data, use ECC RAM. Additionally, use a filesystem that checksums your data, such as ZFS.
https://arstechnica.com/civis/threa...esystem-on-linux.1235679/page-4#post-26303271
Ahrens wrote this in a thread on a file system article on Ars Technica. The author of that article (Jim Salter) also wrote a longer piece on ZFS and ECC RAM:
https://jrs-s.net/2015/02/03/will-zfs-and-non-ecc-ram-kill-your-data/
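The idea in the Ahrens quote (checksum the data while it is at rest in memory and verify it before writing to disk) can be illustrated with a toy model. This is only a sketch of the concept, not how ZFS implements ZFS_DEBUG_MODIFY; `GuardedBuffer`, `write_to_disk`, and the CRC32 checksum are hypothetical names and choices made for this example.

```python
import zlib


class GuardedBuffer:
    """Toy in-memory buffer that records a checksum when the data is stored."""

    def __init__(self, data: bytes) -> None:
        self.data = bytearray(data)
        # Checksum taken while the data is (assumed to be) still correct.
        self.crc = zlib.crc32(self.data)

    def verify(self) -> bool:
        """True if the buffer still matches the checksum taken at creation."""
        return zlib.crc32(self.data) == self.crc


def write_to_disk(buf: GuardedBuffer, disk: list) -> None:
    """Append the buffer to a pretend disk, but only if it verifies first."""
    if not buf.verify():
        raise RuntimeError("buffer changed while at rest in memory; not writing")
    disk.append(bytes(buf.data))


disk = []

# A healthy buffer is written normally.
write_to_disk(GuardedBuffer(b"source code archive"), disk)

# A buffer hit by a simulated bit flip after its checksum was taken is rejected.
damaged = GuardedBuffer(b"source code archive")
damaged.data[3] ^= 0x04
try:
    write_to_disk(damaged, disk)
except RuntimeError as err:
    print("write refused:", err)

print(len(disk), "block(s) on disk")
```

This only narrows the window: a flip that happens before the checksum is taken still goes unnoticed, which matches the "reducing the window of vulnerability" wording in the quote.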


So in the end it boils down to whether the additional protection of ECC RAM is worth the additional cost for you. For a business-critical application, ECC RAM and its cost are not that bad compared to the potential consequences of lost data. For a home user this might be a different story, though: Heise (the company behind Germany's most highly regarded computer magazines c't, iX, etc.) did a piece on how to build your own NAS PC, and they decided against using ECC RAM since they deemed it unnecessary for home use. On the other hand, fellow forum user @Dunuin has stated in multiple threads here that he always uses ECC RAM if he can, since he lost some important data to bit rot with non-ECC RAM.
So depending on your requirements, the decision can be very different. It has nothing to do with using ZFS or not though.
 
Thank you for sharing that @Johannes S. The $300 difference at the end of the day is worth the peace of mind for the (business) data being stored. I learned a good lesson here for future consideration as well. Hopefully anyone that has the same question can learn a thing or two if they find this thread.
 
The title, is the question. To ECC, or not to ECC? In relation to the RAM.
It's important to note that some of the computers being back up, including the Windows VMs are used for a small business. Some of the data is somewhat valuable (to us) as we are a software development company.
- backup local Mac computers (utilizing borg -- just files)
- backup local Windows computers

Mainly a rhetorical question:
Do those computers (also/already) have ECC RAM?

I find it quite "interesting" that people usually only ever ask whether to use ECC RAM on the target machine (e.g. the server), but (almost) never on the source machine (e.g. the client).
Sure, it is better to have it on one machine than on none at all, but... ;)
 