Please Help: SSDs

eddi1984

Hi there,

A little background: I have been running Proxmox 3.x in an SMB environment for 2+ years with success, on 3.4 since it came out, with two HDDs in a mirror (3ware RAID controller) and the VMs stored on that mirror.
The problem is speed: the VMs sometimes take a VERY long time to do a simple task, and that is because of the HDD bottleneck.

Now I have installed Proxmox 4 and got two SSDs (Samsung 850 Pro 512GB) in a mirror. I am not using the 3ware controller; instead I connected the SSDs directly to the motherboard and used ZFS RAID1 for the mirror. Speed is now MUCH better.

I have one problem: TRIM is not supported on ZFS at this time. I also read that I should use SCSI with VirtIO. That will work for some of my VMs; however, I also have Win2k3 VMs (and no, I cannot upgrade to 2008R2, since the software running on them supports nothing but 2003... I know, terrible). Win2k3 has no TRIM support, so even after changing to SCSI/VirtIO there is still no TRIM.
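From what I read, the VirtIO-SCSI part would look roughly like this on the host (a sketch I have not tested, assuming VM ID 100 and a storage called local-zfs; the disk name is illustrative):

Code:
# use the VirtIO SCSI controller for the VM
qm set 100 -scsihw virtio-scsi-pci
# attach the disk as SCSI with discard enabled, so guest TRIM/discard
# commands are passed down to the storage layer
qm set 100 -scsi0 local-zfs:vm-100-disk-1,discard=on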

I have also been reading about LVM (never used it, no exposure to it, so I am not familiar with it). LVM supports TRIM (with ext4 on top). How does it work, and how can I set it up?

I need to maintain the mirror as a minimum (I was planning to go to RAID10 eventually), and I don't see a way to set up LVM on a mirror when installing Proxmox from scratch. How can I set it up?

I am lost. I LOVE ZFS (and use it on FreeNAS), but I love my money more, and I don't want to buy new drives after a short time.

Problem 2:
Since there is no TRIM, swap partitions are notorious for killing SSDs. Can I disable swap on Proxmox, or is that not recommended?

EDIT:
I have to restore backups to the fresh Proxmox 4.0 installation. All the backups use IDE as their bus, and changing this to SCSI seems to be a problem, especially on Win2k3... Any suggestions there?

Please help.

Thank you!
 
If you have an electrical power backup plan, try disabling sync in ZFS. The ZIL loves to eat SSDs.
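For reference, that is a single property on the pool or dataset holding the VM disks (a sketch; "rpool" stands in for whatever your pool is called):

Code:
# disable synchronous writes; without battery/UPS backing, up to a few
# seconds of acknowledged writes can be lost on power failure
zfs set sync=disabled rpool
# check the current setting
zfs get sync rpool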

If the electrical power is backed up and outages are not a problem, is there no risk in using sync=disabled in ZFS? Can it cause any problems for VMs running Windows NTFS or Linux ext3/ext4 file systems?
 
ZFS is a COW file system and it protects against data corruption. In the past years I have had ~10 electrical power losses, but I never saw any corruption in OpenVZ/KVM (PVE 2.0-3.4) or LXC/KVM (PVE 4.0). Of course, I can't tell you about data loss, because I didn't notice any.

To save the SSDs you can use an HDD for the ZIL.
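Adding a separate log device to an existing pool is a one-liner (a sketch; pool and device names are illustrative):

Code:
# add a mirrored SLOG (dedicated ZIL device) to the pool
zpool add rpool log mirror /dev/sdc /dev/sdd
# verify the new layout
zpool status rpool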
 
ZFS is a COW file system and it protects against data corruption. In the past years I have had ~10 electrical power losses, but I never saw any corruption in OpenVZ/KVM (PVE 2.0-3.4) or LXC/KVM (PVE 4.0). Of course, I can't tell you about data loss, because I didn't notice any.

To save the SSDs you can use an HDD for the ZIL.

So in my case with ZFS I have 2x 2TB drives mirrored as the ZFS tank, a mirrored ZIL log on two SSDs with 50GB of space, and a read cache on one SSD, and my server is protected from power loss with battery backup etc. So is it safe to set zfs set sync=disabled on the mirrored tank of these 2x 2TB drives?

I saw that you have more experience here, so what would you recommend? (Disabling sync raises fsync numbers etc. to a much bigger figure than the standard setting.)

Sorry if I am going off topic, but from what I see we have the same case here.
 
Looking at the ZFS strategy: the ZIL can boost sync writes, but it can also slow things down because of slow devices.

For example, if you are uploading/saving a huge file and the server becomes unreachable or loses power in that period, the ZIL protection is beside the point anyway. But if you are working very actively with small files, you can lose some of them.

I'm running MySQL in LXC now (previously OpenVZ) and have not seen any InnoDB or other table corruption.
 
Hi,

I am not sure I am comfortable disabling ZFS sync in a production environment. I would gladly switch to ext4, but I cannot use a HW RAID card, because they do not pass TRIM. I need some sort of disk redundancy (yes, I am doing regular backups and snapshots). I don't want to run the OS on one SSD and the VMs on a second one without RAID.

Does LVM offer disk redundancy, like a RAID mirror? Is there a way to set up LVM right at install time?
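From what I have read so far, LVM does have a RAID1 volume type; something like this (a sketch I have not tested; the volume group name and size are illustrative, and the VG needs at least two physical volumes):

Code:
# create a mirrored logical volume across two physical volumes
lvcreate --type raid1 -m 1 -L 100G -n vm-data pve
# watch the mirror sync up
lvs -a -o name,copy_percent,devices pve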

Thanks.
 
Ask yourself: what do you want to protect?

An HDD crash? Data corruption? Or both?

RAID controllers / traditional software RAID protect only against HDD loss.

ext2/3/4 and the others: no HDD or data protection.

ZFS: HDD protection (in mirror or RAIDZ1/2/3 mode) and bad-block protection (with the dataset setting copies=2/3)
ZFS: data corruption protection on power loss (COW, Copy-On-Write)
ZFS: protection against silent corruption, via data checksumming and other things
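As a footnote on the copies setting, it is a per-dataset property (a sketch; the dataset name is illustrative):

Code:
# store two copies of every block on this dataset; this guards against
# bad blocks/bit rot, not against losing a whole disk
zfs set copies=2 rpool/data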
 
Well, the short answer is: both.

Long answer:
I want to protect the SSDs (there will be 4 in total, at a cost of over $1000 CAD or $700 USD). So you can see why I don't want to buy new SSDs in 12 months. I am hoping they will last for at least 5 years.
But data protection IS more important than the drives, since the business can shut down if they melt down and the backups are lost (I am trying hard to avoid any of these scenarios). That is why I selected ZFS: I love it, and it is the best out there by far.

Has anybody here used SSDs with ZFS in a Proxmox production environment for a while already? What is the life expectancy for my setup? (I just installed 4 SSDs, then installed Proxmox and selected RAID10; otherwise I did not change anything.)
How can I optimise the setup? Any suggestions about best practices?

I was reading about the SCSI/VirtIO setup; the problem is that Win2k3 would need the Samsung Magician software, and Magician needs to see the drive as directly attached. So there is no way for me to get TRIM to the SSDs.

I hope I can get some good pointers, suggestions, or experience reports from other users who have used, and are using, SSDs in production.

Thanks.
 
I will point you to one thing: ZFS does not support TRIM, so don't waste your time on it. By the way, TRIM is only a deferred cleanup of no-longer-needed blocks. If you want to save write cycles, look at SSDs with smart controllers.
 
• The Samsung 850 Pros are rated for 300 TBW (terabytes written).
The question is whether you will get anywhere near that (spread over 4 years, that means 75 TB written per year, or around 200 GB written per day); in RAID10 you can probably double it.
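If you want to know how much you actually write, the drive keeps count (a sketch; on Samsung drives the counter is the Total_LBAs_Written attribute, and the 512-byte unit is an assumption worth verifying for your model):

Code:
# raw value of SMART attribute 241 = logical blocks written so far
smartctl -A /dev/sda | grep -i total_lbas_written
# TB written = LBAs * 512 / 10^12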

• Do you know whether your Proxmox node even utilises its swap space? --> http://www.cyberciti.biz/faq/linux-check-swap-usage-command/

We run our production VMs with minimal swap (32-128 MB) and give them enough RAM instead. That saves reads/writes on the storage backend.
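Checking and tuning that on a node is quick (a sketch of common commands; the swappiness value is just an example):

Code:
# how much swap is actually in use?
free -h
swapon --summary
# tell the kernel to prefer RAM over swap (default is 60)
sysctl vm.swappiness=10
# persist across reboots
echo 'vm.swappiness = 10' >> /etc/sysctl.conf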

• With the ZFS option I cannot help; I am not using ZFS (besides a ZFS mirror on 2x SSDs for the Proxmox OS).

• We do not really care about TRIM for extending SSD lifetime if it is labour-intensive to achieve. But if TRIM really is such an issue, you could try to fix it on the VM side rather than the FS side: software like e.g. FancyCache, or passing a physical HDD through to that VM to use as a swap drive. Just spitballing.
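Passing a physical disk through to a VM works via its stable device path (a sketch; the VM ID and the disk ID are placeholders):

Code:
# attach a whole physical disk to VM 100 as a secondary VirtIO drive
qm set 100 -virtio1 /dev/disk/by-id/ata-EXAMPLE_DISK_SERIAL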



• But you asked about SSDs in production.
We are not a small SMB, and we do not use ZFS at scale, but we do run around 70 Proxmox nodes with Ceph and around 500 SSDs used as the Ceph cache tier. For us it is simple: we plan for multiple redundancies and make sure we can handle 20+ SSDs failing at the same time. We use the cheapest SSDs we can get (performance/€), then just buy spares, monitor for S.M.A.R.T. failures, replace a drive when it has failed or is about to fail, and then just RMA it (if applicable). We don't bother dishing out 200€ (or even 800€ for DC SSDs) when we get the same performance from 75€ SSDs.
I've talked about that here:
https://forum.proxmox.com/threads/24302-Urgent-Proxmox-Ceph-Support-Needed?p=122792#post122792

In total we have around 3k+ SSDs in our IT environment. For roughly 4 years we have been tracking failure rates. They average around 2% after one year and about 8% after two years; that is mostly hardware failures, not TBW-exceeded issues. After that, the failure rates skyrocket (TBW issues). That is why after 2 years we start phasing them out of critical services and use them as replacements for non-mission-critical stuff. I can say the difference between vendors is marginal at best, and mostly comes down to whether S.M.A.R.T. lets you tell that a drive is about to fail.



The point I am trying to make is as follows:
If your infrastructure/data is THAT critical for business continuity, then plan for multiple failures. (RAID10 allows one drive to fail and still maintains a copy; I'd have sleepless nights, every night, with that.)
Rather than spending 1000 CAD on 4 expensive SSDs, I'd use 6-8 cheaper SSDs and stick the rest of the money into spares and a proper backup strategy. Because RAID != backup.
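On the Proxmox side, scheduled backups are built in; from the shell it is a one-liner per VM (a sketch; the VM ID and the storage name are illustrative):

Code:
# snapshot-mode backup of VM 100 to a dedicated backup storage
vzdump 100 --mode snapshot --storage backup-nas --compress lzo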
 
Thanks for your input.

I cannot find anything saying that ZFS on Linux supports TRIM. On FreeBSD it does, but the documentation doesn't show that ZoL does (as read in this forum).
Do you have a recommendation for SSDs with smart controllers? (I take it the Samsung 850 Pro does not have one.)

Thanks.
 
• The Samsung 850 Pros are rated for 300 TBW (terabytes written).
The question is whether you will get anywhere near that (spread over 4 years, that means 75 TB written per year, or around 200 GB written per day); in RAID10 you can probably double it.


The point I am trying to make is as follows:
If your infrastructure/data is THAT critical for business continuity, then plan for multiple failures. (RAID10 allows one drive to fail and still maintains a copy; I'd have sleepless nights, every night, with that.)
Rather than spending 1000 CAD on 4 expensive SSDs, I'd use 6-8 cheaper SSDs and stick the rest of the money into spares and a proper backup strategy. Because RAID != backup.

I never get close to 300 TBW. Ever. Not in 10 years. That is to say, I have a small data volume; the data is critical, but there is not a lot of it. I am thinking that I should rethink the RAID setup and go with ZFS RAIDZ2.

I would love to use more than 4 SSDs for redundancy; however, the server is 1U and only has slots for 4 drives.

The failure rate for SSDs 2 years and older is quite high; spinning disks could last longer than that. This is actually quite surprising to me, but on the other hand, you have a high data volume, and that may explain it.

Thanks for your input; I appreciate you sharing your experience.
 
[...]

The failure rate for SSDs 2 years and older is quite high; spinning disks could last longer than that. This is actually quite surprising to me, but on the other hand, you have a high data volume, and that may explain it. [...]

It is definitely TBW-related after 2 years. Most of them do not get RMA'ed, because they have either passed their warranty period or are outside the scope of the warranty (TBW exceeded). I doubt manufacturers actually check why the SSD failed for every unit sent in for RMA; they probably only do spot checks once in a while.

Regarding spinners: yes, they last a lot longer without failures. I personally have 10+ year old SATA 1 drives that have been running 24/7 without issues. At work, though, they normally do not last past 3-4 years, because the yearly power consumption per TB, plus the remaining resale value (wiped drives on eBay), plus the increased failure risk, makes us buy new drives.

Regarding the 1U case: there is the option of external HBAs + external storage cages... probably not worth it. If not already done, I'd seriously consider a proper backup strategy (e.g. a NAS in a different room/building + cold storage [spinner (HDD) / DVD / Blu-ray] in a fireproof safe).
 
Regarding spinners: yes, they last a lot longer without failures. I personally have 10+ year old SATA 1 drives that have been running 24/7 without issues. At work, though, they normally do not last past 3-4 years, because the yearly power consumption per TB, plus the remaining resale value (wiped drives on eBay), plus the increased failure risk, makes us buy new drives.

I have been thinking of returning the SSDs and getting 300GB 15K HDDs. All the SSD-related issues would be solved. However, I wanted to use SSDs in the first place because of the speed gain, especially the IO.


Regarding the 1U case: there is the option of external HBAs + external storage cages... probably not worth it. If not already done, I'd seriously consider a proper backup strategy (e.g. a NAS in a different room/building + cold storage [spinner (HDD) / DVD / Blu-ray] in a fireproof safe).

This actually gave me an idea of how to add more SSDs to the setup. I need to upgrade the NAS server case anyway. Why not get a bigger case (24-bay) and use 4 of the bays for Proxmox, giving me a total of 8 drives? I could utilise 2 spinning HDDs for the OS, ISOs, and maybe backups, and 6 cheap SSDs in RAIDZ2 on ZFS.
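For reference, a 6-disk RAIDZ2 keeps working with any two drives failed, and it is created in one command (a sketch; pool and device names are illustrative, and ashift=12 assumes 4K-sector drives):

Code:
# 6 disks in RAIDZ2: usable capacity of 4, any 2 may fail
zpool create -o ashift=12 tank raidz2 \
    /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
zpool status tank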

Do you care to share the brand(s) of the cheap SSDs you are using in your enterprise environment?
 
The majority of what we currently use:
• ADATA Premier SP550

We are slowly phasing them out, as availability has become an issue for us (we normally buy 100-200 at a time) and the newer candidates offer higher TBW per unit of power consumption.
We are currently evaluating:
• Kingston HyperX Savage SSD 240GB
• SanDisk Ultra II 240GB

Right now we get both of them for 60€ (85 CAD) per piece.

There might be different options in your region. We normally look at price search engines (filter: SATA 3 + min. 75k random write IOPS + release date no older than 2 years + min. 240GB; then sort by price, then filter by lowest power consumption, read reviews, and keep raising the power consumption limit until we land at the desired performance). ATTO benchmarks tell us a lot for our use cases. We then buy a couple and verify our results before buying batches of 100 at a time.
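ATTO is Windows-side; on Linux, fio gives a comparable random-write picture (a sketch; the job parameters are illustrative, and it writes a scratch file in the current directory):

Code:
# 4k random writes, 60s, queue depth 32 - roughly what we screen for
fio --name=randwrite --ioengine=libaio --rw=randwrite --bs=4k \
    --size=1G --runtime=60 --time_based --iodepth=32 --direct=1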
Side note: we have a person who does nothing but constantly evaluate hardware (including SSDs / SSHDs / HDDs), so we can verify there are no issues and can deploy at scale.


We are also evaluating:
M.2-to-PCIe-x4 adapters + PCIe-x16-to-4x-PCIe-x4 expander cables
+ Samsung SSD 950 Pro 256GB
+ Samsung SSD SM951-NVMe 256GB
+ Samsung SSD 950 Pro 512GB

as a low-cost performance booster for our Ceph cache, but that is a different story.

PS: I have written about that here https://forum.proxmox.com/threads/24302-Urgent-Proxmox-Ceph-Support-Needed?p=122827#post122827 and here http://forum.proxmox.com/threads/24388-NAS-Solution-for-backup?p=122791#post122791
 