Hey guys. I posted this on the Proxmox subreddit and someone mentioned I should post here as well. I set up my Proxmox server a few months ago, around December of 2022, and I've started running into issues where I can't access the GUI, or where my VMs, containers, and host all become unreachable. I looked into the logs and found I/O errors, which led me to my 1TB Samsung 860 EVO 2.5" SSD. It shows 99% wear, which makes sense for the types of errors I was receiving, but I'm still unsure why it would be that high.

I compared it against 3 other 1TB SSDs from my main PC: they each have about 34k power-on hours and 5.5TB, 7.8TB, and almost 16TB written. I checked the SMART data on this Proxmox drive and it has almost 1.95PB of data written to it according to the "241 Total_LBAs_Written" attribute. I'm unsure how this could have happened, as it was only running for about 3 months. Any thoughts? Here's what we've looked at so far through Reddit:
- Not an enterprise drive - although it's only supported on enterprise drives, I do still feel like there's something else happening, as 1.95PB of data written is insane.
- I was running the pve-ha-lrm & pve-ha-crm services even though I don't use HA or clustering - I disabled these, as others said they had high disk wear while using them, but the damage is already done.
- I am not running ZFS as far as I'm aware - running zpool iostat gives me a "no pools available" message. I believe the main volume for my VMs was LVM.
- I am not running a RAID controller - Just using a single SSD.
- I believe I set up Proxmox with the "ext4" filesystem when installing.
- I set up Proxmox with version 7.3-3 but have since upgraded to 7.3-6.
- I have not set up any sort of metrics or monitoring system, but I will, to see if it can hopefully show anything through historical data.
- I only use 6-7 VMs along with 2-3 containers. The VMs are mostly Ubuntu Linux VMs, with one Kali Linux VM and one Windows 11 VM. They all write to the same drive, as I was only using 1x 1TB SSD (bad practice, I know, but this was mostly homelab-type stuff), but anything that wrote data I needed to keep (mostly the Linux VMs) was sent to my Synology NAS.
- I am running Docker in one of my Linux servers, and it is running Portainer, Prowlarr, Overseerr, Radarr, Sonarr, qBittorrent, Plex, Nginx Reverse Proxy, Bazarr, HomeBridge, and a few other containers for testing.
- While all my VMs and containers were running, I did not notice anything hitting swap, but that doesn't mean it wasn't.
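
To put the 1.95PB figure in perspective, here is a rough sanity check of what it implies as a sustained write rate. This is only a sketch under stated assumptions: that attribute 241 counts 512-byte LBAs on this drive (common for Samsung SATA SSDs, but not verified here) and that the server was up for roughly 90 days. The function names are made up for illustration.

```python
# Assumption: SMART attribute 241 (Total_LBAs_Written) counts 512-byte LBAs.
# Check the drive's documentation before trusting this for a specific model.
SECTOR_BYTES = 512


def lbas_to_bytes(total_lbas_written, sector_bytes=SECTOR_BYTES):
    """Convert a raw Total_LBAs_Written value into bytes written."""
    return total_lbas_written * sector_bytes


def average_write_rate_mb_s(total_bytes, days):
    """Average write rate in MB/s (decimal megabytes) over a number of days."""
    seconds = days * 24 * 60 * 60
    return total_bytes / seconds / 1e6


if __name__ == "__main__":
    total_bytes = 1.95e15   # ~1.95PB, the figure from the SMART data above
    uptime_days = 90        # assumption: about 3 months of continuous uptime
    rate = average_write_rate_mb_s(total_bytes, uptime_days)
    print(f"Implied average write rate: {rate:.1f} MB/s")
```

Under those assumptions the drive would have been writing at roughly 250 MB/s around the clock for the whole three months, which is a large fraction of what a SATA SSD can sustain. That is part of why I suspect either a runaway writer or a counter that isn't measuring what I think it is.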