Proxmox VE unbearably slow

mind.overflow

New Member
May 6, 2022
Hello everyone!
I'm running Proxmox VE on a server that has the following specs: Xeon E3 1240v2, 16GB RAM, 2x Crucial SSDs in RAID 1.
This server is definitely not the sharpest, but the specs are more than enough for my basic home server needs. I'm running just a few VMs with services like Nextcloud, Gitea, Sonatype Nexus... services that, for a long time, I've run on an underpowered T7800 with 6GB of RAM.

However, Proxmox has become so slow that pretty much any action I take freezes the whole system for anywhere from a few seconds to several minutes. Example: I pull a Docker container on a VM. The download starts, then it gets to "extracting" and sticks at 100% for 15 minutes, while everything else goes down and starts timing out (I have an uptime tracker). Another example: I start streaming something on Jellyfin. It loads instantly but plays for only 30 seconds, then the server freezes; I watch 30 seconds of the movie, and the stream stalls until the server comes back to life.

I have noticed IO delay climbing steeply in Proxmox's web UI, sometimes to the point where even Proxmox itself stops recording it. However:
- I did not allocate more RAM or CPU cores to the VMs than I actually have,
- I left 2GB for the system itself,
- I am using SSDs that, while not server-grade, I tested extensively with full-disk writes before putting them in the server, and their performance was more than good enough.

[Attachment: io-delay.png - IO delay graph from the Proxmox web UI]

Moreover, the disks still have more than enough free space. After suspecting that RAID could be the issue, I moved Nextcloud and Jellyfin to a separate SSD that I mounted by itself, without any kind of mirroring, and the problem didn't change. In fact, I find it very strange that this would be a disk issue, because if the external SSD were too slow, it should not affect VMs that are running on the internal RAID1 SSDs. The fact that this issue affects the whole system no matter how much I try to separate the resources really bothers me, because then why am I even virtualizing? Especially since some services running on Docker get automatically killed after being unresponsive for 60s, and I often find myself waking up to a half-dead server because my phone uploaded some photos to Nextcloud overnight.

I am at a loss. I replaced one of the two SSDs, I tried moving VMs to external SSDs, I tried changing the resource allocation, and I even tried changing the disk cache profiles from the default (no cache) to write back, but nothing changed. The fact that I used to run the same services on a much less powerful 2-core CPU really makes me wonder why this is happening, especially since, when I first installed Proxmox on this server, everything was butter smooth for a few weeks/months. It's just been getting steadily worse with time.
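
(For reference, the cache mode can also be changed per disk from the host shell; the VM ID, bus/slot, and volume name below are placeholders, not my real ones:

qm set 100 --scsi0 local-zfs:vm-100-disk-0,cache=writeback

The VM has to be restarted for the new cache mode to take effect.)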

I even tried running a game server, and it brought the whole Proxmox machine down, host included.

What could the issue be? Does anyone have any clue? If there are things that I can run and test I'll be more than happy to try.
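
If concrete numbers help, here is a minimal set of commands I can run while reproducing a stall (iostat is from the sysstat package, iotop from the iotop package):

iostat -x 2    # extended per-device stats every 2s; high w_await/%util points at a storage bottleneck
iotop -o       # only the processes currently doing IO, to see who generates the load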

Thank you all very much!
 
Maybe your setup has something in common with the threads in this post?
Looks like it, thank you! Unfortunately, I can't easily verify it, because this is a headless server running ZFS RAID1 on the boot volumes with systemd-boot, which means I don't have an easy way to revert to the 5.11 kernel. I tried uninstalling 5.13 and 5.15, but of course that also prompts to uninstall proxmox-ve, and I don't think that would be very nice. I was on 5.13 and tried 5.15 with no difference.

UPDATE: I manually mounted the boot partitions and edited the loader.conf file to only boot 5.11. Proxmox VE is now on 5.11; however, things don't seem noticeably better. I'm still struggling to pull simple Docker images and stream videos, and IO delay still goes through the roof. Even just opening files with nano or cat sometimes takes 30 seconds.
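
For anyone finding this later, the pin itself is only a few lines; the ESP UUID and kernel version below are placeholders, not my actual values:

mount /dev/disk/by-uuid/XXXX-XXXX /mnt    # one of the ESPs kept in sync by proxmox-boot-tool
nano /mnt/loader/loader.conf              # set: default proxmox-5.11*
umount /mnt

Newer proxmox-boot-tool releases also have a "kernel pin" subcommand (proxmox-boot-tool kernel pin <version>) that should achieve the same thing.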

[Attachment: io-delay-2.png - IO delay graph after the kernel downgrade]
 
Hi,
Have you checked your BIOS version? This reminds me of problems caused by a too-old BIOS version on the motherboard.

Cordially,
 
Hello,
thanks for the suggestion. I have a Supermicro X9SCL-F and I flashed the latest BIOS version a few months ago. I already had issues with the previous BIOS, as it would crash with dates older than 2021 (Y2K v2??), so it should be good.
 
I'm running just a few VMs with services like Nextcloud, Gitea, Sonatype Nexus...
Hi,

Could you define how many VMs "a few" is for you?

Could you also list all the services/applications you have on each VM?

Also, as a note: your server's RAM is very low even for a single VM, and that is before ZFS. ZFS will by default use up to 50% of the RAM for its ARC, and Proxmox itself needs around 1 GB, so only approximately 7 GB remain for all of your VMs.
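
If it does turn out to be RAM pressure, the ARC can be capped; a minimal sketch, with 3 GiB as an example value:

echo "options zfs zfs_arc_max=3221225472" > /etc/modprobe.d/zfs.conf    # 3 * 1024^3 bytes
update-initramfs -u -k all    # needed on ZFS-on-root so the limit applies at boot
reboot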

Good luck / Bafta
 
Of course!

I have 3 VMs:
- Development VM (Gitea, Drone CI, Sonatype Nexus) - 4GB
- Services VM (Nginx -> Ghost CMS, 2 Discord bots, Vaultwarden, Keycloak, Outline Wiki) - 7GB
- Media VM (Jellyfin, Nextcloud) - 2GB

And 2 LXCs for a VPN and unbound (128MB each).

I read on the Proxmox Wiki that, for ZFS, you should leave 2GB plus 1GB for each TB of data - but since I'm only running ZFS on two 256GB SSDs, I thought that 3GB for the host was more than enough. I'm keeping my media and development builds on other ext4-formatted SSDs, attached via SATA and not part of any ZFS pool, since I can afford to lose that data (and I'm backing it up remotely anyway).

I know I don't have a particularly powerful server, but this is all I can afford as a college student, and if I thought my resources plainly weren't enough, I wouldn't have considered posting this. However, I was running the same things on far worse hardware before, and moreover, I used LVM for a few months without any kind of issue - things started slowly getting worse only when I moved to ZFS.
 
I read on the Proxmox Wiki that, for ZFS, you should leave 2GB plus 1GB for each TB of data

Hi,

This was true for ZFS versions < 2.x, so you are OK.

Yes, LVM is faster compared with ZFS, but you do not get the same benefits that ZFS provides.

It is possible to optimise some things in ZFS, if you have the time to do it.

If you are willing to do it, start with posting your output for:

arc_summary
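
A quick way to compare the current ARC size against its ceiling, without the full report:

awk '/^size|^c_max/ { printf "%s: %.1f GiB\n", $1, $3/2^30 }' /proc/spl/kstat/zfs/arcstats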

How did you make your dark theme in Proxmox?

Good luck / Bafta !
 
things started slowly getting worse only when I moved to ZFS.

Doesn't surprise me. Your main problems/bottlenecks (besides the over-10-year-old hardware platform) in combination with ZFS are the consumer-grade Crucial SSDs, fully loaded with:
- Development VM (Gitea, Drone CI, Sonatype Nexus) - 4GB
- Services VM (Nginx -> Ghost CMS, 2 Discord bots, Vaultwarden, Keycloak, Outline Wiki) - 7GB
- Media VM (Jellyfin, Nextcloud) - 2GB
- 2 LXCs for a VPN and unbound (128MB each)

ZFS produces a lot of small sync writes and extra write amplification, which consumer SSDs without power-loss protection handle very poorly.
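
A rough way to see this on your own hardware is a small sync-write test with fio, run inside a dataset on the pool (the directory below is an example path); consumer SSDs without power-loss protection often drop to a few MB/s here:

fio --name=synctest --rw=write --bs=4k --size=1G --ioengine=psync --fdatasync=1 --directory=/rpool/data --group_reporting    # fsyncs after every 4k write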
 
