After they replaced the boards with the ASRockRack E3C252D4U-2T, everything has been solid. Those Gigabyte boards are otherwise great, so this is a real shame. The ASRockRack boards boot faster and have better, more stable IPMI. I'm thinking the Gigabyte boards have some sort of hardware fault, though. I'm not sure what the heck else it could...
The hosting provider is swapping the mobos on the two faulty servers to the ASRockRack E3C252D4U-2T. They've already done one, and everything is now working as expected. All three servers will have the same mobos.
While I am waiting for them to swap the third, I pulled it from the cluster and rebuilt it...
Have you tried running Memtest86+ for a few rounds or preferably overnight?
I would also run something like sysbench --threads="$(nproc)" --time=0 cpu run and watch CPU temps while it runs (say with ipmitool sensor). Both tools are an apt update && apt install sysbench ipmitool away.
If that succeeds you could also try running...
Just ran Memtest86+ and both servers pass.
I keep thinking this has to be a hardware issue but it ONLY happens when I try to restore a Proxmox backup. I have the hosting provider looking into it now.
I recently ordered three new servers for a staging cluster (hosting provider with dedicated hardware). They provided me with the following:
01 has an ASRockRack model E3C252D4U-2T mobo.
02 and 03 have GIGABYTE model MX33-BSA-V1 mobos.
All three have:
Intel Xeon-E 2386G - 6c/12t - 3.5 GHz/4.7...
Are you getting any errors in the logs? Either on the PVE server or the backup server? You'll want to post those here.
Hopefully you have put your SAS drives back to RAID 1 (with a spare perhaps?). Also, I would highly recommend adding another SSD and using RAID 1 for Proxmox, especially if you...
I opened a ticket and in the days between opening the ticket and getting a response, this issue appears to have corrected itself. I haven't made any changes so I'm not sure what happened there.
Doesn't look like that fixed this. I'm still seeing "Too many open files" errors. Also seeing:
Mar 29 18:42:41 daemon.err vmh03 pveproxy[15181]: got inotify poll request in wrong process - disabling inotify
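For anyone else chasing the same errors: before blaming hardware, it may be worth checking both the shell's file-descriptor limit and the kernel's inotify limits, since exhausting either produces "too many open files" style failures. A quick sketch (standard Linux paths, nothing Proxmox-specific):

```shell
# Per-process open-file limit for the current shell
ulimit -n

# Kernel-wide inotify limits; running out of instances or watches
# also surfaces as "too many open files" errors
cat /proc/sys/fs/inotify/max_user_instances
cat /proc/sys/fs/inotify/max_user_watches
```

If those are suspiciously low, they can be raised via /etc/security/limits.conf and sysctl, respectively.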
Fresh Proxmox VE 6.1 install on a three node cluster. Started seeing this after building several VMs (no containers):
Mar 29 16:03:00 daemon.info vmh03 systemd[1]: Starting Proxmox VE replication runner...
Mar 29 16:03:00 daemon.info vmh03 systemd[1]: pvesr.service: Succeeded.
Mar 29 16:03:00...
Here is the full list of commands to update zfs_arc_max if you're using EFI boot for anyone Googling this in the future:
# cat /etc/modprobe.d/zfs.conf
# 8GB
options zfs zfs_arc_max=8589934592
# update-initramfs -u
update-initramfs: Generating /boot/initrd.img-5.0.15-1-pve
# pve-efiboot-tool...
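For reference, the 8589934592 in the options line is just 8 GiB expressed in bytes, so you can sanity-check your own value (or compute a different ARC size) with shell arithmetic:

```shell
# 8 GiB in bytes: 8 * 1024 * 1024 * 1024
echo $((8 * 1024 * 1024 * 1024))
```

That prints 8589934592, matching the zfs_arc_max value above.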
OK sounds good, that's really all I wanted to know. My main concern was that running an NFS server on the host would invalidate support for everything else.
Yes, other clients. I wouldn't be running VMs off this pool or over NFS at all.
That wasn't me but there are tons of threads just like...
Just for future reference: running an NFS server on the Proxmox host isn't supported, and running FreeNAS / FreeBSD on ZFS in a VM isn't recommended or supported either (at least not without some serious caveats), so it looks like we'll be sticking with FreeBSD installed directly on the system.