I am also having issues with the Xeon Scalable Gen 1. I don't have any good traces at this time, but when mine dies, the server doesn't freeze. It seems the NICs (Intel 500 series) lose carrier, and then my whole dashboard just shows blank machines. Back on the 6.5 kernel, and the...
Thanks; I'm hoping it's just out of branch drivers. These boxes have run for 2 years; I've messed with them quite a bit, and they need a refresh. I'm adding a cluster node and converting from ZFS to Ceph, so it's a full overhaul. I don't use the onboard NICs; rather, I have Intel 500 series in all...
Same here, running a 2-node cluster on Supermicro. Both boxes have Intel Gen 1 Scalable processors. I may just hold off, as I'm about to rebuild the cluster regardless. However, I can't presently run on the 6.8 branch for more than a few hours. I have yet to isolate the errors, but quite a few...
If there is no remediation, it would be great if this feature could be considered for the roadmap, as it seems like it would be quite useful. For now, I've had to remove the less critical clients from HA, but this isn't a perfect solution since that means they will fail to come back up on the...
Hello everyone,
I'm encountering an issue with two of our nodes that both use the same Host Bus Adapter (HBA). Recently, they have begun to consistently log error messages, which are cluttering the journal. Here are examples of the repeated error messages:
```
Apr 04 02:28:05 pve kernel...
```
Hey everyone,
I'm in the midst of configuring HA (High Availability) and aiming to get it just right for my needs. I've got a couple of key guests that need to fail over automatically in case of issues, while there are others that aren't as crucial, which I'd prefer to shut down...
I think this is supposed to work now. Interestingly enough, I still get some complaints about unsupported configs if I use Overlay2. It doesn't SEEM to cause issues; it just makes me curious as to why it's still throwing errors.
Hi All,
I've noticed that every time I upgrade my kernel to the latest prod PVE version, the IDs on my NVIDIA cards change. Every time, I have to run `ls /dev/nvidia*` and `ls /dev/dri*` and then manually alter the configuration file of each LXC that uses the video cards to match the changes. Is there a...
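For anyone hitting the same thing, here is a minimal sketch of how the manual `ls` step could be scripted. It assumes the container configs use `lxc.cgroup2.devices.allow` rules keyed on the device major number, and that the DRI nodes live under `/dev/dri/`; the function names `char_device_numbers` and `allow_lines` are my own, not part of any PVE tooling. It only prints suggested lines and does not edit any config file.

```python
import glob
import os
import stat


def char_device_numbers(paths):
    """Return {path: (major, minor)} for every path that is a character device."""
    found = {}
    for path in paths:
        try:
            st = os.stat(path)
        except OSError:
            continue  # node vanished or is unreadable; skip it
        if stat.S_ISCHR(st.st_mode):
            found[path] = (os.major(st.st_rdev), os.minor(st.st_rdev))
    return found


def allow_lines(devices):
    """Build one cgroup2 allow rule per distinct major number (assumed config style)."""
    majors = sorted({major for major, _minor in devices.values()})
    return [f"lxc.cgroup2.devices.allow: c {major}:* rwm" for major in majors]


if __name__ == "__main__":
    # Same device nodes the manual `ls` check looks at.
    paths = sorted(glob.glob("/dev/nvidia*") + glob.glob("/dev/dri/*"))
    devices = char_device_numbers(paths)
    for path, (major, minor) in devices.items():
        print(f"# {path} -> {major}:{minor}")
    for line in allow_lines(devices):
        print(line)
```

Running it on the host after a kernel upgrade and comparing the printed lines with the existing container config should show which major numbers moved.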
Hi all,
I'm working on my first setup at Hetzner and I can't find any examples of what I am attempting to do. I'm hoping for some insight; an interfaces config or a step-by-step guide, if available, would both be very helpful! While there are plenty of instructions with regard to hosting OPNsense...
Hi All,
Coming from 20 years as a professional in Windows (with 2 years of hands-on Linux), I'm having trouble fully grasping cgroups. I recently upgraded my server cluster and I am encountering OOM kills regularly (a few times per day). I have a single cgroup (just the out-of-the-box config). My...
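As a starting point for seeing which cgroup the kills land in, here is a rough sketch that reads the `oom_kill` counter from each `memory.events` file. It assumes cgroup v2 (the unified hierarchy) mounted at `/sys/fs/cgroup`; the function names are just for illustration.

```python
from pathlib import Path


def parse_memory_events(text):
    """Parse the 'key value' lines of a cgroup v2 memory.events file."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def oom_kills(cgroup_root="/sys/fs/cgroup"):
    """Return {cgroup directory: oom_kill count} for cgroups with at least one kill."""
    report = {}
    for events_file in Path(cgroup_root).rglob("memory.events"):
        try:
            counters = parse_memory_events(events_file.read_text())
        except OSError:
            continue  # cgroup was removed while scanning
        if counters.get("oom_kill", 0) > 0:
            report[str(events_file.parent)] = counters["oom_kill"]
    return report


if __name__ == "__main__":
    for cgroup, count in sorted(oom_kills().items()):
        print(f"{count:6d}  {cgroup}")
```

The counters are cumulative since the cgroup was created, so running it twice a few hours apart and comparing the numbers shows where new kills happened.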
Thanks for that, so it's broken in Proxmox. Do the devs happen to have a fix planned? Seems like a really large issue that HA/failover is completely broken for anyone who's running a blocksize larger than 128K. It's good practice to match your workload. Looking forward to the fix! :)
All...