Flow:
1. The servers rebooted due to power maintenance,
2. (after the reboot) I noticed one server had bad clock sync; fixing the issue and another reboot solved it,
3. after the time sync was fixed, the cluster started to load and rebalance,
4. it hung in an error state (data looks OK and everything is stable and...
We have a setup of around 30 servers, 4 of them with Ceph storage.
Unfortunately we have many power outages in our building and the backup battery does not last for long periods, causing the entire cluster to crash (servers, switches, storage).
Most of the time the entire cluster comes back up when the...
I am trying to reboot an LXC host as part of an Ansible script, however I cannot make it work using Ansible.
The default reboot module did not work: https://docs.ansible.com/ansible/latest/collections/ansible/builtin/reboot_module.html
It printed "Socket exception: Connection reset by peer (104)" and...
Unfortunately I rolled back to kernel 5.15 on all hosts with VMs (after the rollback there were no issues at all).
We use the servers in production, so I cannot risk another downtime.
Only VMs are affected, not LXC containers.
Setting mitigations off did not solve the issue, it just reduced the occurrence due to a more efficient kernel,
I have around 10 Ubuntu VMs on which the error occurs repeatedly under load (while running all nodes at 70% CPU capacity); in less than 1 hour I had the error on at least one of the nodes...
Setting mitigations off reduced the number of errors (still, the bigger the load, the more errors),
but going back to kernel 5.15 removed the issue entirely.
I want to try it on a new node (freshly installed, not upgraded from 7.4).
The node doesn't have kernel 5.15.116-1-pve installed.
Is this the flow:
proxmox-boot-tool kernel add 5.15.116-1-pve
proxmox-boot-tool pin 5.15.116-1-pve
proxmox-boot-tool refresh
After the reboot, does the kernel...
I have stability issues on nodes with high CPU load and I would like to move back to the kernel that was on 7.4.
What is the best approach?
I am on PVE 8.0.4, kernel 6.2.16-14.
Sure, here:
Both host and VM have mitigations=off in GRUB (the error was more frequent before setting this configuration).
My VM is under high CPU load when the error occurs.
proxmox host:
version:
proxmox-ve: 8.0.2 (running kernel: 6.2.16-14-pve)
pve-manager: 8.0.4 (running version...
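
Since the post above relies on mitigations=off being set on both host and VM, it may help to confirm that the running kernel actually booted with it. The following is only a minimal sketch: the function name cmdline_has_mitigations_off is hypothetical (not part of any Proxmox tooling), and it assumes a Linux system that exposes /proc/cmdline and the usual sysfs vulnerabilities directory.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the given kernel command line
# contains the exact token "mitigations=off".
cmdline_has_mitigations_off() {
    case " $1 " in
        *" mitigations=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the kernel that is running right now (run on the host and in the VM).
if [ -r /proc/cmdline ]; then
    if cmdline_has_mitigations_off "$(cat /proc/cmdline)"; then
        echo "mitigations=off is active on kernel $(uname -r)"
    else
        echo "mitigations=off is NOT on the command line of kernel $(uname -r)"
    fi
fi

# Per-vulnerability status as reported by the kernel, if available.
grep -r . /sys/devices/system/cpu/vulnerabilities/ 2>/dev/null || true
```

If the setting was added to the GRUB configuration but the boot entries were never regenerated, the first check would report it as missing even though the config file contains it.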
I was investigating another issue and left an SSH connection open to one of the VMs.
I have the same error:
Message from syslogd@kube-node-11 at Sep 26 15:13:24 ...
kernel:[426489.912429] watchdog: BUG: soft lockup - CPU#31 stuck for 22s! [Engine_Simulato:1069183]
Message from...
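
To compare how often the soft lockup shows up under different kernels or settings, one option is to count the reports in the kernel log. This is a rough sketch only: count_soft_lockups is a hypothetical helper name, and it assumes the message text matches the "watchdog: BUG: soft lockup" line quoted above.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints the number
# of lines that report a soft lockup. grep -c exits non-zero when the count
# is 0, so "|| true" keeps the function from failing in that case.
count_soft_lockups() {
    grep -c 'watchdog: BUG: soft lockup' || true
}

# Example against the current boot's kernel log, if journalctl is present.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k --no-pager 2>/dev/null | count_soft_lockups
fi
```

Running the same count inside each VM after a fixed period of load gives a simple number to compare between kernel 6.2 and 5.15 hosts.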
Hey, so we've been using Proxmox in our small business since it was at version 3, and now we're on version 8. I don't know a ton about ESXi, but I've seen folks from other places slowly moving away from it and some hopping onto Proxmox.
To be real, Proxmox isn't exactly plug-and-play, and...