I am facing this issue on multiple nodes.
The node shows grey in the cluster, but SSH works and *the proxy service is not active*.
When I start the proxy, I get the message:
"start failed - can't aquire lock '/var/run/pveproxy/pveproxy.pid.lock' - Resource temporarily unavailable"
The only solution is to...
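For context on the message above: "Resource temporarily unavailable" is the standard text for EAGAIN, which a non-blocking lock attempt returns when something else already holds the lock. Below is a minimal sketch for checking whether a lock file is currently held. It assumes the daemon uses an flock-style lock on that file, which the thread does not confirm, and `lock_is_held` is a hypothetical helper name, not part of Proxmox.

```python
import fcntl


def lock_is_held(path: str) -> bool:
    """Return True if another open file description holds an flock on path.

    Assumption: the service guards its pid file with flock(); if it uses a
    different locking scheme, this check says nothing useful.
    """
    with open(path, "rb") as fh:
        try:
            # Non-blocking exclusive lock: fails with EAGAIN if already held.
            fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        # We got the lock, so nobody else had it; release it again.
        fcntl.flock(fh, fcntl.LOCK_UN)
        return False


if __name__ == "__main__":
    import sys

    # Usage: python3 check_lock.py <path-to-lock-file>
    print(lock_is_held(sys.argv[1]))
```

If this reports True while the service looks inactive, some leftover process probably still has the file open and locked, which would explain why a plain start fails.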
I faced another issue with Proxmox 5.1-46 and Windows 2003.
After a few hours it shows a blue screen.
I tried all possible combinations of CPU, drive and other settings, but the issue remains.
It was a running KVM guest on Proxmox 3.4; the issue started after migrating to 5.1-46.
Any suggestions?
There were no error messages.
Running service pve-cluster restart cleared the grey issue.
No reboot was required.
All 4 nodes are on the same Cisco 1000 Mbps switch.
I also started facing the same issue with 4.15.10 and KVM guests.
I have 4 nodes; only one shows green, the other three show grey.
But all nodes and guests respond to ping fine.
The system crashed and restarted at 20:02:00.
(I have 25 live nodes; 1 or 2 nodes crash like this every day.)
Log file.
Mar 29 19:39:42 Q172 pvedaemon[2841]: <root@pam> successful auth for user 'root@pam'
Mar 29 19:39:56 Q172 pvedaemon[9167]: <root@pam> successful auth for user 'root@pam'
Mar...
With default settings, the node crashes when I start the 4th guest.
With changed settings the node does not crash when I start the 4th guest, since it starts KSM early enough.
I think it is more of an LXC-related issue than a bug.
With KVM I didn't face any issues.
pve-kernel-4.15.10-1-pve also has the above KSM sharing issue.
If you have plenty of memory, you will not see it.
I have 25+ nodes, and I don't have plenty of memory, so I see it often.
But after setting the KSM threshold percentage, I didn't face any issues.
No, that is not the issue.
Suppose I have already started 3 guests and the node is at 75% memory usage; KSM sharing will not have started yet.
When I start the 4th guest, the node crashes and restarts.
I reproduced the same error multiple times. Every time the node crashed.
Then I changed the KSM threshold to...
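To make the threshold reasoning above concrete, here is a simplified sketch. It models KSM as starting once used memory reaches a given percentage of total memory; the real ksmtuned daemon may take more into account. The 80% default, the 32768 MB node size and both function names are assumptions for illustration only; 75% is the figure mentioned elsewhere in this thread.

```python
def ksm_would_start(total_mb: int, used_mb: int, start_pct: int) -> bool:
    """True once used memory reaches start_pct percent of total memory."""
    return used_mb * 100 >= total_mb * start_pct


def thres_coef_for(start_pct: int) -> int:
    """Free-memory percentage matching a used-memory start percentage.

    Assumption: ksmtuned's KSM_THRES_COEF is expressed as the percentage
    of memory still free, so "start at 75% used" would map to 25.
    """
    if not 0 <= start_pct <= 100:
        raise ValueError("start_pct must be between 0 and 100")
    return 100 - start_pct


if __name__ == "__main__":
    total_mb = 32768                  # hypothetical node size
    used_mb = total_mb * 75 // 100    # three guests running, 75% used

    # Assumed default of 80%: KSM is still idle at 75% usage.
    print(ksm_would_start(total_mb, used_mb, 80))
    # Lowered to 75%: KSM is already active before the 4th guest starts.
    print(ksm_would_start(total_mb, used_mb, 75))
    print(thres_coef_for(75))
```

The point is only that with the higher threshold KSM has not begun merging pages when the 4th guest starts, while with the lower one it is already running at that moment.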
In some cases you can try different CPU types; that may solve the issue. Try to select the CPU type closest to what the node actually has.
I have Westmere CPU, and I selected Westmere for KVM and AMD64 for LXC, and it has better performance.
I don't have AMD-based nodes, so I didn't test it.
But I can confirm that for LXC there are still many bugs causing node restarts, which are very annoying.
I have 4.15.3 running without any issues.
What is the error related to? SSL, KSM, or just the node becoming grey and one LXC guest not starting?
I faced a few node restarts due to KSM not starting.
I had to manually set the KSM start threshold to 75% memory usage to avoid the issue, and did systemctl...