100% CPU on node, pvecluster restart fixes it

whoim

Jul 20, 2015
Hi!
I have Proxmox 3.4/6 on an AMD-based node. Once a day the CPU load goes to 100% and the load average climbs.
Restarting the "pvecluster" service from the web GUI fixes it.
Network and CPU usage on each KVM guest go up as well.
jnAy8VBCQLjw2Z.jpg

These are the node stats; after restarting pvecluster the load average drops.
What is going on? Thanks!
PS: sorry for my English :)

BLmGDx8tYXJn2l.jpg

aD2PQW4IR9keA3.jpg
Anything strange in the log around the time this happens?

Updating the system would also be good; 3.4/6 isn't the latest version of Proxmox 3.4.

Code:
apt-get update
apt-get dist-upgrade
 
In the logs at that time (when the load average starts to rise) there is only an opened "console" task; I opened one guest console via vncproxy.

Be sure to check the logs of the PVE host, not the guests. Also, if it happens again, log in to your server and check with the "top" program which process causes the high load. You can use any other resource monitor (htop, ...), but top is already preinstalled on PVE; simply execute:
Code:
top

in the host shell. If you're not familiar with top, here is a quick intro: you can quit with 'q' and change the sort column with the '<' and '>' keys.
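Since the spike only happens about once a day, a non-interactive snapshot may be easier to save or paste into a reply than an interactive top session. The following is only a sketch, assuming the procps version of ps is installed on the host; the function name top_cpu is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the N processes with the highest CPU usage.
top_cpu() {
    local count="${1:-10}"
    # -e: all processes; -o: selected columns; --sort=-pcpu: highest CPU first.
    # One extra line is kept for the column header.
    ps -eo pid,user,pcpu,pmem,comm --sort=-pcpu | head -n "$((count + 1))"
}

top_cpu 5
# 1-, 5- and 15-minute load averages as reported by the kernel.
cut -d ' ' -f 1-3 /proc/loadavg
```

Note that ps reports CPU usage averaged over each process's lifetime, so a short burst can look smaller here than in top. Running top in batch mode (top -b -n 1) is an alternative for capturing a single screen of output.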
 
I'm using top and htop: all 4 cores are 100% used, but no single process is overloading the CPU. Each VM uses about 10%, and I think the VMs' internal (software, virtio) network adapters create this CPU overload. Network traffic on each VM goes up too. In iptraf, all VMs send and receive many kilobytes of traffic, but the node's ETH interface has little traffic.
 
Currently I have created this script:

Code:
#!/bin/bash
# Mail an alert and restart pve-cluster when the 1-minute load average
# exceeds a threshold; mail again once the load is back to normal.

# The first field of /proc/loadavg is the 1-minute load average.
loadavg=$(cut -d ' ' -f 1 /proc/loadavg)
maxavg=1
fileflag="/tmp/loadavg_email_sended"
email="myemail@email.ru"
host=$(hostname)

if [ "$(echo "$loadavg > $maxavg" | bc)" -eq 1 ] && [ ! -f "$fileflag" ]
then
    msg="!!! loadavg at $host is BIG: $loadavg"
    echo "$msg" | mail -s "$msg" "$email"
    # The flag file prevents repeated alerts while the load stays high.
    echo "1" > "$fileflag"
    echo "$msg"
    service pve-cluster restart
fi

if [ "$(echo "$loadavg <= $maxavg" | bc)" -eq 1 ] && [ -f "$fileflag" ]
then
    msg="loadavg at $host is normal: $loadavg"
    echo "$msg" | mail -s "loadavg at $host is normal now: $loadavg" "$email"
    rm -f "$fileflag"
    echo "$msg"
fi

But cron does not run it every minute. /etc/crontab:
Code:
*/1 *   * * *   root    cd / && run-parts --report /etc/cron.1minute
/etc/cron.1minute is a folder that contains this script, but the script does not run.
I ran "service cron restart" and it completed normally.
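One possible cause, offered as a guess because the post does not give the script's file name: by default Debian's run-parts only runs files whose names consist entirely of ASCII letters, digits, underscores, and hyphens, so a file with an extension such as ".sh" is silently skipped. The sketch below mirrors that naming rule; the function name runparts_accepts and the two sample file names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if run-parts (default mode) would accept
# the given file name.
runparts_accepts() {
    [[ "$1" =~ ^[A-Za-z0-9_-]+$ ]]
}

# Sample names for illustration only.
for name in loadavg-check loadavg_check.sh; do
    if runparts_accepts "$name"; then
        echo "$name: accepted"
    else
        echo "$name: skipped by run-parts"
    fi
done
```

To see what would actually be executed, "run-parts --test /etc/cron.1minute" lists the scripts without running them. The script also has to be executable (chmod +x) to be picked up.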
 
But the network and pve-cluster are not directly connected, so it's strange that restarting the pve-cluster service helps.

Can you describe your setup?

Also, since when has this problem been happening? Did you change something?

Also use
Code:
crontab -e
to add a cronjob.
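An entry added through crontab -e uses a slightly different format from /etc/crontab: it has no user field, because the job runs as the owner of that crontab. The sketch below only prints the two forms side by side; the script path is a hypothetical placeholder, not something from this thread.

```shell
#!/bin/bash
# Hypothetical script path; adjust to wherever the script actually lives.
script="/usr/local/bin/loadavg-check"

# Per-user crontab (crontab -e): five time fields, then the command.
user_entry="* * * * * $script"
# System crontab (/etc/crontab, /etc/cron.d/*): extra user field before the command.
system_entry="* * * * * root $script"

echo "$user_entry"
echo "$system_entry"
```

The first printed line is the form to paste into the editor that crontab -e opens.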