I have an OVH server too and that didn't work for me either. I rebooted my host a few weeks ago because of a new kernel (so my system is up to date) and since then the second IP has become unreachable. Everything is correctly configured (in fact, the configuration hasn't changed since the time...
Hello,
I read in the PVE wiki that in order to add a node to a cluster, it shouldn't run any VM, because of potential VMID conflict.
But, thinking that I could join the cluster at any time, I was very careful about giving a unique VMID to all my CTs, regardless of the node they were running on...
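A minimal sketch of how one might check for VMID overlaps between two nodes before attempting a join. The function name and the sample IDs are hypothetical; the lists are assumed to have been collected by hand, for instance from the first column of 'pct list' on each node. It only detects duplicate IDs and says nothing about whether joining with existing guests is supported.

```python
def vmid_conflicts(node_a_ids, node_b_ids):
    """Return the sorted VMIDs that appear on both nodes."""
    return sorted(set(node_a_ids) & set(node_b_ids))


# Hypothetical IDs, for illustration only.
existing_cluster = [100, 101, 166]
joining_node = [102, 103, 166]

conflicts = vmid_conflicts(existing_cluster, joining_node)
if conflicts:
    print("Conflicting VMIDs:", conflicts)
else:
    print("No VMID overlap found")
```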
I assumed that I was running the latest version
~# pveversion --verbose
proxmox-ve: 4.1-39 (running kernel: 4.2.8-1-pve)
pve-manager: 4.1-15 (running version: 4.1-15/8cd55b52)
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.2.8-1-pve: 4.2.8-39
pve-kernel-4.2.2-1-pve: 4.2.2-16...
Hello,
I just saw on the web UI that 2 of my CTs were not running. On the node, 'pct list' shows them as stopped. So:
~# pct shutdown 166
CT 166 not running
~# pct start 166
lxc-start: lxc_start.c: main: 279 Container is already running.
Actually, the CTs are running, but what should I do if I...
It happened again on another host.
It happened the first time yesterday at 18:30, I had to hard reboot. It happened again just now (first symptoms: Icinga going full red). Hard reboot.
Same thing:
So, what about this?
It appeared before the first signs of something going wrong, around noon:
And later, everything stopped working:
In fact, I lost all the metrics in Icinga at exactly the time of the first error (08:54). More precisely, all the NRPE requests on all CTs stopped working...
pveversion -v
I can't upload gzip files or raw files > 1 MB, so here are links to get them (I've grepped the lines from the day of the crash).
kern.log.gz http://dl.free.fr/m5cG7O2WQ
messages.gz http://dl.free.fr/q6j4ks3SO
syslog.gz http://dl.free.fr/kib9VztqO
It seems more CPU-bound than memory-bound, judging by the Icinga reports (Icinga is a CT running on the crashed host; I know, bad move). The lack of data on the right side started when everything stopped responding.
It happened again this Sunday:
And so on...
All those messages showed up between output like this:
This time, even the host was unreachable. I had to hard reboot it from the OVH web manager.
Is it a kernel problem? A PVE problem? Is this caused by a CT behaving weirdly? Is it possible...
Hello,
Today a weird thing happened: all of a sudden Icinga started pushing dozens of alerts. All the CTs were unreachable. While they were still pingable, I couldn't ssh into them nor pct enter from the host (the command was hanging). The web UI showed nothing special; the CTs were up and running.
It's a...
I'm running lxcfs 2.0.0-pve1 and after rebooting it's worse than before:
You see, it took lxc-freeze five hours to give up on freezing the VT. After that, the host went completely south, all the CTs were shut down, and I couldn't even SSH into the host. I think the RAM was totally overloaded (64GB!), I...
But this is not the number of CPUs presented to the CT. I set cpu limit = 2 on a container and htop displays all the CPUs of the host, all of them working every now and then. cat /proc/cpuinfo shows all CPUs, not the number assigned by cpu limit. That can be misleading about the actual power...
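To illustrate the gap between what /proc/cpuinfo lists and what a limited CT can actually use, here is a rough sketch. It assumes the limit is enforced as a CFS bandwidth quota (the cgroup v1 values cpu.cfs_quota_us and cpu.cfs_period_us), which is an assumption on my part and not something confirmed in this thread. The function name and the sample numbers are hypothetical.

```python
def effective_cpus(quota_us, period_us, host_cpus):
    """Estimate usable CPU capacity under a CFS quota.

    quota_us / period_us is the fraction of CPU time allowed per period,
    expressed in whole CPUs. A non-positive quota (cgroup v1 uses -1)
    is treated as "no limit".
    """
    if quota_us <= 0 or period_us <= 0:
        return float(host_cpus)
    return min(float(host_cpus), quota_us / period_us)


# Hypothetical values: an 8-core host and a quota of two CPUs' worth of time.
print(effective_cpus(200000, 100000, 8))
# No quota set: the CT can use everything the host has.
print(effective_cpus(-1, 100000, 8))
```

Under that assumption the quota caps total CPU time without hiding any cores, which would be consistent with htop showing every host CPU busy now and then.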
Hello,
I just installed Proxmox 4 and I have a few questions about a setup I thought of.
I have a few servers with a lot of disk space (hardware RAID, showing 1 physical volume) and I am wondering how to share this space among the cluster, thinking about HA and space optimization. I thought...