There is another way.
Migrate the guest to another node in the cluster and start it there. The target must be a node that has never had the same issue; otherwise it will not work.
Not a good way, but a temporary solution. I have 5 nodes in each cluster, so I move guests around.
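For reference, a rough sketch of the commands, assuming container 103 (the VMID from this thread) is moved offline to a healthy node that I will call node5 (hypothetical name):

pct shutdown 103          # make sure the container is stopped (it usually already is when the node is stuck)
pct migrate 103 node5     # offline migration to the node that never had the issue
ssh node5 pct start 103   # start it on the target node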
I also get this result:
root@Q172:~# grep copy_net_ns /proc/*/stack
/proc/10436/stack:[<ffffffff9cddfd1b>] copy_net_ns+0xab/0x220
/proc/10464/stack:[<ffffffff9cddfd1b>] copy_net_ns+0xab/0x220
/proc/11425/stack:[<ffffffff9cddfd1b>] copy_net_ns+0xab/0x220
/proc/11470/stack:[<ffffffff9cddfd1b>]...
I am on kernel 4.13.13-6-pve.
No NFS. Just ZFS.
I have 25 nodes live, and every day at least 3 nodes go down with this issue.
It is a nightmare.
root@Q172:~# systemctl status pve-container@103.service
● pve-container@103.service - PVE LXC Container: 103
Loaded: loaded...
Yes, I am facing the same issue.
I have had sleepless nights for the last week.
Proxmox is a nightmare now; every day 2 or 3 nodes crash for me. I have 25 nodes running LXC.
No reply from Proxmox so far.
I set ARC c_max to 1 GB and run 10 guests per node, and ZFS works perfectly for backups; we can restore to the same node.
You can also copy a backup to another node and restore the guest there very easily.
If you have a larger number of guests it is better to set c_max to 4 GB, since you need memory for everything else.
I am using a c_max of 1 GB and running up to 10 guests per node with 64 GB RAM, with swappiness set to 0 and swap for the LXC guests set to 0.
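For reference, a minimal sketch of how I set that limit, assuming the usual ZFS-on-Linux module option in /etc/modprobe.d/zfs.conf (value in bytes; 1073741824 = 1 GB, use 4294967296 for 4 GB):

# /etc/modprobe.d/zfs.conf -- cap the ZFS ARC (c_max) at 1 GB
options zfs zfs_arc_max=1073741824

# rebuild the initramfs so the limit applies at boot
update-initramfs -u

# or change it on the running node without a reboot
echo 1073741824 > /sys/module/zfs/parameters/zfs_arc_max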
1. I am reinstalling a failed node, so I have no backups. The second drive has all the guest images.
2. I have a backup of the /etc/pve/lxc/ folder files.
3. So from a fresh Proxmox installation on the first drive, I will add the second drive, then join the cluster again, then copy the /etc/pve/lxc/ folder files...
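A rough sketch of that last step, assuming the config backup was saved under /root/pve-lxc-backup/ (hypothetical path) and the data pool keeps its old name:

cp /root/pve-lxc-backup/*.conf /etc/pve/lxc/   # restore the container definitions
pct list                                       # the CTs should be listed again
pct start 103                                  # start one to verify (103 is an example VMID from this thread)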
I found a workaround that avoids a node restart to get past the issue on the 4.13 kernel.
It is not a very good one, but it avoids restarting the node.
I will use it until the next Proxmox release comes.
(I posted this in another thread also)
"Roughly and in generell yes."
So you answer was wrong? you said it is possible.
My original post clearly says node is not empty. Please read below.
1. I removed the node from the cluster.
2. Reinstalled the node and copied the /etc/pve/lxc/ folder files to make all LXC guests come online. The node is not empty...
1. I removed the node from the cluster.
2. Reinstalled the node and copied the /etc/pve/lxc/ folder files to make all LXC guests come online.
3. Now I cannot add the node back to the cluster, because we cannot add a node with existing guests.
So what is the workaround to add the node back to the cluster?
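To make the question concrete, this is the kind of sequence I am looking for, assuming the guest configs could be moved aside so the node looks empty while joining (paths and the cluster IP are only examples, not a confirmed procedure):

mkdir -p /root/lxc-conf-tmp
mv /etc/pve/lxc/*.conf /root/lxc-conf-tmp/   # node now appears empty
pvecm add 10.0.0.1                           # join the existing cluster (IP of an existing member)
mv /root/lxc-conf-tmp/*.conf /etc/pve/lxc/   # put the container configs back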
"If the IPMI is stopping to response you have a HW problem.
No OS can brake the IPMI."
In my experience yes it is possible.
In BIOS there is an option enable/disable IPMI for OS. Keep it disabled.
On a node with 3 or more guests, any swap usage will slow everything down.
We need to avoid swap, and vm.swappiness = 0 is the best way.
I even recommend adding more RAM and setting swap to 0 in the LXC guests.
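A minimal sketch of both settings, assuming the sysctl is persisted in /etc/sysctl.conf and using CT 103 (the VMID from this thread) as the example guest:

# keep the host off swap
echo "vm.swappiness = 0" >> /etc/sysctl.conf
sysctl -p

# remove swap from an LXC guest (writes "swap: 0" into /etc/pve/lxc/103.conf)
pct set 103 --swap 0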
1. I have a cluster with 2 nodes, each running 5 CTs. Proxmox is version 5.1.46.
2. Each node has two drives with ZFS; the first one holds Proxmox (local and local-zfs), the other is attached as a data pool holding the CTs.
3. I take a backup of the /etc/pve/lxc/ folder from both nodes to another...
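A rough sketch of that backup step, assuming a reachable machine called backup-host (hypothetical name) with ssh access from both nodes:

# run on each node; keeps the container definitions per node hostname
ssh root@backup-host "mkdir -p /srv/pve-backup/$(hostname)"
rsync -a /etc/pve/lxc/ root@backup-host:/srv/pve-backup/$(hostname)/lxc/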