[SOLVED] Proxmox 5.1.46 LXC cluster error Job for pve-container@101.service failed

Thanks for your testing, Vasu. I'm patiently waiting for this to move to stable, as I am running a mixed system with qemu and lxc. Good to know that the solution is around the corner!
 
Actually, kernel 4.13 has 3 issues:

1. LXC starting issues.
2. kworker using 100% CPU
3. Extra IP in LXC is gone after restart.

Kernel 4.15 solved some of these issues.

For example, issue 1 is solved with kernel 4.15.

I am still testing issue 2 with kernel 4.15 on 25 live nodes; no issues so far.

Issue 3 is still present in kernel 4.15.
 
Could you explain issue 3? I have multiple containers running multiple IPs, though they are on different configured interfaces (host has multiple VLANs configured and shares them as separate network cards to the containers).

Are you assigning multiple IPs from the same network interface / subnet to a container?
 
I am using this method now and it works fine.

Steps for adding an extra IP in a different subnet to an LXC container.

If the extra IP is in a different subnet than the LXC container's main IP:

First add interface eth0 via the GUI on the node for the guest.

Then log in to the guest and create an extra config file for each IP.

1. Log in to Proxmox, select CT XXX > Network > add one NIC with XXX.XXX.XXX.XXX/32 and gateway XXX.XXX.XXX.1.

2. Log in to the CT and run cd /etc/sysconfig/network-scripts

3. Run ls to list the network config files; you will see:

ifcfg-eth0

4. Now we can add the extra IP by creating a new file: vi ifcfg-eth0:0

DEVICE=eth0:0            # alias of eth0
ONBOOT=yes               # bring up at boot
BOOTPROTO=none           # static configuration
IPADDR=10.10.10.10       # the extra IP
NETMASK=255.255.255.255  # /32, a single address
GATEWAY=10.10.10.1       # gateway of the extra IP's subnet

and save it.

5. In the same way, create one file per extra IP, each with its own IP and gateway, and save it (vi ifcfg-eth0:1, vi ifcfg-eth0:2, vi ifcfg-eth0:3, vi ifcfg-eth0:4, and so on).

6. Then run service network restart.

It may take up to 1-2 minutes for the IP to respond to ping, so please wait.
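Steps 4-6 above can also be scripted. A minimal sketch, assuming a CentOS-style guest; the gateway and extra IPs are illustrative, and a scratch directory stands in for /etc/sysconfig/network-scripts here:

```shell
# Generate one alias config file (ifcfg-eth0:N) per extra IP.
# On a real guest, set CONF_DIR=/etc/sysconfig/network-scripts.
CONF_DIR="${CONF_DIR:-./network-scripts}"
GATEWAY="10.10.10.1"                  # gateway of the extra-IP subnet (illustrative)
EXTRA_IPS="10.10.10.10 10.10.10.11"   # the extra IPs to add (illustrative)

mkdir -p "$CONF_DIR"
i=0
for ip in $EXTRA_IPS; do
    cat > "$CONF_DIR/ifcfg-eth0:$i" <<EOF
DEVICE=eth0:$i
ONBOOT=yes
BOOTPROTO=none
IPADDR=$ip
NETMASK=255.255.255.255
GATEWAY=$GATEWAY
EOF
    i=$((i + 1))
done

ls "$CONF_DIR"
# On the real guest, finish with: service network restart
```

The `service network restart` step is left as a comment since it only makes sense inside the container itself.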
 
On reboot LXC loses all extra IPs.

I am adding eth1 with a new IP, but on reboot it is gone.

I'm a couple of upgrades behind (5.1-43) - is this a regression? I've had no issues with multiple IPs on a container, or with kworker CPU utilization. One container has three IPs, and they persist through container or server reboots with no special configuration (single node, no HA).
 
My apologies for the necrobump, but I upgraded to 5.3-6 over the weekend and am experiencing the same issue starting LXC containers again. Is there a known regression and/or a fix?
 
The issue is still present but less frequently encountered in the 4.15.x line. See: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779678

I saw it as recently as 4.15.18-8-pve and moved to custom 4.18 and 4.19 kernels afterwards. As in the bug report, I haven't seen the issue on these kernels but they both break AppArmor at the version level packaged with Debian Stretch. You'll need to rebuild the apparmor_parser (and libapparmor due to dependency) from source or use third party packages if you want to do the same.
 
