Proxmox 4.0 - Previously running LXC container won't start

Got a bit of a problem with my shiny new LXC containers. When the system boots, everything starts up fine and runs very well.

However, when I shut down a container from the web interface, I'm unable to restart it, regardless of whether I've made any changes.

Task output shows:

Code:
lxc-start: lxc_start.c: main: 344 The container failed to start.
lxc-start: lxc_start.c: main: 346 To get more details, run the container in foreground mode.
lxc-start: lxc_start.c: main: 348 Additional information can be obtained by setting the --logfile and --logpriority options.
TASK OK

Sorry, but the task is decidedly NOT OK.

When trying to start the container in the foreground from an ssh session, I get this output:

Code:
root@destiny:~# lxc-start --name 101 --foreground
RTNETLINK answers: No buffer space available
Dump terminated
Use of uninitialized value $tag in concatenation (.) or string at /usr/share/perl5/PVE/Network.pm line 176.
unable to add vlan  to interface veth101i0
lxc-start: conf.c: run_buffer: 342 Script exited with status 25
lxc-start: conf.c: lxc_create_network: 3047 failed to create netdev
lxc-start: start.c: lxc_spawn: 954 failed to create the network
lxc-start: start.c: __lxc_start: 1211 failed to spawn '101'
lxc-start: lxc_start.c: main: 344 The container failed to start.
lxc-start: lxc_start.c: main: 348 Additional information can be obtained by setting the --logfile and --logpriority options.
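
In case it's useful to anyone else hitting this, the debug run those last lines are asking for would be something like this (log path and priority are just examples):

Code:
root@destiny:~# lxc-start --name 101 --foreground --logfile /tmp/lxc-101.log --logpriority DEBUG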

If I reboot the host, everything starts working fine again. Also, there is nothing special about the containers. They are just Debian containers with some storage and a single network interface with an IPv4 address and no VLANs or anything like that.

Thoughts, anyone? Any help would be appreciated.
 
This has become a bit more of an issue now: I'm unable to start any new containers. Previously I thought it had been fixed in an update, but it turns out rebooting just made it work for a bit longer than usual.

Now though, I can't start any containers that weren't started on boot.

Output of pveversion -v:

Code:
root@destiny:~# pveversion -v
proxmox-ve: 4.1-39 (running kernel: 4.2.8-1-pve)
pve-manager: 4.1-15 (running version: 4.1-15/8cd55b52)
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.2.8-1-pve: 4.2.8-39
pve-kernel-4.2.2-1-pve: 4.2.2-16
pve-kernel-4.2.3-2-pve: 4.2.3-22
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-33
qemu-server: 4.0-62
pve-firmware: 1.1-7
libpve-common-perl: 4.0-49
libpve-access-control: 4.0-11
libpve-storage-perl: 4.0-42
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-8
pve-container: 1.0-46
pve-firewall: 2.0-18
pve-ha-manager: 1.0-23
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve1
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve7~jessie
 
Oh, further to this, I tried starting from the command line again and got the same output. So I removed the network device via the web interface, and the container started up fine. It's a bit useless to me without an Ethernet interface, but this definitely points to a networking issue.
 
Output is:

Code:
RTNETLINK answers: No buffer space available
Dump terminated
unable to add default vlan tags to interface veth400i0
lxc-start: conf.c: run_buffer: 342 Script exited with status 25
lxc-start: conf.c: lxc_create_network: 3084 failed to create netdev
lxc-start: start.c: lxc_spawn: 954 failed to create the network
lxc-start: start.c: __lxc_start: 1211 failed to spawn '400'
lxc-start: lxc_start.c: main: 344 The container failed to start.
lxc-start: lxc_start.c: main: 348 Additional information can be obtained by setting the --logfile and --logpriority options.
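
For reference, the CLI equivalent of removing the NIC in the GUI should be something like this (using my test container's ID):

Code:
root@destiny:~# pct set 400 -delete net0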
 
Does `pct config 101` show anything unusual? Can you post the output?
EDIT: Okay, the second error is less confusing but still weird... Is your iproute2 package up to date?
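
You can check the iproute2 version on the host with apt-cache, and since that "Dump terminated" line looks like a failed netlink dump, it might also be worth seeing whether a plain bridge VLAN dump fails the same way while a container refuses to start. Something like:

Code:
apt-cache policy iproute2
bridge vlan show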
 
Here are two:

Code:
root@destiny:~# pct config 101
arch: amd64
cpulimit: 2
cpuunits: 1024
description: Ansible Configuration Management%0A
hostname: young
memory: 1024
net0: name=eth0,hwaddr=5E:91:7C:9D:C7:F7,bridge=vmbr0,ip=10.13.1.1/24,gw=10.13.1.254
onboot: 1
ostype: debian
rootfs: LocalContainers:subvol-101-disk-1,size=20G
startup: order=99
swap: 2048

root@destiny:~# pct config 400
arch: amd64
cpulimit: 1
cpuunits: 1024
hostname: test
memory: 512
net0: bridge=vmbr0,hwaddr=66:38:66:64:36:62,ip=dhcp,name=eth0,type=veth
ostype: debian
rootfs: LocalContainers:subvol-400-disk-1,size=8G
swap: 512

All packages are up to date.

I only just realised this might be relevant: I'm using a bonded Ethernet interface. I have a dual-port Intel gigabit Ethernet adapter with both ports bonded using LACP. I might try turning off bonding to see whether the problem still exists, to rule that out.
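
In the meantime, the bond state I'll be keeping an eye on is visible with these (assuming the bond is named bond0, which is just my setup):

Code:
root@destiny:~# cat /proc/net/bonding/bond0
root@destiny:~# ip -d link show bond0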
 
Hi,

Any solution to this problem yet? I've just run into it as well, but I'm not running containers, only regular VMs.
I've tried with several powered-off VMs, and none of them can start unless I remove the NIC.
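
For reference, removing a VM's NIC from the CLI should be something like this (VMID is a placeholder):

Code:
qm set <vmid> -delete net0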
 
Just thought I would add this: after upgrading to Proxmox 4.2 yesterday, I don't appear to be having this problem anymore. I'm pretty sure it was being caused by the bonded Ethernet interface. Proxmox 4.2 brought with it kernel 4.4.8-1-pve, which appears to have fixed things.

I won't be 100% sure for a little while but we shall call it 95% certain for now.
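
For anyone comparing against their own host, the running kernel shows up with either of these:

Code:
root@destiny:~# uname -r
root@destiny:~# pveversion -v | head -n 2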

To stress-test it, I added six network interfaces to a container, all with DHCP. Normally that would have been a guaranteed failure, but the container started perfectly fine. I've also run all my other tests that usually killed the container, like starting/stopping repeatedly, pulling in lots of network data, etc. Nothing seems to be able to kill it anymore.
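
Roughly what that test looked like, for anyone who wants to reproduce it (container ID, bridge name, and interface count are just from my setup):

Code:
root@destiny:~# for i in $(seq 0 5); do pct set 400 -net${i} name=eth${i},bridge=vmbr0,ip=dhcp; done
root@destiny:~# for n in $(seq 1 20); do pct stop 400 && pct start 400; done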

The real test will be leaving it up and running for a few days or weeks, as that's usually when the problem resurfaces. But for now... great success!?
 
Hi, I now have this issue across my whole cluster. I'm on version 4.1. Is there any solution for this?