unregister_netdevice after stopping LXC container

I'm running Proxmox VE 5.2-1 on a production server with several containers.
I tried to stop one container (ID 508; Ubuntu 18.04; two network interfaces) and start another (ID 501; Ubuntu 16.04) that would take over its role, so both operations had to happen within a very short interval. I shut down the first container, but the task wouldn't finish. I tried to attach to it (lxc-attach) and couldn't, so I concluded it had stopped and started the new container. This error showed up in the GUI:
Code:
Eunregister_netdevice: waiting for eth0 to become free. Usage count = 1
After a while, all containers and their icons for that server turned grey in the GUI and I couldn't manage any of them.
I restarted the server completely, and the first time it didn't boot up: the same error as above appeared 5-6 times. The server then rebooted by itself and eventually came up. I have no idea why it failed the first time, nor why it booted successfully afterwards.
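For context, what I did through the GUI corresponds roughly to these CLI commands (reconstructed from memory, so take the exact sequence as an approximation):
Code:
# the shutdown task that never finished
pct shutdown 508
# lxc-attach failed too, so I assumed the container had stopped
pct status 508
# starting the replacement container
pct start 501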

This is part of /var/log/kern.log (plus the matching pvedaemon entries from syslog):
Code:
Aug 30 14:57:01 svorng10 pvedaemon[24363]: <root@pam> end task UPID:svorng10:000063C1:2B5A73F7:5B87DB5A:vzshutdown:508:root@pam: unexpected status
Aug 30 14:57:07 svorng10 pvedaemon[29103]: <root@pam> starting task UPID:svorng10:000069FF:2B5A8A5A:5B87DB93:vzstop:508:root@pam:
Aug 30 14:58:03 svorng10 kernel: [7273511.453572] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 14:59:04 svorng10 kernel: [7273572.284942] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 15:01:33 svorng10 pve-guests[27771]: <root@pam> starting task UPID:svorng10:00006C85:2B5AF24A:5B87DC9D:stopall::root@pam:
Aug 30 15:01:33 svorng10 pve-guests[27781]: <root@pam> starting task UPID:svorng10:00006C87:2B5AF24B:5B87DC9D:vzshutdown:512:root@pam:
Aug 30 15:01:33 svorng10 pve-guests[27781]: <root@pam> starting task UPID:svorng10:00006C89:2B5AF24C:5B87DC9D:vzshutdown:509:root@pam:
Aug 30 15:01:34 svorng10 kernel: [7273721.971859] vmbr1: port 5(veth509i1) entered disabled state
Aug 30 15:01:43 svorng10 pve-guests[27781]: <root@pam> starting task UPID:svorng10:00006CFA:2B5AF635:5B87DCA7:vzshutdown:504:root@pam:
Aug 30 15:01:44 svorng10 pve-guests[27781]: <root@pam> starting task UPID:svorng10:00006CFC:2B5AF636:5B87DCA7:vzshutdown:503:root@pam:
Aug 30 15:01:44 svorng10 pve-guests[27781]: <root@pam> starting task UPID:svorng10:00006CFF:2B5AF638:5B87DCA8:vzshutdown:501:root@pam:
Aug 30 15:01:44 svorng10 pve-guests[27781]: <root@pam> starting task UPID:svorng10:00006D0B:2B5AF639:5B87DCA8:vzshutdown:500:root@pam:

I believe this is the last part of the logs before the failed boot:
Code:
Aug 30 15:01:46 svorng10 kernel: [7273734.043208] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 15:01:48 svorng10 pve-guests[27771]: <root@pam> end task UPID:svorng10:00006C85:2B5AF24A:5B87DC9D:stopall::root@pam: OK
Aug 30 15:01:56 svorng10 kernel: [7273744.123110] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 15:02:06 svorng10 kernel: [7273754.202989] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 15:02:16 svorng10 kernel: [7273764.282879] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 15:02:26 svorng10 kernel: [7273774.362770] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 15:02:36 svorng10 kernel: [7273784.442649] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 15:02:46 svorng10 kernel: [7273794.522539] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 15:02:56 svorng10 kernel: [7273804.602426] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 15:03:06 svorng10 kernel: [7273814.682313] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Aug 30 15:03:17 svorng10 kernel: [7273824.762181] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
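In hindsight, while the host was stuck like this, it might have been worth checking whether the container's veth devices were still registered on the host (hypothetical commands, I didn't run them at the time):
Code:
# list the devices the host still knows about; Proxmox names
# container interfaces veth<vmid>i<idx>, e.g. veth508i0
ip link show | grep veth508
# show which ports are still attached to the bridges
bridge link show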
And this is the start of the next boot, about 4 minutes later:
Code:
Aug 30 15:07:46 svorng10 kernel: [    0.000000] Linux version 4.15.17-2-pve (tlamprecht@evita) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)) #1 SMP PVE 4.15.17-10 (Tue, 22 May 2018 11:15:44 +0200) ()
Aug 30 15:07:46 svorng10 kernel: [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.15.17-2-pve root=/dev/mapper/pve-root ro quiet
Aug 30 15:07:46 svorng10 kernel: [    0.000000] KERNEL supported cpus:
Aug 30 15:07:46 svorng10 kernel: [    0.000000]   Intel GenuineIntel
Aug 30 15:07:46 svorng10 kernel: [    0.000000]   AMD AuthenticAMD
Aug 30 15:07:46 svorng10 kernel: [    0.000000]   Centaur CentaurHauls
Aug 30 15:07:46 svorng10 kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Aug 30 15:07:46 svorng10 kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Aug 30 15:07:46 svorng10 kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Nothing in the boot log itself stands out.
Any ideas what could have happened?
 
Hi,

can you post the config of the two containers?
 
This is the container that I couldn't shut down. It's the first time I've looked at the config file, and it seems that two configurations are somehow overlapped in the same file. I've never manually edited this file, so the fact that it ended up like this purely through changes in the GUI is dubious, to say the least:
Code:
arch: amd64
cores: 3
hostname: varnish5OFF
memory: 15360
nameserver: 1.1.1.1
net0: name=eth0,bridge=vmbr0,gw=pub_ip,hwaddr=DE:06:F1:58:59:7D,ip=pub_ip1/24,type=veth
net1: name=eth1,bridge=vmbr1,hwaddr=0A:6D:44:25:52:64,ip=10.10.10.204/24,type=veth
ostype: ubuntu
parent: varnish5
rootfs: local-lvm:vm-508-disk-1,size=60G
searchdomain: domain.com
swap: 10240

[varnish5]
#varnish 5 with site.com h2
arch: amd64
cores: 2
hostname: varnish8
memory: 4096
nameserver: 1.1.1.1
net0: name=eth0,bridge=vmbr0,gw=pub_ip,hwaddr=DE:06:F1:58:59:7D,ip=pub_ip2/24,type=veth
net1: name=eth1,bridge=vmbr1,hwaddr=0A:6D:44:25:52:64,ip=10.10.10.220/24,type=veth
ostype: ubuntu
rootfs: local-lvm:vm-508-disk-1,size=50G
searchdomain: domain.com
snaptime: 1530786894
swap: 4096


This is the container that replaced the first one. Nothing seems wrong with it:
Code:
arch: amd64
cores: 3
hostname: varnish5
memory: 15360
nameserver: 1.1.1.1 8.8.8.8
net0: name=eth0,bridge=vmbr0,gw=pub_ip,hwaddr=12:5A:7E:02:09:87,ip=pub_ip/24,type=veth
onboot: 1
ostype: ubuntu
rootfs: local-lvm:vm-501-disk-1,size=50G
searchdomain: domain.com
swap: 10240

Now that I think of it, could that be due to a snapshot? I've got the same type of configuration on another container. As far as I knew, though, the rootfs should have been different and another LVM disk should have been created; that's not the case here, nor with the other container I mentioned. Here it is:
Code:
arch: amd64
cores: 1
hostname: graph.domain.com
memory: 2048
nameserver: 8.8.8.8
net0: name=eth0,bridge=vmbr0,gw=193.230.142.1,hwaddr=DA:36:56:5A:E0:9F,ip=ipb_pub/24,type=veth
onboot: 1
ostype: ubuntu
parent: before_snmp
rootfs: local-lvm:vm-503-disk-1,size=15G
searchdomain: 1.1.1.1
swap: 2048

[before_snmp]
arch: amd64
cores: 1
hostname: graph.domain.com
memory: 2048
nameserver: 8.8.8.8
net0: name=eth0,bridge=vmbr0,gw=ip_pub,hwaddr=DA:36:56:5A:E0:9F,ip=server_ip/24,type=veth
onboot: 1
ostype: ubuntu
rootfs: local-lvm:vm-503-disk-1,size=15G
searchdomain: 1.1.1.1
snaptime: 1534511328
swap: 2048
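If it is snapshot-related, I suppose I could check what the snapshot actually created on the storage side with something like this (I haven't run it yet, and the snapshot volume naming is my assumption):
Code:
# list the snapshots Proxmox knows about for this container
pct listsnapshot 503
# on LVM-thin storage each snapshot should be a separate thin volume,
# presumably named something like snap_vm-503-disk-1_before_snmp
# ("pve" being the volume group on a default install)
lvs pve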
 
"This is the first time I've seen the config file and it seems that two configuration files are overlapped somehow in the same file"
That's because you made a snapshot. This is OK and the way it should be.

Please try to run the failing container in the foreground to get debug information.

Code:
lxc-start -n 508 -F -o logfile.txt
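If the default output is not verbose enough, you can also raise the log priority (standard lxc-start options):
Code:
lxc-start -n 508 -F -l DEBUG -o logfile.txt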
 
Well, I'm not sure how I'll go about this. Given what happened last time, I'm a bit reluctant to do that. I'll try it at some point, out of hours, as it were.
 
