Hi,
I've got some weird issues and I'm hoping someone might recognize the symptoms and be able to comment.
This is on a Proxmox 5.4.13 host that was recently patched to latest. It has been running Proxmox for over a year and has been patched 'gradually' (i.e. every few months). I'm pretty sure the most recent patches were applied with no reboot yet (because it is in production).
For this environment, I normally deploy a new LXC container from a backup of a 'good copy' of an existing host that does the same thing. I spin up the new copy after adjusting its hostname, MAC address, and IP address, then carry on with some small internal customization, and all is well. The base starting point is a 'privileged' LXC container which was built from a pretty standard CentOS 6.x template.
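For reference, the restore routine is roughly along these lines (the VMIDs, backup filename, storage name and addresses here are just placeholders, not my real values):

Code:
# restore the backup of the 'good' container under a new VMID
pct restore 136 /var/lib/vz/dump/vzdump-lxc-NNN.tar.gz --storage local-lvm

# adjust hostname, MAC address and IP address before first start
pct set 136 --hostname newhost.example.com
pct set 136 --net0 name=eth0,bridge=vmbr1,hwaddr=AA:BB:CC:DD:EE:FF,ip=192.168.1.50/24,gw=192.168.1.1

pct start 136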
Yesterday, I tried to do 'the normal routine' and - first weird thing - when the LXC container boots up, it does not have any IP address.
I can manually 'ifup eth0' and bring up the interface. Then I can ping out, and from the outside I can ping in.
But things still don't work well after this - I think various services which depend on the network fail to start properly. There are 'many small problems'. I can manually start httpd, for example, but from outside I cannot actually connect to ports 80/443 as I normally should be able to.
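For reference, these are the kinds of checks I've been doing inside the container after a failed boot (the addresses and names here are placeholders, not my real config):

Code:
# does the interface config even say to come up at boot? (standard CentOS 6 layout)
cat /etc/sysconfig/network-scripts/ifcfg-eth0   # expect DEVICE=eth0, ONBOOT=yes, IPADDR=..., etc.
ip addr show eth0                               # shows no address until 'ifup eth0' is run
service network status

# after manually starting httpd, check it is actually listening and nothing is filtering it
service httpd start
netstat -tlnp | grep -E ':(80|443)'
iptables -L -n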
I did various tests. I tried to spin up the same container on a second Proxmox host which has more spare resources - so it's not a resource issue on the Proxmox side; same outcome.
I tried to restore it as an 'unprivileged' container - no change/improvement. I also tried to enable 'nesting' as a workaround, after finding this thread which seems to have similar symptoms/issues (rough commands are below the link):
https://forum.proxmox.com/threads/privileged-lxc-container-cant-get-ip-apparmor.58912/
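Concretely, those two attempts were made roughly like this (the VMIDs and backup filename are placeholders):

Code:
# restore as an unprivileged container instead of privileged
pct restore 137 /var/lib/vz/dump/vzdump-lxc-NNN.tar.gz --storage local-lvm --unprivileged 1

# separately, enable nesting on the existing container and restart it
pct set 136 --features nesting=1
pct start 136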
I tried a copy of it on a recently rebooted Proxmox 6.x (latest) box. Exactly the same outcome/behaviour.
So far the net result is that I can't get this container to start up normally.
I can create a new LXC container from scratch if I use a stock CentOS LXC template, and this starts up just fine / no weird behaviour.
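For comparison, the from-scratch container that behaves fine was created along these lines (the template filename and network values below are placeholders):

Code:
# grab a stock CentOS template and build a fresh container from it
pveam update
pveam available | grep -i centos
pct create 138 local:vztmpl/centos-6-default_amd64.tar.xz \
    --hostname testct.example.com \
    --net0 name=eth0,bridge=vmbr1,ip=192.168.1.60/24,gw=192.168.1.1 \
    --storage local-lvm
pct start 138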
So I am kind of baffled as to what is different / why it is so unhappy.
Dmesg logs on the Proxmox node tend to show this kind of stuff:
Code:
[14315910.210979] EXT4-fs (loop30): mounted filesystem with ordered data mode. Opts: (null)
[14315910.231488] IPv6: ADDRCONF(NETDEV_UP): veth135i0: link is not ready
[14315910.594864] vmbr1: port 32(veth135i0) entered blocking state
[14315910.594866] vmbr1: port 32(veth135i0) entered disabled state
[14315910.594934] device veth135i0 entered promiscuous mode
[14315910.648355] eth0: renamed from vethVAXL1M
[14316038.426798] EXT4-fs (loop31): mounted filesystem with ordered data mode. Opts: (null)
[14316193.080213] EXT4-fs (loop31): mounted filesystem with ordered data mode. Opts: (null)
[14316448.851152] EXT4-fs (loop31): mounted filesystem with ordered data mode. Opts: (null)
[14316449.201362] audit: type=1400 audit(1588187659.550:541): apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-136_</var/lib/lxc>" pid=2747 comm="apparmor_parser"
[14316449.203285] IPv6: ADDRCONF(NETDEV_UP): veth136i0: link is not ready
[14316449.631160] vmbr1: port 33(veth136i0) entered blocking state
[14316449.631162] vmbr1: port 33(veth136i0) entered disabled state
[14316449.631237] device veth136i0 entered promiscuous mode
[14316450.060041] eth0: renamed from vethCIUV9Q
[14316480.066097] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[14316480.066104] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[14316480.066131] vmbr1: port 33(veth136i0) entered blocking state
[14316480.066132] vmbr1: port 33(veth136i0) entered forwarding state
[14316595.560979] audit: type=1400 audit(1588187805.909:542): apparmor="STATUS" operation="profile_remove" profile="/usr/bin/lxc-start" name="lxc-136_</var/lib/lxc>" pid=8489 comm="apparmor_parser"
[14316595.971627] vmbr1: port 33(veth136i0) entered disabled state
[14316595.975248] device veth136i0 left promiscuous mode
[14316595.975251] vmbr1: port 33(veth136i0) entered disabled state
[14316610.585643] EXT4-fs (loop31): mounted filesystem with ordered data mode. Opts: (null)
[14316610.884154] audit: type=1400 audit(1588187821.229:543): apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-136_</var/lib/lxc>" pid=9520 comm="apparmor_parser"
[14316610.937981] IPv6: ADDRCONF(NETDEV_UP): veth136i0: link is not ready
[14316611.352387] vmbr1: port 33(veth136i0) entered blocking state
[14316611.352389] vmbr1: port 33(veth136i0) entered disabled state
[14316611.352457] device veth136i0 entered promiscuous mode
[14316611.439309] eth0: renamed from vethOSDKX5
[14316648.181608] audit: type=1400 audit(1588187858.529:544): apparmor="STATUS" operation="profile_remove" profile="/usr/bin/lxc-start" name="lxc-136_</var/lib/lxc>" pid=10975 comm="apparmor_parser"
[14316648.681869] vmbr1: port 33(veth136i0) entered disabled state
[14316648.685830] device veth136i0 left promiscuous mode
[14316648.685833] vmbr1: port 33(veth136i0) entered disabled state
Inside the container, the messages are messy / more things are visible than I really want to see (i.e. messages from other LXC containers - not relevant, but confounding; visible due to the shared kernel IIRC). No clear smoking gun.
I'm curious if this sounds vaguely familiar to anyone, in any way, and if there are any hints/suggestions on ways I can proceed to try to debug this further.
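The next thing I'm planning to try is starting the container in the foreground with LXC debug logging, something like this (136 being the VMID from the dmesg snippet above):

Code:
# on the Proxmox node: start the container in the foreground with full debug logging
lxc-start -n 136 -F --logfile /tmp/lxc-136.log --logpriority DEBUG

# then dig through the log for where network setup / init goes wrong
less /tmp/lxc-136.log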
Ultimately I would like to be able to make a copy and spin up the copied container without this kind of drama.
Thanks,
Tim