Hello
I have an issue where, about an hour after a server reboot, I can no longer start any CT.
Creating a CT works fine and no errors are printed.
Starting it, however, gives the following output:
Code:
failed to connect to monitor socket: Connection refused
Output of systemctl status pve-container@2430.service:
Code:
Apr 28 13:05:56 node1 systemd[1]: Started PVE LXC Container: 2430.
Apr 28 13:05:59 node1 systemd[1]: pve-container@2430.service: Main process exited, code=exited, status=1/FAILURE
Apr 28 13:05:59 node1 systemd[1]: pve-container@2430.service: Failed with result 'exit-code'.
Code:
root@node1:~# lxc-start 2445 --logfile /test.log
lxc-start: 2445: ../src/lxc/lxccontainer.c: wait_on_daemonized_start: 878 Received container state "ABORTING" instead of "RUNNING"
lxc-start: 2445: ../src/lxc/tools/lxc_start.c: main: 306 The container failed to start
lxc-start: 2445: ../src/lxc/tools/lxc_start.c: main: 309 To get more details, run the container in foreground mode
lxc-start: 2445: ../src/lxc/tools/lxc_start.c: main: 311 Additional information can be obtained by setting the --logfile and --logpriority options
root@node1:~# cat /test.log
lxc-start 2445 20230428140019.584 ERROR conf - ../src/lxc/conf.c:run_buffer:322 - Script exited with status 2
lxc-start 2445 20230428140019.627 ERROR network - ../src/lxc/network.c:lxc_create_network_priv:3427 - No such device - Failed to create network device
lxc-start 2445 20230428140019.627 ERROR start - ../src/lxc/start.c:lxc_spawn:1840 - Failed to create the network
lxc-start 2445 20230428140019.627 ERROR lxccontainer - ../src/lxc/lxccontainer.c:wait_on_daemonized_start:878 - Received container state "ABORTING" instead of "RUNNING"
lxc-start 2445 20230428140019.627 ERROR lxc_start - ../src/lxc/tools/lxc_start.c:main:306 - The container failed to start
lxc-start 2445 20230428140019.627 ERROR lxc_start - ../src/lxc/tools/lxc_start.c:main:309 - To get more details, run the container in foreground mode
lxc-start 2445 20230428140019.627 ERROR lxc_start - ../src/lxc/tools/lxc_start.c:main:311 - Additional information can be obtained by setting the --logfile and --logpriority options
lxc-start 2445 20230428140019.628 ERROR start - ../src/lxc/start.c:__lxc_start:2107 - Failed to spawn container "2445"
If I stop an already running CT and try to start it again, it also fails:
Code:
root@node1:/etc/pve/lxc# pct stop 1399
root@node1:/etc/pve/lxc# pct start 1399
failed to connect to monitor socket: Connection refused
I also noticed that this issue only started happening after I passed 1k deployed CTs. I wonder if there's some limit or memory setting that needs to be increased?
The system itself is fine: load average of 10, 500GB of free memory, and nothing in dmesg.
SSH into an already existing CT also gets stuck. Nothing loads past this:
Code:
PTY allocation request failed on channel 0
Welcome to Ubuntu 20.04.6 LTS (GNU/Linux 5.15.102-1-pve x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
Accessing it via the Proxmox console works fine, and everything is very responsive in there.
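Since the failures only appear past roughly 1k CTs, one way to test the "some limit" theory is to compare a few kernel counters against their ceilings on the host. The following is a minimal sketch, not a confirmed diagnosis: the choice of limits is an assumption (PTY count because of the "PTY allocation request failed" message, inotify and kernel keyring limits because they are commonly raised on hosts with many containers), and the helper name read_val is made up for illustration. The paths are standard Linux procfs/sysfs entries.

```shell
#!/bin/sh
# Hypothetical helper: print the first line of a file, or "n/a" if it
# does not exist or is not readable on this host.
read_val() {
    if [ -r "$1" ]; then
        head -n 1 "$1"
    else
        echo "n/a"
    fi
}

# PTYs currently allocated vs. the system-wide maximum
# (assumption: relevant to "PTY allocation request failed").
echo "pty in use:                 $(read_val /proc/sys/kernel/pty/nr)"
echo "pty max:                    $(read_val /proc/sys/kernel/pty/max)"

# inotify and keyring limits (assumption: candidates when running many CTs).
echo "inotify max_user_instances: $(read_val /proc/sys/fs/inotify/max_user_instances)"
echo "inotify max_user_watches:   $(read_val /proc/sys/fs/inotify/max_user_watches)"
echo "keys maxkeys:               $(read_val /proc/sys/kernel/keys/maxkeys)"

# Number of network interfaces on the host, since lxc-start fails
# while creating the container's network device.
echo "network interfaces:         $(ls /sys/class/net 2>/dev/null | wc -l)"
```

If "pty in use" sits at or near "pty max" once the problem starts, that would point at PTY exhaustion; if none of these values is close to its ceiling, the cause is likely elsewhere.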