Troubles upgrading CTs to Debian 11 resulting in lost connection

Status
Not open for further replies.

Corwin

Hello,
I'm running PVE 7.0 and trying to upgrade several CTs from Debian 10 to 11, but the connection drops during the upgrade and the CT then stops.
After restarting it, I cannot connect via SSH or even through the console (it just shows "connected", but the screen stays completely dark and accepts no keyboard input), so I'm completely blocked here!
These CTs were originally created on PVE 6. The Nesting feature is enabled.

The issue seems to be related to the configuration step when systemd 247 replaces 241. Here is the output:
Code:
Setting up systemd (247.3-6) ...
Installing new version of config file /etc/systemd/journald.conf ...
Installing new version of config file /etc/systemd/logind.conf ...
Installing new version of config file /etc/systemd/networkd.conf ...
Installing new version of config file /etc/systemd/resolved.conf ...
Installing new version of config file /etc/systemd/system.conf ...
Installing new version of config file /etc/systemd/user.conf ...
Created symlink /etc/systemd/system/sysinit.target.wants/systemd-pstore.service → /lib/systemd/system/systemd-pstore.service.

After a while I got the following, and then the CT hung:
Failed to stop systemd-networkd.socket: connection reset by peer

That's the only information I have, as I cannot log in at all now.

Here is the host syslog:
Code:
Sep 06 15:42:00 gerard kernel: systemd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Sep 06 15:42:00 gerard kernel: CPU: 4 PID: 184987 Comm: systemd Tainted: P           O      5.11.22-4-pve #1
Sep 06 15:42:00 gerard kernel: Hardware name: Intel Corporation S1200RP/S1200RP, BIOS S1200RP.86B.03.04.0006.030520181328 03/05/2018
Sep 06 15:42:00 gerard kernel: Call Trace:
Sep 06 15:42:00 gerard kernel:  dump_stack+0x70/0x8b
Sep 06 15:42:00 gerard kernel:  dump_header+0x4f/0x1f6
Sep 06 15:42:00 gerard kernel:  oom_kill_process.cold+0xb/0x10
Sep 06 15:42:00 gerard kernel:  out_of_memory+0x1cf/0x520
Sep 06 15:42:00 gerard kernel:  mem_cgroup_out_of_memory+0x139/0x150
Sep 06 15:42:00 gerard kernel:  try_charge+0x750/0x7b0
Sep 06 15:42:00 gerard kernel:  mem_cgroup_charge+0x8a/0x280
Sep 06 15:42:00 gerard kernel:  __add_to_page_cache_locked+0x34b/0x3a0
Sep 06 15:42:00 gerard kernel:  ? scan_shadow_nodes+0x30/0x30
Sep 06 15:42:00 gerard kernel:  add_to_page_cache_lru+0x4d/0xd0
Sep 06 15:42:00 gerard kernel:  pagecache_get_page+0x161/0x3b0
Sep 06 15:42:00 gerard kernel:  filemap_fault+0x6ce/0xa10
Sep 06 15:42:00 gerard kernel:  ? xas_load+0x9/0x80
Sep 06 15:42:00 gerard kernel:  ? xas_find+0x17a/0x1d0
Sep 06 15:42:00 gerard kernel:  __do_fault+0x3c/0xe0
Sep 06 15:42:00 gerard kernel:  handle_mm_fault+0x12c9/0x1a70
Sep 06 15:42:00 gerard kernel:  do_user_addr_fault+0x1a0/0x450
Sep 06 15:42:00 gerard kernel:  ? exit_to_user_mode_prepare+0x75/0x190
Sep 06 15:42:00 gerard kernel:  exc_page_fault+0x69/0x150
Sep 06 15:42:00 gerard kernel:  ? asm_exc_page_fault+0x8/0x30
Sep 06 15:42:00 gerard kernel:  asm_exc_page_fault+0x1e/0x30
Sep 06 15:42:00 gerard kernel: RIP: 0033:0x7fcdd392a950
Sep 06 15:42:00 gerard kernel: Code: Unable to access opcode bytes at RIP 0x7fcdd392a926.
Sep 06 15:42:00 gerard kernel: RSP: 002b:00007ffdd4aeee08 EFLAGS: 00010202
Sep 06 15:42:00 gerard kernel: RAX: 0000000000000000 RBX: 000055f7a5ca5176 RCX: 0000000000000000
Sep 06 15:42:00 gerard kernel: RDX: 000000000000000d RSI: 0000000000000073 RDI: 00007fcdd3ce1a48
Sep 06 15:42:00 gerard kernel: RBP: 0000000000000000 R08: 00007ffdd4aeee80 R09: 00007fcdd398dbe0
Sep 06 15:42:00 gerard kernel: R10: 0000000000000070 R11: 0000000000000020 R12: 0000000000000000
Sep 06 15:42:00 gerard kernel: R13: 00007ffdd4aeee80 R14: 0000000000000073 R15: 0000000000000000
Sep 06 15:42:00 gerard kernel: memory: usage 512000kB, limit 512000kB, failcnt 212308
Sep 06 15:42:00 gerard kernel: swap: usage 0kB, limit 512000kB, failcnt 0
Sep 06 15:42:00 gerard kernel: Memory cgroup stats for /lxc/100:
Sep 06 15:42:00 gerard kernel: anon 507285504
file 3649536
kernel_stack 1277952
pagetables 3379200
percpu 2582784
sock 155648
shmem 2838528
file_mapped 3514368
file_dirty 0
file_writeback 0
anon_thp 0
file_thp 0
shmem_thp 0
inactive_anon 510529536
active_anon 135168
inactive_file 303104
active_file 479232
unevictable 0
slab_reclaimable 1354184
slab_unreclaimable 4114616
slab 5468800
workingset_refault_anon 0
workingset_refault_file 223344
workingset_activate_anon 0
workingset_activate_file 123552
workingset_restore_anon 0
workingset_restore_file 118338
workingset_nodereclaim 165
pgfault 1907174
pgmajfault 247512
pgrefill 9885202
pgscan 11712293
pgsteal 247531
pgactivate 9642967
pgdeactivate 9761326
pglazyfree 0
pglazyfreed 0
thp_fault_alloc 0
thp_collapse_alloc 0
Sep 06 15:42:00 gerard kernel: Tasks state (memory values in pages):
Sep 06 15:42:00 gerard kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Sep 06 15:42:00 gerard kernel: [ 184987] 100000 184987    93311    89988   782336        0             0 systemd
Sep 06 15:42:00 gerard kernel: [ 185237] 100000 185237     2528      790    53248        0             0 login
Sep 06 15:42:00 gerard kernel: [ 187450] 100000 187450     1984     1002    49152        0             0 bash
Sep 06 15:42:00 gerard kernel: [ 189907] 100000 189907    16924     5358   180224        0             0 apt
Sep 06 15:42:00 gerard kernel: [ 199502] 100000 199502     2793      879    57344        0             0 dpkg
Sep 06 15:42:00 gerard kernel: [ 201500] 100000 201500      604       49    40960        0             0 systemd.postins
Sep 06 15:42:00 gerard kernel: [ 201543] 100000 201543     2503      130    57344        0             0 systemctl
Sep 06 15:42:00 gerard kernel: [ 201544] 100000 201544     3395      469    61440        0             0 systemd-tty-ask
Sep 06 15:42:00 gerard kernel: [ 185239] 100000 185239     1348      380    49152        0             0 agetty
Sep 06 15:42:00 gerard kernel: [ 185173] 100000 185173    31981     1311   299008        0             0 systemd-journal
Sep 06 15:42:00 gerard kernel: [ 185228] 100000 185228    39048      731    73728        0             0 rsyslogd
Sep 06 15:42:00 gerard kernel: [ 185301] 100108 185301   444792    23645   487424        0             0 mysqld
Sep 06 15:42:00 gerard kernel: [ 185233] 100107 185233     2300      813    53248        0             0 dbus-daemon
Sep 06 15:42:00 gerard kernel: [ 185304] 100000 185304    57419     4795   188416        0             0 apache2
Sep 06 15:42:00 gerard kernel: [ 185314] 100033 185314    57498     2847   180224        0             0 apache2
Sep 06 15:42:00 gerard kernel: [ 185315] 100033 185315    57498     2847   180224        0             0 apache2
Sep 06 15:42:00 gerard kernel: [ 185316] 100033 185316    57498     2847   180224        0             0 apache2
Sep 06 15:42:00 gerard kernel: [ 185317] 100033 185317    57498     2847   180224        0             0 apache2
Sep 06 15:42:00 gerard kernel: [ 185318] 100033 185318    57498     2847   180224        0             0 apache2
Sep 06 15:42:00 gerard kernel: [ 185235] 100000 185235     4828     1213    77824        0             0 systemd-logind
Sep 06 15:42:00 gerard kernel: [ 185241] 100000 185241   321069     4788   249856        0             0 fail2ban-server
Sep 06 15:42:00 gerard kernel: [ 185238] 100000 185238     1348      395    49152        0             0 agetty
Sep 06 15:42:00 gerard kernel: [ 185643] 100000 185643     4963      490    77824        0             0 exim4
Sep 06 15:42:00 gerard kernel: [ 195333] 100000 195333     4909      280    77824        0             0 exim4
Sep 06 15:42:00 gerard kernel: [ 195355] 100000 195355     1644      137    49152        0             0 cron
Sep 06 15:42:00 gerard kernel: [ 201481] 100000 201481     3743      678    69632        0             0 sshd
Sep 06 15:42:00 gerard kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=ns,mems_allowed=0,oom_memcg=/lxc/100,task_memcg=/lxc/100/ns/init.scope,task=systemd,pid=184987,uid=100000
Sep 06 15:42:00 gerard kernel: Memory cgroup out of memory: Killed process 184987 (systemd) total-vm:373244kB, anon-rss:358552kB, file-rss:1400kB, shmem-rss:0kB, UID:100000 pgtables:764kB oom_score_adj:0
Sep 06 15:42:00 gerard kernel: oom_reaper: reaped process 184987 (systemd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Systemd has been killed :(
When I browsed the upgrade documentation and the forum, I saw cgroup issues reported for systemd < 231, but I don't believe that applies here.

Any advice?
 
hi,

it looks like you're running out of memory?
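
One way to check this from the host syslog: the "Tasks state" table in the OOM report lists every process in the container, with rss counted in pages. Below is a minimal Python sketch that ranks a few of those rows by resident memory. It assumes 4 KiB pages (the thread does not state the page size), and `OOM_TABLE`, `ROW` and `rss_by_task` are names made up for this example; the rows are copied from the log above with the syslog prefix removed.

```python
import re

# Assumption: 4 KiB pages, the usual size on x86-64 hosts.
PAGE_KIB = 4

# A few rows copied from the "Tasks state" table in the host syslog,
# with the "Sep 06 15:42:00 gerard kernel:" prefix removed.
OOM_TABLE = """\
[ 184987] 100000 184987    93311    89988   782336        0             0 systemd
[ 189907] 100000 189907    16924     5358   180224        0             0 apt
[ 185301] 100108 185301   444792    23645   487424        0             0 mysqld
[ 185304] 100000 185304    57419     4795   188416        0             0 apache2
[ 185241] 100000 185241   321069     4788   249856        0             0 fail2ban-server
"""

# Columns: [pid] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<pgtables>\d+)\s+(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+"
    r"(?P<name>\S+)"
)


def rss_by_task(text):
    """Return (name, pid, rss_kib) tuples, largest resident set first."""
    rows = []
    for m in ROW.finditer(text):
        rows.append((m["name"], int(m["pid"]), int(m["rss"]) * PAGE_KIB))
    return sorted(rows, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    for name, pid, kib in rss_by_task(OOM_TABLE):
        print(f"{name:<16} pid={pid:<7} rss={kib} kB")
```

With these rows, systemd (PID 184987, the container's init) comes out on top at 359952 kB, which matches the anon-rss 358552 kB plus file-rss 1400 kB in the "Killed process" line. That is roughly 70% of the 512000 kB limit held by init alone, with mysqld next at 94580 kB.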
 
Hmm, this is strange, as the CT has only used about 130 MB for many months. During the upgrade it eats all the available memory (500 MB was set). I have never seen this behavior before.
But yes, I just increased the memory to 2 GB plus 2 GB of swap, and there is a slight difference: it no longer hangs and the upgrade seems to continue, but it is very slow, taking minute after minute. All the memory is in use, and I cannot shut down or stop the CT!
I believe something is wrong.
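
To watch how close the container gets to its limit while the upgrade runs, the cgroup counters can be read from the host. The following is a rough Python sketch, not a tested procedure for this setup: it assumes the host uses cgroup v2 (where `memory.current` and `memory.max` exist) and that the container's cgroup is mounted at `/sys/fs/cgroup/lxc/100`, a path guessed from the `/lxc/100` cgroup named in the OOM report. `cgroup_memory` and `describe` are hypothetical helper names.

```python
from pathlib import Path


def cgroup_memory(cgroup_dir):
    """Read current usage and limit (in bytes) from a cgroup v2 directory.

    The limit is None when memory.max contains "max" (no limit set).
    """
    base = Path(cgroup_dir)
    current = int((base / "memory.current").read_text().strip())
    raw_max = (base / "memory.max").read_text().strip()
    limit = None if raw_max == "max" else int(raw_max)
    return current, limit


def describe(current, limit):
    """Format usage in MiB, with a percentage when a limit is set."""
    mib = current / 2**20
    if limit is None:
        return f"{mib:.0f} MiB used, no limit"
    pct = 100 * current / limit
    return f"{mib:.0f} MiB used of {limit / 2**20:.0f} MiB ({pct:.0f}%)"


if __name__ == "__main__":
    # Hypothetical path: assumes a cgroup v2 host and the /lxc/100 cgroup
    # named in the OOM report; adjust it to the layout on your host.
    path = "/sys/fs/cgroup/lxc/100"
    try:
        print(describe(*cgroup_memory(path)))
    except OSError as exc:
        print(f"cannot read {path}: {exc}")
```

Running it in a loop on the host during the upgrade would show whether usage climbs steadily toward the limit or jumps at the systemd configuration step.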
 
Here are the latest details. The upgrade seems to have completed, but as I could not shut down the CT I preferred to reboot the host.
Now, when I start the CT its status is 'running', but I cannot log in using the console. CPU usage is at 40% with only 15 MB of memory used, which is not good at all.
The host syslog only says this:
Code:
Sep 07 08:59:29 gerard audit[383391]: AVC apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-100_</var/lib/lxc>" pid=383391 comm="apparmor_parser"
Sep 07 08:59:29 gerard kernel: audit: type=1400 audit(1630997969.032:20): apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-100_</var/lib/lxc>" pid=383391 comm="apparmor_parser"
Sep 07 08:59:29 gerard systemd-udevd[383527]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 07 08:59:29 gerard systemd-udevd[383527]: Using default interface naming scheme 'v247'.
Sep 07 08:59:29 gerard kernel: vmbr1: port 1(veth100i0) entered blocking state
Sep 07 08:59:29 gerard kernel: vmbr1: port 1(veth100i0) entered disabled state
Sep 07 08:59:29 gerard kernel: device veth100i0 entered promiscuous mode
Sep 07 08:59:29 gerard systemd-udevd[383526]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 07 08:59:29 gerard systemd-udevd[383526]: Using default interface naming scheme 'v247'.
Sep 07 08:59:30 gerard systemd-udevd[383526]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 07 08:59:30 gerard systemd-udevd[383526]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 07 08:59:30 gerard systemd-udevd[383527]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.

What can I do? I'm a bit lost...
 