ssh broken after new drives added (internal SSD and external USB drives)

bruno.dogancic
Dec 3, 2022
Hey all!
I had a perfectly working PVE 7 setup until recently. Today I went to SSH into my LXC container (managed by Portainer) to edit a config file for Frigate NVR, only to find that I cannot SSH into it.

I have tried FileZilla and PuTTY and get "ECONNREFUSED - Connection refused by server". From the LXC console inside Proxmox I cannot get past the username prompt (root):

[screenshot: LXC console login prompt]

What I have checked so far is /var/log/auth.log from inside the LXC (I used lxc-attach --name 103 to get a shell):

Bash:
Nov 21 02:25:07 docker sudo:     root : TTY=pts/1 ; PWD=/ ; USER=root ; COMMAND=/usr/bin/intel_gpu_top
Nov 21 02:25:07 docker sudo: pam_unix(sudo:session): session opened for user root by (uid=0)
Nov 21 02:25:07 docker sudo: pam_unix(sudo:session): session closed for user root
Nov 21 02:28:39 docker sshd[35005]: Accepted password for root from 192.168.1.122 port 60417 ssh2
Nov 21 02:28:39 docker sshd[35005]: pam_unix(sshd:session): session opened for user root by (uid=0)
Nov 21 02:29:39 docker sshd[35005]: pam_unix(sshd:session): session closed for user root
Nov 21 02:35:04 docker sshd[35241]: Accepted password for root from 192.168.1.122 port 60589 ssh2
Nov 21 02:35:04 docker sshd[35241]: pam_unix(sshd:session): session opened for user root by (uid=0)
Nov 21 02:36:47 docker sshd[35241]: pam_unix(sshd:session): session closed for user root
Nov 21 02:36:58 docker sshd[36189]: Accepted password for root from 192.168.1.122 port 60648 ssh2
Nov 21 02:36:58 docker sshd[36189]: pam_unix(sshd:session): session opened for user root by (uid=0)
Nov 21 02:38:43 docker sshd[36189]: pam_unix(sshd:session): session closed for user root
Nov 21 04:22:22 docker sshd[9003]: pam_unix(sshd:session): session closed for user root
Dec  3 05:32:35 docker login[245]: PAM unable to dlopen(pam_unix.so): /lib/security/pam_unix.so: cannot open shared object file: No such file or directory
Dec  3 05:32:35 docker login[245]: PAM adding faulty module: pam_unix.so
Dec  3 05:32:39 docker login[245]: FAILED LOGIN (1) on '/dev/tty1' FOR 'root', Authentication failure
Dec  3 05:34:56 docker login[109300]: PAM unable to dlopen(pam_unix.so): /lib/security/pam_unix.so: cannot open shared object file: No such file or directory
Dec  3 05:34:56 docker login[109300]: PAM adding faulty module: pam_unix.so
Dec  3 05:34:58 docker login[109300]: FAILED LOGIN (1) on '/dev/tty1' FOR 'root', Authentication failure
Dec  3 05:38:10 docker login[109360]: PAM unable to dlopen(pam_unix.so): /lib/security/pam_unix.so: cannot open shared object file: No such file or directory
Dec  3 05:38:10 docker login[109360]: PAM adding faulty module: pam_unix.so
Dec  3 05:38:12 docker login[109360]: FAILED LOGIN (1) on '/dev/tty1' FOR 'root', Authentication failure
Dec  3 05:38:19 docker login[109360]: FAILED LOGIN (2) on '/dev/tty1' FOR 'root', Authentication failure
Dec  3 05:45:29 docker login[247]: PAM unable to dlopen(pam_unix.so): /lib/security/pam_unix.so: cannot open shared object file: No such file or directory
Dec  3 05:45:29 docker login[247]: PAM adding faulty module: pam_unix.so
Dec  3 05:45:33 docker login[247]: FAILED LOGIN (1) on '/dev/tty1' FOR 'root', Authentication failure
Dec  3 05:59:37 docker login[1402]: PAM unable to dlopen(pam_unix.so): /lib/security/pam_unix.so: cannot open shared object file: No such file or directory
Dec  3 05:59:37 docker login[1402]: PAM adding faulty module: pam_unix.so
Dec  3 05:59:40 docker login[1402]: FAILED LOGIN (1) on '/dev/tty1' FOR 'root', Authentication failure
Dec  3 06:06:04 docker login[1807]: PAM unable to dlopen(pam_unix.so): /lib/security/pam_unix.so: cannot open shared object file: No such file or directory
Dec  3 06:06:04 docker login[1807]: PAM adding faulty module: pam_unix.so
Dec  3 06:06:07 docker login[1807]: FAILED LOGIN (1) on '/dev/tty1' FOR 'root', Authentication failure
Dec  3 06:12:01 docker passwd[2103]: PAM unable to dlopen(pam_unix.so): /lib/security/pam_unix.so: cannot open shared object file: No such file or directory
Dec  3 06:12:01 docker passwd[2103]: PAM adding faulty module: pam_unix.so
Dec  3 06:36:40 docker login[1981]: PAM unable to dlopen(pam_unix.so): /lib/security/pam_unix.so: cannot open shared object file: No such file or directory
Dec  3 06:36:40 docker login[1981]: PAM adding faulty module: pam_unix.so
Dec  3 06:36:43 docker login[1981]: FAILED LOGIN (1) on '/dev/tty1' FOR 'root', Authentication failure
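The log shows PAM failing to dlopen pam_unix.so, so a first step would be to check whether the module file is actually gone. A hedged diagnostic sketch (the multiarch path below is an assumption for an amd64 Debian container; only read-only checks run, the reinstall is left commented out):

```shell
# Where Debian normally ships pam_unix.so (libpam-modules package);
# the log shows PAM looking in /lib/security/ instead.
PAM_DIR=/lib/x86_64-linux-gnu/security
if [ -e "$PAM_DIR/pam_unix.so" ]; then
    PAM_STATE=present
else
    PAM_STATE=missing
fi
echo "pam_unix.so is $PAM_STATE in $PAM_DIR"
# dpkg -S pam_unix.so                          # which package owns the file
# apt-get install --reinstall libpam-modules   # restore it if it is missing
```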

I'm not sure where to start fixing the problem; I have already spent a few hours searching around. I have moderate Linux knowledge, but since I am still experimenting with Proxmox, the mistake could be something vague on my end. :rolleyes:

The only thing that comes to mind is that I recently upgraded the storage on this machine: I added one internal SSD and one USB SSD.

For adding the storage I followed these guides (I basically used bind mount points):
https://www.youtube.com/watch?v=w9X94bAm3dI
https://www.youtube.com/watch?v=65woVFSmEGE
https://www.youtube.com/watch?v=ATuUBocesmA

Best regards,
Bruno
 
Modern Linux versions (like Proxmox based on Debian) use the PCI ID of the network device as part of the network device name (as used in /etc/network/interfaces). Because you added a PCIe device, all PCI(e) devices after it now have a higher PCI ID. It is quite common that this breaks the network configuration.
I'm not sure that this is your problem, as you appear to have issues with Docker and not with SSH/GUI access to Proxmox itself. Maybe your Docker VM has a similar issue?
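A quick check sketch for the renamed-NIC theory (assumptions: a Debian-style host using ifupdown/ifupdown2 as Proxmox does; the interface names are examples): compare the names the kernel currently assigns with the names referenced in /etc/network/interfaces.

```shell
# Interface names the kernel sees right now (always includes at least lo);
# `ip -brief link` shows the same list with more detail.
KERNEL_NICS=$(ls /sys/class/net 2>/dev/null || true)
echo "kernel sees: $KERNEL_NICS"
# Names the network config still refers to:
grep -E '^(auto|iface)' /etc/network/interfaces 2>/dev/null \
    || echo "no /etc/network/interfaces on this machine"
# If e.g. enp3s0 became enp4s0 after adding a PCIe device, edit the
# file to match, then:  ifreload -a   (or reboot)
```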
 
Modern Linux versions (like Proxmox based on Debian) use the PCI ID of the network device as part of the network device name (as used in /etc/network/interfaces). Because you added a PCIe device, all PCI(e) devices after it now have a higher PCI ID. It is quite common that this breaks the network configuration.
I'm not sure that this is your problem, as you appear to have issues with Docker and not with SSH/GUI access to Proxmox itself. Maybe your Docker VM has a similar issue?
Thank you for your response.

Yes, as stated before, I can access my PVE instance normally and reach the LXC through the host console, but the SSH connection cannot be established.

Below is a screenshot of my instance. Docker is set up inside a Debian LXC using this, under Docker / Docker LXC.

[screenshot: Proxmox instance overview]

I have also tried to upgrade Docker using apt update && apt upgrade -y, as suggested by the installation script (available at the provided link). While doing so I also get unresolved-dependency and broken-install errors...


Bash:
root@docker ~# apt update && apt upgrade -y
Get:1 http://security.debian.org buster/updates InRelease [34.8 kB]
Get:2 https://download.docker.com/linux/debian buster InRelease [54.0 kB]                                                                                               
Get:3 http://deb.debian.org/debian testing InRelease [164 kB]                                                                                                           
Hit:4 http://deb.debian.org/debian buster InRelease                                                                                                                     
Get:5 https://download.docker.com/linux/debian buster/stable amd64 Packages [30.5 kB]                 
Get:6 https://packages.cloud.google.com/apt coral-edgetpu-stable InRelease [6,722 B]
Ign:7 http://archive.turnkeylinux.org/debian buster-security InRelease
Get:8 http://security.debian.org buster/updates/main amd64 Packages [412 kB]
Get:9 http://security.debian.org buster/updates/main Translation-en [222 kB]
Get:10 http://deb.debian.org/debian testing/main amd64 Packages.diff/Index [63.6 kB]
Ign:11 http://archive.turnkeylinux.org/debian buster InRelease                                 
Get:12 http://deb.debian.org/debian testing/main Translation-en.diff/Index [63.6 kB]
Get:13 http://deb.debian.org/debian testing/non-free amd64 Packages.diff/Index [63.3 kB]     
Get:14 http://deb.debian.org/debian testing/non-free Translation-en.diff/Index [63.3 kB]
Hit:15 http://archive.turnkeylinux.org/debian buster-security Release           
Get:17 http://deb.debian.org/debian testing/main amd64 Packages T-2022-12-11-1403.31-F-2022-12-04-0205.35.pdiff [694 kB]
Get:18 http://deb.debian.org/debian testing/main Translation-en T-2022-12-11-1403.31-F-2022-12-04-0205.35.pdiff [65.3 kB]
Get:19 http://deb.debian.org/debian testing/non-free amd64 Packages T-2022-12-11-0214.36-F-2022-12-04-0205.35.pdiff [17.9 kB]
Get:20 http://deb.debian.org/debian testing/non-free Translation-en T-2022-12-11-0214.36-F-2022-12-07-0208.41.pdiff [2,680 B]
Hit:21 http://archive.turnkeylinux.org/debian buster Release                                                         
Fetched 1,957 kB in 4s (510 kB/s)                                                                                     
Reading package lists... Done
Building dependency tree       
Reading state information... Done
440 packages can be upgraded. Run 'apt list --upgradable' to see them.
Reading package lists... Done
Building dependency tree       
Reading state information... Done
You might want to run 'apt --fix-broken install' to correct these.
The following packages have unmet dependencies:
 libc-bin : Depends: libc6 (< 2.29) but 2.36-5 is installed
 locales : Depends: libc-bin (> 2.36) but 2.28-10+deb10u1 is installed
           Depends: libc-l10n (> 2.36) but 2.28-10+deb10u1 is installed
 openssh-server : Depends: openssh-client (= 1:9.1p1-1) but 1:7.9p1-10+deb10u2 is installed
                  Depends: runit-helper (>= 2.14.0~) but it is not installed
                  Depends: libcrypt1 (>= 1:4.1.0) but it is not installed
                  Depends: libselinux1 (>= 3.1~) but 2.8-1+b1 is installed
                  Depends: libssl3 (>= 3.0.7) but it is not installed
E: Unmet dependencies. Try 'apt --fix-broken install' with no packages (or specify a solution).
root@docker ~# apt --fix-broken install
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Correcting dependencies... Done
The following additional packages will be installed:
  libc-bin libc-l10n libcbor0.8 libcrypt1 libfido2-1 libpam0g libselinux1 libssl3 openssh-client openssh-sftp-server runit-helper
Suggested packages:
  libpam-doc keychain libpam-ssh monkeysphere ssh-askpass
Recommended packages:
  manpages xauth
The following NEW packages will be installed:
  libcbor0.8 libcrypt1 libfido2-1 libssl3 runit-helper
The following packages will be upgraded:
  libc-bin libc-l10n libpam0g libselinux1 openssh-client openssh-sftp-server
6 upgraded, 5 newly installed, 0 to remove and 434 not upgraded.
3 not fully installed or removed.
Need to get 1,276 kB/4,724 kB of archives.
After this operation, 7,556 kB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 http://deb.debian.org/debian testing/main amd64 libc-bin amd64 2.36-6 [604 kB]
Get:2 http://deb.debian.org/debian testing/main amd64 libc-l10n all 2.36-6 [672 kB]
Fetched 1,276 kB in 0s (4,143 kB/s)
perl: error while loading shared libraries: libcrypt.so.1: cannot open shared object file: No such file or directory
/usr/bin/perl: error while loading shared libraries: libcrypt.so.1: cannot open shared object file: No such file or directory
Setting up libc6:amd64 (2.36-5) ...
/usr/bin/perl: error while loading shared libraries: libcrypt.so.1: cannot open shared object file: No such file or directory
dpkg: error processing package libc6:amd64 (--configure):
 installed libc6:amd64 package post-installation script subprocess returned error exit status 127
Errors were encountered while processing:
 libc6:amd64
perl: error while loading shared libraries: libcrypt.so.1: cannot open shared object file: No such file or directory
E: Sub-process /usr/bin/dpkg returned an error code (1)

It seems the bind mount point has made some of the files required for normal operation unavailable.
 
To me, this looks quite messy.

Not only are there Debian repositories from two different branches (10/Buster and Testing) configured, but also TurnKey ones (besides the obvious Docker ones), and the Google Coral repository on top of that.
This does not have to be a problem, if one knows what one is doing (not saying that you do not) and specifically takes care of the Apt priorities and of what gets installed/updated from which repository.
So, why are the Debian Testing repositories configured? And what are the TurnKey repositories for in this case?
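To answer that, a sketch that enumerates every configured Apt source (the paths are the standard Debian locations), so you can see where the mixed Buster/Testing/TurnKey/Docker/Coral entries come from:

```shell
# Collect every active "deb"/"deb-src" line from the standard locations;
# `|| true` keeps this harmless on machines without those files.
SRC_LINES=$(grep -rhE '^deb(-src)? ' /etc/apt/sources.list /etc/apt/sources.list.d/ 2>/dev/null || true)
echo "${SRC_LINES:-no apt sources found on this machine}"
# apt-cache policy libc6   # shows which repo/priority would win for libc6
```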

Then you installed, at least the base system, via a script. So one would first need to check and understand this script, what it does and what it uses...

It could also be a storage problem (which may have led to corrupted/missing data) and/or maybe even a mount point problem, like you said...

What does the LXC config (pct config 103) look like?

Do all the Docker containers inside still run and work properly?

Analysing the whole situation as an outsider (even with the appropriate knowledge, which I do not have) might still be really difficult and could take a lot of time.
This might also be why you have not had much response yet...

So to be honest: I would simply restore a known working backup or, maybe even better, set this up cleanly again from scratch. For the latter, you might consider doing it inside a VM instead of an LXC, since this is the recommended way of running Docker:
If you want to run application containers, for example, Docker images, it is recommended that you run them inside a Proxmox Qemu VM.
https://pve.proxmox.com/wiki/Linux_Container
But depending on your setup, this might be problematic or even impracticable (in your specific environment), in regard to storage access and passthrough.

In addition: setting up Docker is not that big of a deal. Simply roll out/install a basic Debian 11/Bullseye (netinstall: [1]) and follow the official Docker installation instructions for it: [2].
No need for a third-party script at all...

Just my 2 cents.

[1] https://www.debian.org/distrib/netinst.en.html (Small CDs or USB sticks)
[2] https://docs.docker.com/engine/install/debian (Install using the repository)
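A condensed sketch of the repository-based install from [2] (package names and keyring path as documented for Debian at the time of writing; verify against [2] before use). It is wrapped in a function, so nothing runs until you call it as root on the fresh VM:

```shell
install_docker() {
    # Prerequisites and Docker's signing key
    apt-get update
    apt-get install -y ca-certificates curl gnupg
    install -m 0755 -d /etc/apt/keyrings
    curl -fsSL https://download.docker.com/linux/debian/gpg \
        | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    # Register the Docker repository for this architecture and release
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" \
        > /etc/apt/sources.list.d/docker.list
    # Install the engine itself
    apt-get update
    apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
}
# install_docker   # uncomment/call this on the target VM, as root
```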
 
Thank you for your kind and informative response.

This is the output of the command:
Code:
pct config 103
arch: amd64
cores: 2
features: keyctl=1,nesting=1
hostname: docker
memory: 4098
mp0: Storage:103/vm-103-disk-0.raw,mp=/mnt/frigate_nvr,size=300G
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=C6:7D:60:10:03:34,ip=dhcp,type=veth
onboot: 1
ostype: debian
parent: working_frigate_igpu_not
rootfs: local-lvm:vm-103-disk-0,size=32G
swap: 512
lxc.cgroup2.devices.allow: c 226:0 rwm
lxc.cgroup2.devices.allow: c 226:128 rwm
lxc.cgroup2.devices.allow: c 29:0 rwm
lxc.cgroup2.devices.allow: c 189:* rwm
lxc.apparmor.profile: unconfined
lxc.cgroup2.devices.allow: a
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file 0, 0
lxc.mount.entry: /dev/bus/usb/002 dev/bus/usb/002 none bind,optional,create=dir 0, 0
lxc.cap.drop:
lxc.mount.auto: cgroup:rw
lxc.cgroup.devices.allow: c 226:0 rwm
lxc.cgroup.devices.allow: c 226:128 rwm
lxc.cgroup.devices.allow: c 29:0 rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file

Do all the Docker containers inside still run and work properly?
Everything else works: all the Docker containers inside this 103 LXC, as well as the other LXCs and one VM. I do not, however, SSH into the other LXCs (SSH is not set up there, as it isn't needed), so I don't know whether it works there.
For the latter, you might consider doing it inside a VM instead of a LXC, since this is the recommended way for running Docker:
https://pve.proxmox.com/wiki/Linux_Container
But depending on your setup, this might be problematic or even not practicable (in your specific environment) at all, in regard to storage access and passthrough-stuff.
This seemed more logical to me, but according to this post, LXC has an edge over a VM, and the Coral passthrough was quite easy to achieve.

So to be honest: I would simply use a known working backup or maybe even better, set this clean up again from scratch.
I will probably go this route and set everything up from scratch with the new hardware, as I don't plan on upgrading anything else anytime soon. It isn't a large cluster, and I will be more cautious when setting it up.
 
