[SOLVED] Can't log in to lxc containers w/ AD creds after joining to AD

starkruzr

Renowned Member
Sometime during a 5.4-x release, I think, I started having this problem: after joining containers to Active Directory with SSSD, the containers would no longer allow AD users to log in, either on the console or via SSH. I was hopeful that upgrading to 6.0-7 would resolve the issue, but no such luck. The strange thing is that existing containers joined before this problem developed aren't affected. I can't nail down what's causing it and was hoping someone else who runs LXC containers joined to directory services via SSSD had seen similar issues.

Here's what I get from /var/log/auth.log:
Code:
Oct  7 04:41:43 TestMe login[396]: pam_unix(login:auth): authentication failure; logname=LOGIN uid=0 euid=0 tty=/dev/tty1 ruser= rhost=  user=jtd
Oct  7 04:41:43 TestMe login[396]: pam_sss(login:auth): authentication failure; logname=LOGIN uid=0 euid=0 tty=/dev/tty1 ruser= rhost= user=jtd
Oct  7 04:41:43 TestMe login[396]: pam_sss(login:auth): received for user jtd: 4 (System error)
Oct  7 04:41:47 TestMe login[396]: FAILED LOGIN (1) on '/dev/tty1' FOR 'jtd', Authentication failure

Here's what I get from /var/log/sssd/sssd_RTECH.RTI.log after turning the debug level up in sssd.conf:
Code:
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [dp_pam_handler] (0x0100): Got request with the following data
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): command: SSS_PAM_AUTHENTICATE
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): domain: RTECH.RTI
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): user: jtd@rtech.rti
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): service: login
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): tty: /dev/tty1
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): ruser:
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): rhost:
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): authtok type: 1
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): newauthtok type: 0
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): priv: 1
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): cli_pid: 396
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [pam_print_data] (0x0100): logon name: not set
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [krb5_auth_send] (0x0100): Home directory for user [jtd@rtech.rti] not known.
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD'
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [ad_resolve_callback] (0x0100): Constructed uri 'ldap://galactica.rtech.rti'
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [ad_resolve_callback] (0x0100): Constructed GC uri 'ldap://galactica.rtech.rti'
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [parse_krb5_child_response] (0x0020): message too short.
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [krb5_auth_done] (0x0040): The krb5_child process returned an error. Please inspect the krb5_child.log file or the journal for more information
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [krb5_auth_done] (0x0040): Could not parse child response [22]: Invalid argument
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [krb5_auth_queue_done] (0x0040): krb5_auth_recv failed with: 22
(Mon Oct  7 04:41:43 2019) [sssd[be[RTECH.RTI]]] [child_sig_handler] (0x0020): child [416] failed with status [255].
(Mon Oct  7 04:53:22 2019) [sssd[be[RTECH.RTI]]] [child_sig_handler] (0x0100): child [513] finished successfully.

Not super helpful, I know.
 
Is there any solution for someone who wants to stay with unprivileged containers?

UIDs in AD are pretty high numbers, e.g.:

Code:
getent passwd user.joe@example.com
user.joe@example.com:*:1333222111:1333222111:Joe User:/home/user.joe@example.com:/bin/bash
 
This is reproducible in 8.3.3

I was able to work around the problem by configuring my server like:

Code:
root@hyv-host:~# cat /etc/subuid
root:100000:1000000000
root@hyv-host:~# cat /etc/subgid
root:100000:1000000000
root@hyv-host:~# cat /etc/pve/lxc/157.conf
....
lxc.idmap: u 0 100000 1000000000
lxc.idmap: g 0 100000 1000000000
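As a quick sanity check (my own addition, not from the original post), you can confirm from inside the container that the kernel actually applied the mapping:

```shell
# Inside the container: show the active UID/GID mappings the kernel applied.
# With the config above, each file should contain a line like:
#   0 100000 1000000000
# (container ID 0 maps to host ID 100000, range 1000000000)
cat /proc/self/uid_map
cat /proc/self/gid_map
```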

There was a bunch of trial and error involved, so I think I copied all of the relevant things I changed, with a healthy dose of errors like:

Code:
lxc 20250316003319.912 ERROR idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:245 - newuidmap failed to write mapping "newuidmap: subuid overflow detected.": newuidmap

EDIT 2025-04-19: Depending on the UID <-> RID range mapping SSSD uses for your domain, these numbers may not be high enough and can still cause the container to run out of UIDs/GIDs - you may have to bump the upper value (on all three files) as needed
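To illustrate that edit: a tiny sketch (the variable names and values are mine, taken from the `getent` output earlier in the thread) of checking whether a given AD UID fits inside the mapped range:

```shell
# Hypothetical sanity check: an AD UID only works in the container if it is
# smaller than the idmap range size (third field of the root: line in
# /etc/subuid, and the last value on the lxc.idmap lines).
AD_UID=1333222111          # example UID from `getent passwd user.joe@example.com`
RANGE_SIZE=1000000000      # range size configured above

if [ "$AD_UID" -lt "$RANGE_SIZE" ]; then
    echo "UID $AD_UID fits inside the idmap range"
else
    echo "UID $AD_UID is OUTSIDE the range - raise the size in /etc/subuid, /etc/subgid and the lxc.idmap lines"
fi
```

Note that the example UID 1333222111 is in fact larger than 1000000000, so with the values above the check reports it as outside the range, which matches the edit's warning.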
 