[SOLVED] Authentication issue (Login failed. Please try again)

Smanux

Hi,

I have a 3-node cluster running Proxmox 5.2 that has been working fine for 9 months, and for some reason I'm now unable to log into the manager. I keep getting the message "Login failed. Please try again" when using the root password and the PAM realm on any node. I'm sure the username/password is correct, because an invalid entry triggers an "authentication failure" message from pvedaemon in journalctl. When the username/password is correct, the "Login failed" message appears almost immediately (and nothing is added to the journal), but when the password is incorrect, the "Please wait" message remains for about 2 seconds before the error is displayed. I also have an account in the PVE realm, and that login fails as well.

I tried rebooting the nodes (sequentially and simultaneously), checked the system time on the nodes, and ran 'pvecm updatecerts' followed by a reboot, but the issue is still there. The file /etc/pve/user.cfg looks OK and hasn't changed. There is no firewall.
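For reference, the checks above translate to roughly the following commands (a sketch; the pvedaemon/pveproxy restart is an extra step beyond what I listed, and flags may differ between PVE versions):

Code:
# watch pvedaemon while attempting a login (this is where the "authentication failure" entries show up)
journalctl -f -u pvedaemon
# compare the clock on each node
timedatectl
# regenerate and distribute the cluster certificates, then restart the API services
pvecm updatecerts --force
systemctl restart pvedaemon pveproxy
# sanity-check the users file
cat /etc/pve/user.cfg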

What did I miss?

Here are the versions currently installed:

Code:
proxmox-ve: 5.2-2 (running kernel: 4.15.18-9-pve)
pve-manager: 5.2-11 (running version: 5.2-11/13c2da63)
pve-kernel-4.15: 5.2-12
pve-kernel-4.15.18-9-pve: 4.15.18-30
pve-kernel-4.15.18-5-pve: 4.15.18-24
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-1
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-41
libpve-guest-common-perl: 2.0-18
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-30
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-3
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-20
pve-cluster: 5.0-30
pve-container: 2.0-29
pve-docs: 5.2-10
pve-edk2-firmware: 1.20181023-1
pve-firewall: 3.0-14
pve-firmware: 2.0-6
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.14.1-1
pve-qemu-kvm: 2.12.1-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-40
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.12-pve1~bpo1

Thanks a lot for your help.
 
We had the same issue after updating the Proxmox packages a few days ago.

One of the updated packages was libpve-access-control (5.1-1). After downgrading that package to 5.0-8 we were able to log in again.

pveversion -v output looks exactly like yours (except for libpve-access-control being 5.0-8 after the downgrade).
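In case it helps others, the downgrade itself can be done roughly like this (a sketch; the cached .deb filename is an example and may differ on your system):

Code:
# install the previous version if the repository still offers it
apt-get install libpve-access-control=5.0-8
# or reinstall the package from the local apt cache (filename is an example)
dpkg -i /var/cache/apt/archives/libpve-access-control_5.0-8_all.deb
# restart the API services afterwards
systemctl restart pvedaemon pveproxy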
 
Can you post/share your user.cfg? AFAICS, the package libpve-access-control had no changes between 5.0-8 and 5.1-1 that should break anything ...
 
My user.cfg file looks like this:

Code:
user:jdoe@pve:1:0:John:Doe:jdoe@example.com:::
user:root@pam:1:0:::admin@example.com:::

acl:1:/:jdoe@pve:Administrator:
 
I tried creating a new @pve account and I'm still unable to log in on the nodes with libpve-access-control 5.1-1.
 
Do you have any special characters in your password? Could you try changing it to another one (e.g. with only ASCII characters) and trying again? That way we can see if that is the problem (although it should not be).
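For example, something like this (a sketch; the exact pveum sub-command spellings may vary between PVE versions):

Code:
# reset the root@pam password (Linux PAM realm)
passwd root
# reset the password of a PVE-realm user
pveum passwd jdoe@pve
# or create a throwaway test user in the PVE realm and set its password
pveum useradd test@pve
pveum passwd test@pve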
 
The passwords only have ASCII characters ([0-9a-zA-Z] and punctuation characters).
 
I ran this command:

Code:
curl -k -d "username=jdoe@pve&password=secret"  https://localhost:8006/api2/json/access/ticket

On a node with libpve-access-control 5.1-1 I get this:

Code:
{"data":null}

and on a node with libpve-access-control 5.0-8 I get:

Code:
{
  "data": {
    "CSRFPreventionToken": "5BFBEF58:xGPcPqdPshTtP5tiKnxyopcPewM",
    "username": "jdoe@pve",
    "ticket": "PVE:jdoe@pve:5BFBEF58::Fvrbot4ZWJJ9iO0fYRAngycK...",
    "cap": {
      "access": {
        "Group.Allocate": 1,
        "Permissions.Modify": 1,
        "User.Modify": 1
      },
      "dc": {
        "Sys.Audit": 1
      },
      "vms": {
        "VM.Console": 1,
        "VM.Backup": 1,
        "VM.Config.Memory": 1,
        "VM.Audit": 1,
        "VM.Config.CDROM": 1,
        "VM.Config.Disk": 1,
        "VM.Clone": 1,
        "VM.Snapshot.Rollback": 1,
        "VM.Monitor": 1,
        "VM.Config.Network": 1,
        "VM.Snapshot": 1,
        "Permissions.Modify": 1,
        "VM.Allocate": 1,
        "VM.Config.CPU": 1,
        "VM.Config.HWType": 1,
        "VM.PowerMgmt": 1,
        "VM.Config.Options": 1,
        "VM.Migrate": 1
      },
      "storage": {
        "Datastore.AllocateSpace": 1,
        "Datastore.Audit": 1,
        "Permissions.Modify": 1,
        "Datastore.AllocateTemplate": 1,
        "Datastore.Allocate": 1
      },
      "nodes": {
        "Sys.Console": 1,
        "Permissions.Modify": 1,
        "Sys.PowerMgmt": 1,
        "Sys.Syslog": 1,
        "Sys.Audit": 1,
        "Sys.Modify": 1
      }
    }
  }
}
 
Can you run the curl with '-v'? (with 5.1-1)
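i.e. the same request as before, just with -v added:

Code:
curl -kv -d "username=jdoe@pve&password=secret" https://localhost:8006/api2/json/access/ticket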
 
Here is the output with -v:

Code:
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8006 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: CN=node.example.com
*  start date: Nov 20 02:23:04 2018 GMT
*  expire date: Feb 18 02:23:04 2019 GMT
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify ok.
> POST /api2/json/access/ticket HTTP/1.1
> Host: localhost:8006
> User-Agent: curl/7.52.1
> Accept: */*
> Content-Length: 37
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 37 out of 37 bytes
< HTTP/1.1 500 section 'member' already exists and not marked as array!
< Cache-Control: max-age=0
< Connection: close
< Date: Mon, 26 Nov 2018 13:28:50 GMT
< Pragma: no-cache
< Server: pve-api-daemon/3.0
< Content-Length: 13
< Content-Type: application/json;charset=UTF-8
< Expires: Mon, 26 Nov 2018 13:28:50 GMT
<
* Curl_http_done: called premature == 0
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, Client hello (1):
{"data":null}
 
Can you please post your /etc/pve/corosync.conf?
 
Here is the corosync configuration:

Code:
logging {
  debug: on
  logfile: /var/log/corosync/corosync.log
  timestamp: on
  to_logfile: yes
  to_syslog: yes
}

nodelist {
  node {
    name: node1
    nodeid: 3
    quorum_votes: 1
    ring0_addr: node1
  }
  node {
    name: node2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: node2
  }
  node {
    name: node3
    nodeid: 1
    quorum_votes: 1
    ring0_addr: node3
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: universe
  config_version: 10
  interface {
    ringnumber: 0
    member {
      memberaddr: 10.1.1.1
    }
    member {
      memberaddr: 10.2.2.2
    }
    member {
      memberaddr: 10.3.3.3
    }
  }
  ip_version: ipv4
  secauth: on
  token: 30000
  transport: udpu
  version: 2
}

Is this an issue with the 'member' blocks?
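A quick way to see where those 'member' sections live (a sketch; the second path is the copy corosync actually runs from):

Code:
grep -n "member" /etc/pve/corosync.conf /etc/corosync/corosync.conf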
 
Is this an issue with the 'member' blocks?
It's rather outdated; it should be enough to set the ring0_addr on each nodelist entry, as the corosync.conf man page states (either use the plain IPs or ensure that each node can resolve all names there to the respective IPs). Then you can remove the member list completely.
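For this cluster that could look roughly like the following (an untested sketch; the IP-to-node mapping is assumed from the names, and config_version must be bumped before applying):

Code:
nodelist {
  node {
    name: node1
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.1.1.1
  }
  node {
    name: node2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.2.2.2
  }
  node {
    name: node3
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.3.3.3
  }
}

totem {
  cluster_name: universe
  config_version: 11
  ip_version: ipv4
  secauth: on
  token: 30000
  transport: udpu
  version: 2
}

The interface/member blocks are dropped entirely; with a proper nodelist and udpu they are not needed.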

Anyway, I proposed a patch to just warn on errors at this point; that makes sense in general.
 
Thank you for the hint. I'll give it a try the next time I reinstall the cluster. It's a bit odd that an element in the corosync configuration causes authentication to fail, though (well, maybe for @pve, but not for @pam at least).
 
It looks like this commit is responsible for the regression with my configuration:

https://git.proxmox.com/?p=pve-access-control.git;a=commitdiff;h=e842fec5

Reverting the change fixes the issue.

I guess Proxmox has its own parser for the Corosync configuration and it needs an update? Also, logging the errors would be a good idea (but maybe I didn't look in the right log/journal).
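For anyone who wants to verify, the change can be inspected from the package's git repository (the clone URL is my best guess; reverting locally is only for testing, not a supported setup):

Code:
git clone https://git.proxmox.com/git/pve-access-control.git
cd pve-access-control
# show the commit linked above
git show e842fec5
# revert it locally, for testing only
git revert e842fec5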

As upstream corosync moved away from the member list around 2012, I'd rather not add support for this; even if we parse it, we cannot do anything useful with it for sure. It's legacy stuff, and people should fix their config instead. But yes, erroring on this in the login path was definitely not ideal for such setups, sorry about that. The patch which ensures this issue is caught was tested and applied now, so this should be resolved in a coming update of access control.

If anything, we could add a 'transport' option to pvecm so that one can set up udpu with our tooling (which already writes out the nodelist correctly in a modern way), but first I'd like to evaluate kronosnet, the "new" transport for corosync 3.0 (not yet released as stable).
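Once the fixed package reaches the repositories, picking it up should be the usual routine (a sketch):

Code:
apt-get update
apt-cache policy libpve-access-control    # check which version is available
apt-get install libpve-access-control
# the package scripts normally restart the services, but it does not hurt to be explicit
systemctl restart pvedaemon pveproxy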
 
Thank you for the quick fix. I agree that better support for unicast would be great; I struggled a bit to get the cluster working.
 
I'm having the same problem; it currently affects 11 servers.

Code:
root@pve1:~# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-6-pve)
pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe)
pve-kernel-helper: 7.1-13
pve-kernel-5.13: 7.1-9
pve-kernel-5.11: 7.0-10
pve-kernel-5.13.19-6-pve: 5.13.19-14
pve-kernel-5.13.19-4-pve: 5.13.19-9
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-6-pve: 5.11.22-11
pve-kernel-5.11.22-5-pve: 5.11.22-10
pve-kernel-5.11.22-4-pve: 5.11.22-9
ceph-fuse: 15.2.14-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-3
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-6
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.2.0-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1
 
