Cluster problem - passwordless ssh

Fortel

First, I find Proxmox to be fabulous. Before posting, I realized I had never donated, so I just did that, and encourage everyone to do the same.

We have 3 physical servers in a data center, all running Proxmox 1.3 and set up as a cluster. One is designated "Master," and the other two are "Slaves." So it's Proxmox1 (Master), Proxmox2, and Proxmox3.

The master and one of the slaves are in production and working beautifully, with no issues. The third slave won't sync correctly. I've read a bit about the possible problems and have tried all the suggestions I could find: deleting the configuration and recreating it, clearing the SSH keys and known hosts, etc.

The problem seems to be with passwordless SSH into the one slave. When I try to SSH by IP address from either of the two "good" servers into the "bad" server, it just sits there. There's no password prompt or anything. But when I do the same from the "bad" server, I can successfully log in to the other two servers.
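A hang with no prompt at all often points at the TCP connection itself rather than at the keys, so it can help to make ssh fail fast and say why. Below is a minimal sketch, not anything Proxmox-specific: `classify_ssh` is a hypothetical helper name, and it assumes the usual OpenSSH client error wording ("timed out", "refused", "Permission denied").

```shell
# classify_ssh HOST: try a non-interactive root login and report the outcome.
# (Hypothetical helper; relies on typical OpenSSH client error messages.)
classify_ssh() {
    host=$1
    # BatchMode=yes: never prompt for a password, fail instead.
    # ConnectTimeout=5: give up after 5 seconds instead of hanging.
    if err=$(ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true 2>&1); then
        echo "ok"
        return 0
    fi
    case $err in
        *"timed out"*)         echo "timeout" ;;  # packets likely dropped on the way
        *"refused"*)           echo "refused" ;;  # host reachable, nothing accepting on port 22
        *"Permission denied"*) echo "auth" ;;     # connection fine, key login rejected
        *)                     echo "other" ;;
    esac
    return 1
}
```

Running something like `classify_ssh <ip-of-bad-server>` from the master would then print one word. A result of "timeout" would suggest looking at the network or a firewall before touching the SSH keys again.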

Here's a snippet from the Master's /var/log/auth.log following a passwordless ssh attempt into the "bad" server...

Jan 11 09:43:35 proxmoxwf sshd[9599]: Accepted password for root from xx.xx.xxx.xx port 61337 ssh2
Jan 11 09:43:35 proxmoxwf sshd[9599]: pam_env(sshd:setcred): Unable to open env file: /etc/default/locale: No such file or directory
Jan 11 09:43:35 proxmoxwf sshd[9599]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jan 11 09:43:35 proxmoxwf sshd[9601]: pam_env(sshd:setcred): Unable to open env file: /etc/default/locale: No such file or directory
Jan 11 09:44:37 proxmoxwf sshd[9627]: Accepted password for root from xx.xx.xxx.xx port 61340 ssh2
Jan 11 09:44:37 proxmoxwf sshd[9627]: pam_env(sshd:setcred): Unable to open env file: /etc/default/locale: No such file or directory
Jan 11 09:44:37 proxmoxwf sshd[9627]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jan 11 09:44:37 proxmoxwf sshd[9627]: subsystem request for sftp
Jan 11 09:44:37 proxmoxwf sshd[9629]: pam_env(sshd:setcred): Unable to open env file: /etc/default/locale: No such file or directory

If the problem server were local, I'd re-install Proxmox. I'm sure open to suggestions...

Thanks,

Peter
 
I think all you have to do is:
Code:
echo -n "" >> /etc/default/locale
I've seen somewhere that Debian systems won't create that file by default...
 
Okay, I've added the /etc/default/locale file, and as expected, the auth.log no longer complains about the missing file. But I still can't get the master to load the cluster table from the "bad" slave.

The master can ssh, without a password, into the one good slave.
The good slave can ssh, without a password, into the master.
The bad slave can ssh, without a password, into the master.

Problem: The Master cannot ssh, at all, into the "bad" slave. There is no prompt or anything; it just sits there...


I have read some on passwordless ssh, but so far, nothing I've tried has been able to remedy the problem.

Any other ideas?

Peter
 

Are all nodes in the same network? Is the same version installed on all nodes? Debug via tcpdump.
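For the tcpdump part, one way to do it is to watch SSH traffic on the problem node while the master tries to connect. This is only a sketch: `ssh_filter` and `watch_ssh` are made-up helper names, and the interface name and peer address you pass in are whatever applies on your hosts.

```shell
# ssh_filter PEER: print a capture filter matching SSH traffic to or from PEER.
ssh_filter() {
    printf 'host %s and tcp port 22\n' "$1"
}

# watch_ssh IFACE PEER: show SSH packets exchanged with PEER on IFACE.
# Needs root. -n skips DNS lookups so addresses are printed numerically.
watch_ssh() {
    tcpdump -n -i "$1" "$(ssh_filter "$2")"
}
```

For example, `watch_ssh eth0 <master-ip>` on the bad slave (eth0 is an assumption). If the master's packets arrive but nothing is sent back, the node itself is probably dropping them. If nothing arrives at all, the problem is more likely somewhere on the path between the two.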
 
Re: Cluster problem - passwordless ssh - solved!

Thanks, guys. This is a case of Mea Culpa!

I'd forgotten that I had installed iptables on this 3rd server (which was intended to be a backup...). So SSH logins from the Master and the secondary Slave did not work, while logins from other machines worked fine.
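For anyone hitting the same thing, here is a rough way to spot blocking rules quickly. It is a sketch that assumes your iptables supports the `-S` listing option; `blocking_rules` is just a name I made up, and it will not catch rules that jump to custom chains.

```shell
# blocking_rules: read `iptables -S` output on stdin and print the rules
# whose target is DROP or REJECT, plus any chain policy set to DROP.
blocking_rules() {
    grep -E -e '-j (DROP|REJECT)' -e '^-P [A-Z]+ DROP'
}

# Typical use, as root on the node that cannot be reached:
#   iptables -S | blocking_rules
```

Any line it prints that names the master's address or port 22 is worth a closer look.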

All is well now!

Have I mentioned what a great project this is, and that all should donate?!

Thanks again,

Peter