Proxmox 5 beta -> Proxmox 5.0 release: issues with keys and Ceph

tomtom13

Dec 28, 2016
Hi.

Before somebody calls me silly: I know this trick should not be done on production systems, but I needed some of the features of 5.0, so I jumped ship early.

Anyway, I've got 3 systems with Proxmox 5 beta 1 and Ceph running as a small cluster, and everything was OK.
After some time, around when 5.0 final was released, I added a fourth system to this cluster and started getting some odd behaviour.

Essentially, I've got a Ceph monitor running on 2 of the original 3 machines (I didn't want to hinder the performance of the third, since it has no disks and works purely as a workhorse). Now I get these problems:
- when I try to add a monitor on the fourth machine, I get an error in a loop telling me that it can't get keys
- when I try to migrate a VM from the second machine to the fourth it goes OK, but when I try to migrate from the third to the fourth it fails with "Error: migration aborted (duration 00:00:00): Can't connect to destination address using public key"

I should add that:
- the fourth machine had 4 OSDs created on it without any hiccup; they are part of the cluster and all is singing and dancing.
- the fourth machine has a Ceph monitor service running (id: 3), but it's not included in ceph.conf, and everything is OK as well. When I try to include this monitor in ceph.conf, the whole of Ceph just grinds to a halt; even "ceph -s" fails to run. When I try to create a monitor, it gives me a lot of errors but then seems to pass through: a monitor with id 4 is created, and it gets included in ceph.conf as id 3 (wtf?), then Ceph grinds to a halt. Replacing 3 with 4 in ceph.conf does not help either.
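To make the id 3 vs. id 4 mismatch easier to see, something like the sketch below could list which monitor ids ceph.conf actually declares. Treat it as an assumption-laden helper: the function name list_mon_sections is made up for illustration, /etc/pve/ceph.conf is assumed as the config location, and monitors are assumed to be declared as [mon.<id>] section headers.

```shell
#!/bin/sh
# Hypothetical helper: print the id of every [mon.<id>] section in a ceph.conf.
# Assumption: the config lives at /etc/pve/ceph.conf unless a path is given.
list_mon_sections() {
    conf=${1:-/etc/pve/ceph.conf}
    # Keep only section headers of the form [mon.<id>] and print <id>.
    sed -n 's/^[[:space:]]*\[mon\.\([^]]*\)].*$/\1/p' "$conf"
}
```

Running list_mon_sections with no argument prints one id per line. While the cluster still answers, that list can be compared against the output of "ceph mon dump", which shows the monitors the cluster itself knows about.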


The map looks this way:
1 - just 1 VM on local storage
2 - 4 VMs on Ceph RBD + 4 OSDs + mon
3 - 1 VM on Ceph RBD + 4 OSDs + mon
4 - just 4 OSDs (monitor service running but not part of ceph.conf)

I'm in the process of adding a 5th machine here, but I would like to iron out all this stuff before adding more problems to the mix.

I should add that yes, I did:
Code:
ssh-keyscan -t rsa proxmox-dl180-8bay-1 proxmox-dl180-14bay-1 proxmox-dl180-14bay-2 proxmox-dl180-14bay-3 >> /etc/pve/priv/known_hosts
# proxmox-dl180-14bay-3:22 SSH-2.0-OpenSSH_7.4p1 Debian-10+deb9u1
# proxmox-dl180-8bay-1:22 SSH-2.0-OpenSSH_7.4p1 Debian-10+deb9u1
# proxmox-dl180-14bay-1:22 SSH-2.0-OpenSSH_7.4p1 Debian-10+deb9u1
# proxmox-dl180-14bay-2:22 SSH-2.0-OpenSSH_7.4p1 Debian-10+deb9u1


Edit:
I've managed to fix the migration problem. The third machine did not have /etc/ssh/ssh_known_hosts set up as a link pointing to known_hosts in the pve dir; it was just a standard file. Why it got this way, I don't know.
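For anyone wanting to check their own nodes for the same thing, here is a rough sketch of the test. The function name check_known_hosts_link is made up for illustration, and the default target /etc/pve/priv/known_hosts is assumed from the ssh-keyscan command above; adjust both paths if your setup differs.

```shell
#!/bin/sh
# Hypothetical helper: report whether FILE is a symlink pointing at TARGET.
# Defaults are assumptions based on the paths mentioned in this post.
check_known_hosts_link() {
    file=${1:-/etc/ssh/ssh_known_hosts}
    target=${2:-/etc/pve/priv/known_hosts}
    # -L is true only for symlinks; readlink prints where the link points.
    if [ -L "$file" ] && [ "$(readlink "$file")" = "$target" ]; then
        echo "ok: $file -> $target"
        return 0
    fi
    echo "mismatch: $file is not a symlink to $target"
    return 1
}
```

Calling check_known_hosts_link with no arguments on each node should show quickly which one has a plain file instead of the link, which is what the third machine had here.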
 