ipcc_send_rec failed: Connection refused at OVH Cloud

EuroDomenii

There are plenty of forum posts with this kind of error, usually triggered by a wrong /etc/hosts configuration https://pve.proxmox.com/wiki/Instal...d_an_.2Fetc.2Fhosts_entry_for_your_IP_address or by the network setup.

Mine seems to be correct:


Code:
root@lon:~# cat /etc/hosts
127.0.0.1       localhost.localdomain   localhost
51.89.228.11    lon.debu.eu     lon.proxmox.com lon
root@lon:~# hostname --ip-address
51.89.228.11

Code:
root@lon:~# cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The normal eth0
#allow-hotplug eth0
#iface eth0 inet dhcp

auto vmbr0
iface vmbr0 inet dhcp
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0


# Additional interfaces, just in case we're using
# multiple networks
allow-hotplug eth1
iface eth1 inet dhcp

allow-hotplug eth2
iface eth2 inet dhcp

# Set this one last, so that cloud-init or user can
# override defaults.
source /etc/network/interfaces.d/*

root@lon:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP group default qlen 1000
    link/ether fa:16:3e:ef:aa:ea brd ff:ff:ff:ff:ff:ff
    inet 51.89.228.11/32 brd 51.89.228.11 scope global dynamic eth0
       valid_lft 70416sec preferred_lft 70416sec
3: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:ef:aa:ea brd ff:ff:ff:ff:ff:ff
    inet 51.89.228.11/32 brd 51.89.228.11 scope global dynamic vmbr0
       valid_lft 70416sec preferred_lft 70416sec
    inet6 fe80::f816:3eff:feef:aaea/64 scope link
       valid_lft forever preferred_lft forever
root@lon:~# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         51.89.228.1     0.0.0.0         UG    0      0        0 vmbr0
51.89.228.1     0.0.0.0         255.255.255.255 UH    0      0        0 vmbr0

Attached: pvereport.
 

even if it's just a 'dummy' instance, posting the password here is not a good idea. what does 'journalctl -b -u pve-cluster' say?
 

Code:
root@lon:~# journalctl -b -u pve-cluster
-- Logs begin at Fri 2020-06-12 05:04:58 UTC, end at Fri 2020-06-12 05:56:30 UTC. --
-- No entries --

I've also removed the IP and port.
 
Code:
root@lon:~# systemctl status pve-cluster pveproxy pvedaemon
● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Fri 2020-06-12 04:14:40 UTC; 1h 44min ago
  Process: 10088 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

● pveproxy.service - PVE API Proxy Server
   Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2020-06-12 03:28:31 UTC; 2h 30min ago
  Process: 7318 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=111)
  Process: 7320 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
 Main PID: 7325 (pveproxy)
    Tasks: 3 (limit: 2273)
   Memory: 125.6M
   CGroup: /system.slice/pveproxy.service
           ├─ 7325 pveproxy
           ├─15047 pveproxy worker
           └─15048 pveproxy worker

Jun 12 05:59:22 lon pveproxy[7325]: starting 1 worker(s)
Jun 12 05:59:22 lon pveproxy[7325]: worker 15047 started
Jun 12 05:59:22 lon pveproxy[15047]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5
Jun 12 05:59:23 lon pveproxy[15045]: worker exit
Jun 12 05:59:23 lon pveproxy[15046]: worker exit
Jun 12 05:59:23 lon pveproxy[7325]: worker 15045 finished
Jun 12 05:59:23 lon pveproxy[7325]: starting 1 worker(s)
Jun 12 05:59:23 lon pveproxy[7325]: worker 15048 started
Jun 12 05:59:23 lon pveproxy[7325]: worker 15046 finished
Jun 12 05:59:23 lon pveproxy[15048]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5

● pvedaemon.service - PVE API Daemon
   Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2020-06-12 01:19:42 UTC; 4h 39min ago
  Process: 779 ExecStart=/usr/bin/pvedaemon start (code=exited, status=0/SUCCESS)
 Main PID: 942 (pvedaemon)
    Tasks: 4 (limit: 2273)
   Memory: 133.6M
   CGroup: /system.slice/pvedaemon.service
           ├─942 pvedaemon
           ├─943 pvedaemon worker
           ├─944 pvedaemon worker
           └─945 pvedaemon worker

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
 
sounds to me like pve-cluster cannot start. what does the log say when you restart it?
 
Code:
root@lon:~# systemctl start pve-cluster
Job for pve-cluster.service failed because the control process exited with error code.
See "systemctl status pve-cluster.service" and "journalctl -xe" for details.

root@lon:~# systemctl status pve-cluster.service
● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Fri 2020-06-12 06:31:35 UTC; 3s ago
  Process: 16735 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)

Jun 12 06:31:34 lon systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Jun 12 06:31:34 lon systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 12 06:31:34 lon systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Jun 12 06:31:35 lon systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Jun 12 06:31:35 lon systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Jun 12 06:31:35 lon systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun 12 06:31:35 lon systemd[1]: pve-cluster.service: Start request repeated too quickly.
Jun 12 06:31:35 lon systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 12 06:31:35 lon systemd[1]: Failed to start The Proxmox VE cluster filesystem.


Code:
root@lon:~# tail -10 /var/log/syslog
Jun 12 06:32:54 lon pveproxy[7325]: starting 1 worker(s)
Jun 12 06:32:54 lon pveproxy[7325]: worker 16802 started
Jun 12 06:32:54 lon pveproxy[16802]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1727.
Jun 12 06:32:55 lon pveproxy[16800]: worker exit
Jun 12 06:32:55 lon pveproxy[16801]: worker exit
Jun 12 06:32:55 lon pveproxy[7325]: worker 16801 finished
Jun 12 06:32:55 lon pveproxy[7325]: starting 1 worker(s)
Jun 12 06:32:55 lon pveproxy[7325]: worker 16803 started
Jun 12 06:32:55 lon pveproxy[7325]: worker 16800 finished
Jun 12 06:32:55 lon pveproxy[16803]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1727.
root@lon:~#


Code:
root@lon:~# cat /etc/pve/local/pve-ssl.key
cat: /etc/pve/local/pve-ssl.key: No such file or directory

root@lon:/# pvecm updatecerts -f
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
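As far as I can tell, "Connection refused" here is just the client side of the pmxcfs IPC socket reporting ECONNREFUSED: pve-cluster/pmxcfs is not running, so nothing is listening. A tiny throwaway demo (my own sketch with a hypothetical /tmp socket path, nothing to do with the real pmxcfs socket) shows how that errno comes about:

Code:
// connrefused_demo.c - toy demo (not PVE code): connect() to a unix socket
// path with no listener fails with ECONNREFUSED, which is the same
// "Connection refused" ipcc_send_rec reports while pmxcfs is down.
// Build: gcc -o connrefused connrefused_demo.c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void)
{
    const char *path = "/tmp/demo-no-listener.sock";   // hypothetical path

    // create and bind the socket file, but never call listen() on it
    int srv = socket(AF_UNIX, SOCK_STREAM, 0);
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    unlink(path);
    bind(srv, (struct sockaddr *)&addr, sizeof(addr));

    int cli = socket(AF_UNIX, SOCK_STREAM, 0);
    if (connect(cli, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        printf("connect: %s (errno %d)\n", strerror(errno), errno);  // -> Connection refused

    close(cli);
    close(srv);
    unlink(path);
    return 0;
}

So every PVE tool that goes through ipcc_send_rec will keep printing this until pmxcfs itself manages to start.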
 
please include the full log - if you just look at the last 10 lines for services that auto-restart you will only see that they don't auto-restart anymore because they attempted to do so too often ;)
 
You were right: the “real error” is “Unable to get local IP address”, and “pve-cluster.service: Start request repeated too quickly” is just the consequence, with the restart counter going from 1 up to 5.

See the relevant part of the logs:
Code:
Jun 11 23:11:20 lon systemd[1]: Starting The Proxmox VE cluster filesystem...
Jun 11 23:11:20 lon pmxcfs[1499]: [main] crit: Unable to get local IP address
Jun 11 23:11:20 lon pmxcfs[1499]: [main] crit: Unable to get local IP address
Jun 11 23:11:20 lon systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Jun 11 23:11:20 lon systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 11 23:11:20 lon systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Jun 11 23:11:20 lon systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Jun 11 23:11:20 lon systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Jun 11 23:11:20 lon systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 1.
Jun 11 23:11:20 lon systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun 11 23:11:20 lon systemd[1]: Starting The Proxmox VE cluster filesystem...
Jun 11 23:11:20 lon pmxcfs[1506]: [main] crit: Unable to get local IP address
Jun 11 23:11:20 lon pmxcfs[1506]: [main] crit: Unable to get local IP address
Jun 11 23:11:20 lon systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Jun 11 23:11:20 lon systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 11 23:11:20 lon systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Jun 11 23:11:20 lon systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Jun 11 23:11:20 lon pveproxy[1419]: worker exit
Jun 11 23:11:20 lon pveproxy[968]: worker 1419 finished
Jun 11 23:11:20 lon pveproxy[968]: starting 1 worker(s)
Jun 11 23:11:20 lon pveproxy[968]: worker 1519 started
Jun 11 23:11:20 lon pveproxy[1519]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1727.
Jun 11 23:11:20 lon systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Jun 11 23:11:20 lon systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 2.
Jun 11 23:11:20 lon systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun 11 23:11:20 lon systemd[1]: Starting The Proxmox VE cluster filesystem...
Jun 11 23:11:20 lon pmxcfs[1520]: [main] crit: Unable to get local IP address
Jun 11 23:11:20 lon pmxcfs[1520]: [main] crit: Unable to get local IP address
Jun 11 23:11:20 lon systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Jun 11 23:11:20 lon systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 11 23:11:20 lon systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Jun 11 23:11:20 lon systemd[1]: Starting PVE Status Daemon...
Jun 11 23:11:20 lon systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Jun 11 23:11:20 lon pveproxy[1420]: worker exit
Jun 11 23:11:21 lon pveproxy[1421]: worker exit
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 3.
Jun 11 23:11:21 lon systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun 11 23:11:21 lon systemd[1]: Starting The Proxmox VE cluster filesystem...
Jun 11 23:11:21 lon pmxcfs[1522]: [main] crit: Unable to get local IP address
Jun 11 23:11:21 lon pmxcfs[1522]: [main] crit: Unable to get local IP address
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 11 23:11:21 lon systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Jun 11 23:11:21 lon systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Jun 11 23:11:21 lon pveproxy[968]: worker 1420 finished
Jun 11 23:11:21 lon pveproxy[968]: starting 1 worker(s)
Jun 11 23:11:21 lon pveproxy[968]: worker 1523 started
Jun 11 23:11:21 lon pveproxy[968]: worker 1421 finished
Jun 11 23:11:21 lon pveproxy[968]: starting 1 worker(s)
Jun 11 23:11:21 lon pveproxy[968]: worker 1524 started
Jun 11 23:11:21 lon pveproxy[1523]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1727.
Jun 11 23:11:21 lon pveproxy[1524]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1727.
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 4.
Jun 11 23:11:21 lon systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun 11 23:11:21 lon systemd[1]: Starting The Proxmox VE cluster filesystem...
Jun 11 23:11:21 lon pmxcfs[1525]: [main] crit: Unable to get local IP address
Jun 11 23:11:21 lon pmxcfs[1525]: [main] crit: Unable to get local IP address
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 11 23:11:21 lon systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Jun 11 23:11:21 lon systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Jun 11 23:11:21 lon systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Start request repeated too quickly.
Jun 11 23:11:21 lon systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 11 23:11:21 lon systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Jun 11 23:11:21 lon systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Jun 11 23:11:21 lon pvestatd[1521]: ipcc_send_rec[1] failed: Connection refused
Jun 11 23:11:21 lon pvestatd[1521]: ipcc_send_rec[1] failed: Connection refused
Jun 11 23:11:21 lon pvestatd[1521]: ipcc_send_rec[2] failed: Connection refused
Jun 11 23:11:21 lon pvestatd[1521]: ipcc_send_rec[2] failed: Connection refused
Jun 11 23:11:21 lon pvestatd[1521]: ipcc_send_rec[3] failed: Connection refused
Jun 11 23:11:21 lon pvestatd[1521]: ipcc_send_rec[3] failed: Connection refused


However, here’s the full log attached

I’ve also removed the manage_etc_hosts line from /etc/cloud/cloud.cfg and the update_etc_hosts entry from the cloud-init module section.
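For reference, the bits I removed look roughly like this on a stock cloud image (illustrative only, the exact values differ per image):

Code:
# /etc/cloud/cloud.cfg (excerpt, illustrative)
manage_etc_hosts: true          # <- removed, so cloud-init stops rewriting /etc/hosts on boot

cloud_init_modules:
  # ... other modules left as shipped ...
# - update_etc_hosts            # <- removed from the module list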

The culprit is somewhere in the OVH cloud initialization settings, I guess.

Also, the commands look good:


Code:
root@lon:~# getent hosts $(hostname)
51.89.228.11    lon.debu.eu lon.proxmox.com lon
root@lon:~# hostname --ip-address
51.89.228.11

What is Proxmox trying to do with "Unable to get local IP address" that OVH Cloud doesn't allow? I should look into the code...

Here https://git.proxmox.com/?p=pve-clus...91828bd57ad5ccb00b12125361e6b61bed8ca;hb=HEAD

Code:
    if (!(cfs.ip = lookup_node_ip(cfs.nodename))) {
        cfs_critical("Unable to get local IP address");
        qb_log_fini();
        exit(-1);
    }



static char *
lookup_node_ip(const char *nodename)
{
    char buf[INET6_ADDRSTRLEN];
    struct addrinfo *ainfo;
    struct addrinfo ahints;
    char *res = NULL;
    memset(&ahints, 0, sizeof(ahints));

    if (getaddrinfo(nodename, NULL, &ahints, &ainfo))
        return NULL;

    if (ainfo->ai_family == AF_INET) {
        struct sockaddr_in *sa = (struct sockaddr_in *)ainfo->ai_addr;
        inet_ntop(ainfo->ai_family, &sa->sin_addr, buf, sizeof(buf));
        if (strncmp(buf, "127.", 4) != 0) {
            res = g_strdup(buf);
        }
    } else if (ainfo->ai_family == AF_INET6) {
        struct sockaddr_in6 *sa = (struct sockaddr_in6 *)ainfo->ai_addr;
        inet_ntop(ainfo->ai_family, &sa->sin6_addr, buf, sizeof(buf));
        if (strcmp(buf, "::1") != 0) {
            res = g_strdup(buf);
        }
    }

    freeaddrinfo(ainfo);

    return res;
}
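
So pmxcfs just runs getaddrinfo() on the node name and refuses loopback answers. To see what that lookup actually returns on this host, here is a quick standalone test I can compile with gcc (my own sketch that mirrors the same call, not code from pve-cluster):

Code:
// nodeip_test.c - rough standalone re-run of the lookup pmxcfs does
// (my own sketch, not part of pve-cluster). Build: gcc -o nodeip nodeip_test.c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
    char nodename[256];
    if (gethostname(nodename, sizeof(nodename)) != 0) {
        perror("gethostname");
        return 1;
    }

    struct addrinfo ahints, *ainfo;
    memset(&ahints, 0, sizeof(ahints));          // same "empty hints" as lookup_node_ip()

    int rc = getaddrinfo(nodename, NULL, &ahints, &ainfo);
    if (rc != 0) {
        // this is the case that makes pmxcfs log "Unable to get local IP address"
        fprintf(stderr, "getaddrinfo(%s): %s\n", nodename, gai_strerror(rc));
        return 1;
    }

    char buf[INET6_ADDRSTRLEN];
    if (ainfo->ai_family == AF_INET) {
        struct sockaddr_in *sa = (struct sockaddr_in *)ainfo->ai_addr;
        inet_ntop(AF_INET, &sa->sin_addr, buf, sizeof(buf));
    } else if (ainfo->ai_family == AF_INET6) {
        struct sockaddr_in6 *sa = (struct sockaddr_in6 *)ainfo->ai_addr;
        inet_ntop(AF_INET6, &sa->sin6_addr, buf, sizeof(buf));
    } else {
        strcpy(buf, "unknown address family");
    }
    // pmxcfs additionally rejects 127.* / ::1 answers here
    printf("%s -> %s\n", nodename, buf);

    freeaddrinfo(ainfo);
    return 0;
}

If it prints the public IP now, the lookup works at this point; the open question is whether it also works at the moment pve-cluster starts during boot.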
 

Some debugging:
Code:
root@lon:~# perl -T -d /usr/bin/pvecm status

Loading DB routines from perl5db.pl version 1.53
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

IO::Socket::SSL::CODE(0x556ade4dc8a0)(/usr/share/perl5/IO/Socket/SSL.pm:260):
260:            INIT { init() }
  DB<1> b PVE::Cluster::cfs_update
  DB<2> n
main::(/usr/bin/pvecm:18):      };
  DB<2>
main::(/usr/bin/pvecm:20):      PVE::CLI::pvecm->run_cli_handler(prepare => $prepare, no_rpcenv => 1);
  DB<2>
PVE::Cluster::cfs_update(/usr/share/perl5/PVE/Cluster.pm:222):
222:        my ($fail) = @_;
  DB<2> l
222==>b     my ($fail) = @_;
223:        eval {
224:            my $res = &$ipcc_send_rec_json(CFS_IPC_GET_FS_VERSION);
225:            die "no starttime\n" if !$res->{starttime};
226
227:            if (!$res->{starttime} || !$versions->{starttime} ||
228                 $res->{starttime} != $versions->{starttime}) {
229                 #print "detected changed starttime\n";
230:                $vmlist = {};
231:                $clinfo = {};
  DB<2>
PVE::Cluster::cfs_update(/usr/share/perl5/PVE/Cluster.pm:223):
223:        eval {
  DB<2> l
223==>      eval {
224:            my $res = &$ipcc_send_rec_json(CFS_IPC_GET_FS_VERSION);
225:            die "no starttime\n" if !$res->{starttime};
226
227:            if (!$res->{starttime} || !$versions->{starttime} ||
228                 $res->{starttime} != $versions->{starttime}) {
229                 #print "detected changed starttime\n";
230:                $vmlist = {};
231:                $clinfo = {};
232:                $ccache = {};
  DB<2>
PVE::Cluster::cfs_update(/usr/share/perl5/PVE/Cluster.pm:224):
224:            my $res = &$ipcc_send_rec_json(CFS_IPC_GET_FS_VERSION);
  DB<2> s
PVE::Cluster::CODE(0x556add41a040)(/usr/share/perl5/PVE/Cluster.pm:139):
139:        my ($msgid, $data) = @_;
  DB<2> l
139==>      my ($msgid, $data) = @_;
140
141:        my $res = PVE::IPCC::ipcc_send_rec($msgid, $data);
142
143:        die "ipcc_send_rec[$msgid] failed: $!\n" if !defined($res) && ($! != 0);
144
145:        return decode_json($res);
146:    };
147
148     my $ipcc_get_config = sub {
  DB<2>
PVE::Cluster::CODE(0x556add41a040)(/usr/share/perl5/PVE/Cluster.pm:141):
141:        my $res = PVE::IPCC::ipcc_send_rec($msgid, $data);
  DB<2> s
PVE::Cluster::CODE(0x556add41a040)(/usr/share/perl5/PVE/Cluster.pm:143):
143:        die "ipcc_send_rec[$msgid] failed: $!\n" if !defined($res) && ($! != 0);
  DB<2> l
143==>      die "ipcc_send_rec[$msgid] failed: $!\n" if !defined($res) && ($! != 0);
144
145:        return decode_json($res);
146:    };
147
148     my $ipcc_get_config = sub {
149:        my ($path) = @_;
150
151:        my $bindata = pack "Z*", $path;
152:        my $res = PVE::IPCC::ipcc_send_rec(CFS_IPC_GET_CONFIG, $bindata);
  DB<2>
PVE::Cluster::cfs_update(/usr/share/perl5/PVE/Cluster.pm:237):
237:        my $err = $@;
  DB<2> x $err
0  undef
  DB<3>
PVE::Cluster::cfs_update(/usr/share/perl5/PVE/Cluster.pm:238):
238:        if ($err) {
  DB<3> x $err
0  'ipcc_send_rec[1] failed: Connection refused
'
  DB<4>
 
