[SOLVED] Proxmox pve-cluster is down

seeby

Member
Mar 27, 2017
Hello everyone,

every now and then, during the weekly backup, I get a message that pve-cluster is down.
I would like to find out what causes this message and what I can do about it.
The vzdump backups are written to a Synology NAS.
Two servers are in a cluster with DRBD, and one machine is separate.

Excerpt from the syslog:
Code:
Mar 25 00:06:49 leia vzdump[147076]: INFO: Backup job finished successfully
Mar 25 00:06:51 leia postfix/qmgr[3585]: 6C60E2411A1: removed
Mar 25 00:06:56 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:08:32 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:08:33 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:10:32 leia kernel: [8680370.193335] drbd vm-102-disk-1 luke: peer( Primary -> Secondary )
Mar 25 00:10:34 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:10:35 leia kernel: [8680373.229720] drbd vm-103-disk-1 luke: Preparing remote state change 2875499359 (primary_nodes=0, weak_nodes=0)
Mar 25 00:10:35 leia kernel: [8680373.236521] drbd vm-103-disk-1 luke: Committing remote state change 2875499359
Mar 25 00:10:35 leia kernel: [8680373.236532] drbd vm-103-disk-1 luke: peer( Secondary -> Primary )
Mar 25 00:12:12 leia kernel: [8680470.524975] drbd vm-103-disk-1 luke: peer( Primary -> Secondary )
Mar 25 00:12:14 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:12:15 leia kernel: [8680473.219115] drbd vm-104-disk-1 luke: Preparing remote state change 1536146581 (primary_nodes=0, weak_nodes=0)
Mar 25 00:12:15 leia kernel: [8680473.219422] drbd vm-104-disk-1 luke: Committing remote state change 1536146581
Mar 25 00:12:15 leia kernel: [8680473.219431] drbd vm-104-disk-1 luke: peer( Secondary -> Primary )
Mar 25 00:13:28 leia kernel: [8680546.209862] drbd vm-104-disk-1 luke: peer( Primary -> Secondary )
Mar 25 00:13:30 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:13:30 leia kernel: [8680548.991534] drbd vm-105-disk-1 luke: Preparing remote state change 2321903957 (primary_nodes=0, weak_nodes=0)
Mar 25 00:13:30 leia kernel: [8680548.991850] drbd vm-105-disk-1 luke: Committing remote state change 2321903957
Mar 25 00:13:30 leia kernel: [8680548.991858] drbd vm-105-disk-1 luke: peer( Secondary -> Primary )
Mar 25 00:16:37 leia kernel: [8680735.305760] drbd vm-105-disk-1 luke: peer( Primary -> Secondary )
Mar 25 00:16:40 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:17:01 leia CRON[149636]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Mar 25 00:17:13 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:21:13 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:23:30 leia kernel: [8681148.729074] drbd .drbdctrl luke: PingAck did not arrive in time.
Mar 25 00:23:30 leia kernel: [8681148.729129] drbd .drbdctrl luke: conn( Connected -> NetworkFailure ) peer( Secondary -> Unknown )
Mar 25 00:23:30 leia kernel: [8681148.729131] drbd .drbdctrl/0 drbd0 luke: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )
Mar 25 00:23:30 leia kernel: [8681148.729132] drbd .drbdctrl/1 drbd1 luke: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )
Mar 25 00:23:30 leia kernel: [8681148.729148] drbd .drbdctrl luke: ack_receiver terminated
Mar 25 00:23:30 leia kernel: [8681148.729150] drbd .drbdctrl luke: Terminating ack_recv thread
Mar 25 00:23:30 leia kernel: [8681148.761177] drbd .drbdctrl luke: Connection closed
Mar 25 00:23:30 leia kernel: [8681148.761189] drbd .drbdctrl luke: conn( NetworkFailure -> Unconnected )
Mar 25 00:23:30 leia kernel: [8681148.761202] drbd .drbdctrl luke: Restarting receiver thread
Mar 25 00:23:30 leia kernel: [8681148.761211] drbd .drbdctrl luke: conn( Unconnected -> Connecting )
Mar 25 00:23:31 leia kernel: [8681149.749625] drbd .drbdctrl luke: Handshake to peer 0 successful: Agreed network protocol version 112
Mar 25 00:23:31 leia kernel: [8681149.749628] drbd .drbdctrl luke: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
Mar 25 00:23:31 leia kernel: [8681149.749675] drbd .drbdctrl luke: Peer authenticated using 32 bytes HMAC
Mar 25 00:23:31 leia kernel: [8681149.749681] drbd .drbdctrl luke: Starting ack_recv thread (from drbd_r_.drbdctr [2872])
Mar 25 00:23:31 leia kernel: [8681149.773640] drbd .drbdctrl luke: Preparing remote state change 1672246982 (primary_nodes=0, weak_nodes=0)
Mar 25 00:23:31 leia kernel: [8681149.773776] drbd .drbdctrl luke: Committing remote state change 1672246982
Mar 25 00:23:31 leia kernel: [8681149.773796] drbd .drbdctrl luke: conn( Connecting -> Connected ) peer( Unknown -> Secondary )
Mar 25 00:23:31 leia kernel: [8681149.785050] drbd .drbdctrl/0 drbd0: current_size: 8112
Mar 25 00:23:31 leia kernel: [8681149.785052] drbd .drbdctrl/0 drbd0 luke: c_size: 8112 u_size: 0 d_size: 8112 max_size: 8112
Mar 25 00:23:31 leia kernel: [8681149.785054] drbd .drbdctrl/0 drbd0 luke: la_size: 8112 my_usize: 0 my_max_size: 8112
Mar 25 00:23:31 leia kernel: [8681149.785056] drbd .drbdctrl/0 drbd0 luke: node_id: 0 idx: 0 bm-uuid: 0x0 flags: 0x10 max_size: 8112 (DUnknown)
Mar 25 00:23:31 leia kernel: [8681149.785057] drbd .drbdctrl/0 drbd0: my node_id: 1
Mar 25 00:23:31 leia kernel: [8681149.785060] drbd .drbdctrl/0 drbd0 luke: calling drbd_determine_dev_size()
Mar 25 00:23:31 leia kernel: [8681149.785061] drbd .drbdctrl/0 drbd0 luke: node_id: 0 idx: 0 bm-uuid: 0x0 flags: 0x10 max_size: 8112 (DUnknown)
Mar 25 00:23:31 leia kernel: [8681149.785062] drbd .drbdctrl/0 drbd0: my node_id: 1
Mar 25 00:23:31 leia kernel: [8681149.785068] drbd .drbdctrl/0 drbd0 luke: drbd_sync_handshake:
Mar 25 00:23:31 leia kernel: [8681149.785070] drbd .drbdctrl/0 drbd0 luke: self 3689142F609A6160:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:120
Mar 25 00:23:31 leia kernel: [8681149.785072] drbd .drbdctrl/0 drbd0 luke: peer 3689142F609A6160:0000000000000000:3F7163EF603EE318:0BB538FAFDAA6E1A bits:0 flags:120
Mar 25 00:23:31 leia kernel: [8681149.785073] drbd .drbdctrl/0 drbd0 luke: uuid_compare()=0 by rule 38
Mar 25 00:23:31 leia kernel: [8681149.785081] drbd .drbdctrl/0 drbd0 luke: pdsk( DUnknown -> UpToDate ) repl( Off -> Established )
Mar 25 00:23:31 leia kernel: [8681149.801073] drbd .drbdctrl/1 drbd1: current_size: 8112
Mar 25 00:23:31 leia kernel: [8681149.801078] drbd .drbdctrl/1 drbd1 luke: c_size: 8112 u_size: 0 d_size: 8112 max_size: 8112
Mar 25 00:23:31 leia kernel: [8681149.801086] drbd .drbdctrl/1 drbd1 luke: la_size: 8112 my_usize: 0 my_max_size: 8112
Mar 25 00:23:31 leia kernel: [8681149.801090] drbd .drbdctrl/1 drbd1 luke: node_id: 0 idx: 0 bm-uuid: 0x0 flags: 0x10 max_size: 8112 (DUnknown)
Mar 25 00:23:31 leia kernel: [8681149.801093] drbd .drbdctrl/1 drbd1: my node_id: 1
Mar 25 00:23:31 leia kernel: [8681149.801096] drbd .drbdctrl/1 drbd1 luke: calling drbd_determine_dev_size()
Mar 25 00:23:31 leia kernel: [8681149.804952] drbd .drbdctrl/1 drbd1 luke: node_id: 0 idx: 0 bm-uuid: 0x0 flags: 0x10 max_size: 8112 (DUnknown)
Mar 25 00:23:31 leia kernel: [8681149.804954] drbd .drbdctrl/1 drbd1: my node_id: 1
Mar 25 00:23:31 leia kernel: [8681149.804961] drbd .drbdctrl/1 drbd1 luke: drbd_sync_handshake:
Mar 25 00:23:31 leia kernel: [8681149.804962] drbd .drbdctrl/1 drbd1 luke: self D69FF5CD6C9C04F0:0000000000000000:0F0AD00825909574:0000000000000000 bits:0 flags:120
Mar 25 00:23:31 leia kernel: [8681149.804964] drbd .drbdctrl/1 drbd1 luke: peer D69FF5CD6C9C04F0:0000000000000000:0F0AD00825909574:D814E94C1B1F533E bits:0 flags:120
Mar 25 00:23:31 leia kernel: [8681149.804965] drbd .drbdctrl/1 drbd1 luke: uuid_compare()=0 by rule 38
Mar 25 00:23:31 leia kernel: [8681149.804972] drbd .drbdctrl/1 drbd1 luke: pdsk( DUnknown -> UpToDate ) repl( Off -> Established )
Mar 25 00:25:13 leia pvestatd[3631]: got timeout
Mar 25 00:28:28 leia systemd-timesyncd[2158]: interval/delta/delay/jitter/drift 2048s/-0.002s/0.014s/0.001s/+8ppm (ignored)
Mar 25 00:31:33 leia pvestatd[3631]: got timeout
Mar 25 00:33:41 leia pmxcfs[175745]: [dcdb] notice: data verification successful
Mar 25 00:39:46 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:40:12 leia pmxcfs[175745]: [status] notice: received log
Mar 25 00:42:23 leia pvestatd[3631]: got timeout
Mar 25 00:42:33 leia pvestatd[3631]: got timeout
Mar 25 00:43:11 leia pvestatd[3631]: status update time (29.557 seconds)
Mar 25 00:58:15 leia rrdcached[3428]: flushing old values
Mar 25 00:58:15 leia rrdcached[3428]: rotating journals
Mar 25 00:58:15 leia rrdcached[3428]: started new journal /var/lib/rrdcached/journal/rrd.journal.1490399895.519043
Mar 25 00:58:15 leia rrdcached[3428]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1490392695.519038
Mar 25 01:02:37 leia systemd-timesyncd[2158]: interval/delta/delay/jitter/drift 2048s/-0.000s/0.010s/0.001s/+7ppm
Mar 25 01:03:21 leia pmxcfs[175745]: [status] notice: received log
Mar 25 01:10:03 leia pvestatd[3631]: got timeout
Mar 25 01:11:53 leia pmxcfs[175745]: [status] notice: received log
Mar 25 01:17:01 leia CRON[158855]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Mar 25 01:24:33 leia pvestatd[3631]: got timeout
Mar 25 01:24:43 leia pvestatd[3631]: got timeout
Mar 25 01:27:13 leia pvestatd[3631]: got timeout
Mar 25 01:27:23 leia pvestatd[3631]: got timeout
Mar 25 01:27:33 leia pvestatd[3631]: got timeout
Mar 25 01:29:23 leia pvestatd[3631]: got timeout
Mar 25 01:29:33 leia pvestatd[3631]: got timeout
Mar 25 01:29:43 leia pvestatd[3631]: got timeout
Mar 25 01:32:04 leia pvestatd[3631]: status update time (23.529 seconds)
Mar 25 01:33:41 leia pmxcfs[175745]: [dcdb] notice: data verification successful
Mar 25 01:36:45 leia systemd-timesyncd[2158]: interval/delta/delay/jitter/drift 2048s/+0.000s/0.011s/0.001s/+8ppm
Mar 25 01:38:28 leia pmxcfs[175745]: [status] notice: received log
Mar 25 01:39:55 leia pmxcfs[175745]: [status] notice: received log
Mar 25 01:48:55 leia pmxcfs[175745]: [status] notice: received log
Mar 25 01:52:27 leia pmxcfs[175745]: [status] notice: received log
Mar 25 01:52:28 leia kernel: [8686486.985666] drbd vm-120-disk-1 luke: Preparing remote state change 1311515997 (primary_nodes=0, weak_nodes=0)
Mar 25 01:52:28 leia kernel: [8686486.997400] drbd vm-120-disk-1 luke: Committing remote state change 1311515997
Mar 25 01:52:28 leia kernel: [8686486.997411] drbd vm-120-disk-1 luke: peer( Secondary -> Primary )
Mar 25 01:58:15 leia rrdcached[3428]: flushing old values
Mar 25 01:58:15 leia rrdcached[3428]: rotating journals
Mar 25 01:58:15 leia rrdcached[3428]: started new journal /var/lib/rrdcached/journal/rrd.journal.1490403495.519043
Mar 25 01:58:15 leia rrdcached[3428]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1490396295.519037
Mar 25 02:10:53 leia systemd-timesyncd[2158]: interval/delta/delay/jitter/drift 2048s/-0.000s/0.011s/0.001s/+8ppm
Mar 25 02:17:01 leia CRON[168117]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Mar 25 02:18:33 leia kernel: [8688051.347341] drbd vm-120-disk-1 luke: peer( Primary -> Secondary )
Mar 25 02:18:38 leia pmxcfs[175745]: [status] notice: received log
Mar 25 02:22:01 leia CRON[169045]: (root) CMD (/usr/bin/pveupdate)
Mar 25 02:22:23 leia pmxcfs[175745]: [status] notice: received log
Mar 25 02:25:42 leia pmxcfs[175745]: [status] notice: received log
Mar 25 02:26:19 leia pmxcfs[175745]: [status] notice: received log
Mar 25 02:27:04 leia pmxcfs[175745]: [status] notice: received log
Mar 25 02:28:22 leia pmxcfs[175745]: [status] notice: received log
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] HUP conn (3614-175745-21)
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] qb_ipcs_disconnect(3614-175745-21) state:2
Mar 25 02:31:41 leia systemd[1]: pve-cluster.service: main process exited, code=killed, status=6/ABRT
Mar 25 02:31:41 leia systemd[1]: Unit pve-cluster.service entered failed state.
Mar 25 02:31:41 leia corosync[3614]:   [MAIN  ] cs_ipcs_connection_closed()
Mar 25 02:31:41 leia corosync[3614]:   [CPG   ] exit_fn for conn=0x5642314113b0
Mar 25 02:31:41 leia corosync[3614]:   [MAIN  ] cs_ipcs_connection_destroyed()
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-cpg-response-3614-175745-21-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-cpg-event-3614-175745-21-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-cpg-request-3614-175745-21-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] HUP conn (3614-175745-20)
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] qb_ipcs_disconnect(3614-175745-20) state:2
Mar 25 02:31:41 leia corosync[3614]:   [MAIN  ] cs_ipcs_connection_closed()
Mar 25 02:31:41 leia corosync[3614]:   [CPG   ] exit_fn for conn=0x564231410620
Mar 25 02:31:41 leia corosync[3614]:   [MAIN  ] cs_ipcs_connection_destroyed()
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-cpg-response-3614-175745-20-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-cpg-event-3614-175745-20-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-cpg-request-3614-175745-20-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] HUP conn (3614-175745-19)
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] qb_ipcs_disconnect(3614-175745-19) state:2
Mar 25 02:31:41 leia corosync[3614]:   [MAIN  ] cs_ipcs_connection_closed()
Mar 25 02:31:41 leia corosync[3614]:   [CMAP  ] exit_fn for conn=0x56423181a860
Mar 25 02:31:41 leia corosync[3614]:   [MAIN  ] cs_ipcs_connection_destroyed()
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-cmap-response-3614-175745-19-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-cmap-event-3614-175745-19-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-cmap-request-3614-175745-19-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] HUP conn (3614-175745-18)
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] qb_ipcs_disconnect(3614-175745-18) state:2
Mar 25 02:31:41 leia corosync[3614]:   [MAIN  ] cs_ipcs_connection_closed()
Mar 25 02:31:41 leia corosync[3614]:   [QUORUM] lib_exit_fn: conn=0x56423140fab0
Mar 25 02:31:41 leia corosync[3614]:   [MAIN  ] cs_ipcs_connection_destroyed()
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-quorum-response-3614-175745-18-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-quorum-event-3614-175745-18-header
Mar 25 02:31:41 leia corosync[3614]:   [QB    ] Free'ing ringbuffer: /dev/shm/qb-quorum-request-3614-175745-18-header
Mar 25 02:31:41 leia corosync[3614]:   [CPG   ] got procleave message from cluster node 0x2 (r(0) ip(10.0.1.132) ) for pid 175745
Mar 25 02:31:41 leia corosync[3614]:   [CPG   ] got procleave message from cluster node 0x2 (r(0) ip(10.0.1.132) ) for pid 175745
Mar 25 02:31:42 leia pve-ha-crm[3663]: ipcc_send_rec failed: Transport endpoint is not connected
Mar 25 02:31:42 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:42 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:44 leia pvestatd[3631]: ipcc_send_rec failed: Transport endpoint is not connected
Mar 25 02:31:44 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:44 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:44 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:44 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:44 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:44 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Transport endpoint is not connected
Mar 25 02:31:44 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:44 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:47 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:47 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:47 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:49 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:49 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:49 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:52 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:52 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:52 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:54 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:54 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:54 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:54 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:54 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:54 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:54 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:54 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:54 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:57 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:57 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:57 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:59 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:59 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:31:59 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:01 leia cron[3602]: (*system*vzdump) CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
Mar 25 02:32:02 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:02 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:02 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:04 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:04 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:04 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:04 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:04 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:04 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:04 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:04 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:04 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:07 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:07 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:07 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:09 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:09 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:09 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:12 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:12 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:12 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:14 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:14 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:14 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:14 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:14 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:14 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:14 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:14 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:14 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:17 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:17 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:17 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:19 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:19 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:19 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:22 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:22 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:22 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:24 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:24 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:24 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:24 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:24 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:24 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:24 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:24 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:24 leia pvestatd[3631]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:27 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:27 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:27 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:29 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:29 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:29 leia pve-ha-lrm[3681]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:31 leia systemd[1]: Starting The Proxmox VE cluster filesystem...
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] IPC credentials authenticated (3614-170666-18)
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] connecting to client [170666]
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [MAIN  ] connection created
Mar 25 02:32:31 leia corosync[3614]:   [QUORUM] lib_init_fn: conn=0x56423140fab0
Mar 25 02:32:31 leia corosync[3614]:   [QUORUM] got quorum_type request on 0x56423140fab0
Mar 25 02:32:31 leia corosync[3614]:   [QUORUM] got trackstart request on 0x56423140fab0
Mar 25 02:32:31 leia corosync[3614]:   [QUORUM] sending initial status to 0x56423140fab0
Mar 25 02:32:31 leia corosync[3614]:   [QUORUM] sending quorum notification to 0x56423140fab0, length = 60
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] IPC credentials authenticated (3614-170666-19)
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] connecting to client [170666]
Mar 25 02:32:31 leia pmxcfs[170666]: [status] notice: update cluster info (cluster name  am1, version = 3)
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [MAIN  ] connection created
Mar 25 02:32:31 leia corosync[3614]:   [CMAP  ] lib_init_fn: conn=0x56423181a860
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] IPC credentials authenticated (3614-170666-20)
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] connecting to client [170666]
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [MAIN  ] connection created
Mar 25 02:32:31 leia corosync[3614]:   [CPG   ] lib_init_fn: conn=0x564231414f20, cpd=0x56423181a154
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] IPC credentials authenticated (3614-170666-21)
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] connecting to client [170666]
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Mar 25 02:32:31 leia corosync[3614]:   [MAIN  ] connection created
Mar 25 02:32:31 leia corosync[3614]:   [CPG   ] lib_init_fn: conn=0x564231416b30, cpd=0x564231416404
Mar 25 02:32:31 leia corosync[3614]:   [CPG   ] got procjoin message from cluster node 0x2 (r(0) ip(10.0.1.132) ) for pid 170666
Mar 25 02:32:31 leia pmxcfs[170666]: [status] notice: node has quorum
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: members: 1/162117, 2/170666, 3/28291
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: starting data syncronisation
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: received sync request (epoch 1/162117/00000011)
Mar 25 02:32:31 leia corosync[3614]:   [CPG   ] got procjoin message from cluster node 0x2 (r(0) ip(10.0.1.132) ) for pid 170666
Mar 25 02:32:31 leia pmxcfs[170666]: [status] notice: members: 1/162117, 2/170666, 3/28291
Mar 25 02:32:31 leia pmxcfs[170666]: [status] notice: starting data syncronisation
Mar 25 02:32:31 leia pmxcfs[170666]: [status] notice: received sync request (epoch 1/162117/00000011)
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: received all states
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: leader is 1/162117
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: synced members: 1/162117, 3/28291
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: waiting for updates from leader
Mar 25 02:32:31 leia pmxcfs[170666]: [status] notice: received all states
Mar 25 02:32:31 leia pmxcfs[170666]: [status] notice: received sync request (epoch 1/162117/00000011)
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: received all states
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: leader is 1/162117
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: synced members: 1/162117, 3/28291
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: waiting for updates from leader
Mar 25 02:32:31 leia pmxcfs[170666]: [status] notice: received all states
Mar 25 02:32:31 leia pmxcfs[170666]: [status] notice: all data is up to date
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: update complete - trying to commit (got 5 inode updates)
Mar 25 02:32:31 leia pmxcfs[170666]: [dcdb] notice: all data is up to date
Mar 25 02:32:32 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:32 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:32 leia pve-ha-crm[3663]: ipcc_send_rec failed: Connection refused
Mar 25 02:32:32 leia systemd[1]: Started The Proxmox VE cluster filesystem.
Mar 25 02:33:01 leia cron[3602]: (*system*vzdump) RELOAD (/etc/cron.d/vzdump)
Mar 25 02:33:28 leia pmxcfs[170666]: [status] notice: received log
Mar 25 02:33:41 leia pmxcfs[170666]: [dcdb] notice: data verification successful
Mar 25 02:35:17 leia pmxcfs[170666]: [status] notice: received log
Mar 25 02:45:01 leia systemd-timesyncd[2158]: interval/delta/delay/jitter/drift 2048s/-0.000s/0.011s/0.001s/+7ppm
Mar 25 02:51:58 leia pmxcfs[170666]: [status] notice: received log
Mar 25 02:58:15 leia rrdcached[3428]: flushing old values
Mar 25 02:58:15 leia rrdcached[3428]: rotating journals
Mar 25 02:58:15 leia rrdcached[3428]: started new journal /var/lib/rrdcached/journal/rrd.journal.1490407095.519031
Mar 25 02:58:15 leia rrdcached[3428]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1490399895.519043
Mar 25 03:00:03 leia pmxcfs[170666]: [status] notice: received log
Mar 25 03:05:16 leia pmxcfs[170666]: [status] notice: received log
Mar 25 03:05:16 leia kernel: [8690855.064176] drbd vm-18107-disk-1 luke: Preparing remote state change 3234369730 (primary_nodes=0, weak_nodes=0)
Mar 25 03:05:16 leia kernel: [8690855.064707] drbd vm-18107-disk-1 luke: Committing remote state change 3234369730
Mar 25 03:05:16 leia kernel: [8690855.064717] drbd vm-18107-disk-1 luke: peer( Secondary -> Primary )
Mar 25 03:09:23 leia kernel: [8691101.771363] drbd vm-18107-disk-1 luke: peer( Primary -> Secondary )
Mar 25 03:09:25 leia pmxcfs[170666]: [status] notice: received log
Mar 25 03:16:51 leia pmxcfs[170666]: [status] notice: received log

pveversion -v

Code:
proxmox-ve: 4.4-76 (running kernel: 4.4.35-1-pve)
pve-manager: 4.4-1 (running version: 4.4-1/eb2d6f1e)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.4.35-1-pve: 4.4.35-76
pve-kernel-4.4.13-2-pve: 4.4.13-58
pve-kernel-4.4.21-1-pve: 4.4.21-71
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-101
pve-firmware: 1.1-10
libpve-common-perl: 4.0-83
libpve-access-control: 4.0-19
libpve-storage-perl: 4.0-70
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.4-1
pve-qemu-kvm: 2.7.0-9
pve-container: 1.0-88
pve-firewall: 2.0-33
pve-ha-manager: 1.0-38
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 2.0.6-2
lxcfs: 2.0.5-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve13~bpo80
drbdmanage: 0.97.3-1

It does not happen every week, but at irregular intervals.
Is there a remedy for this?

Thank you and best regards,
seeby
 
Hi,
it sounds as if you are running the backup over the same network connection that corosync (cluster communication) uses, and that corosync gets thrown off by the resulting high latency.
Remedy: separate the storage network from the cluster network.

Udo
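
To check whether the symptoms really cluster around the backup window, the syslog can be summarized per hour. The following is only a rough sketch: the three patterns are copied from the excerpt in this thread, while the labels and the `summarize` helper are made up for illustration. It does not prove that the network is the cause; it only makes the timing easier to see.

```python
import re
from collections import Counter

# Message patterns taken from the syslog excerpt in this thread.
# The label names are hypothetical and only used for the summary.
PATTERNS = {
    "pvestatd_timeout": re.compile(r"pvestatd\[\d+\]: got timeout"),
    "drbd_pingack_late": re.compile(r"drbd .*PingAck did not arrive in time"),
    "pve_cluster_exit": re.compile(r"pve-cluster\.service: main process exited"),
}

# Classic syslog prefix, e.g. "Mar 25 00:23:30 leia ..."
PREFIX = re.compile(r"^(\w{3}\s+\d+\s+(\d{2}):\d{2}:\d{2})\s")


def summarize(lines):
    """Count matching events per (hour, label) and collect pve-cluster exit times."""
    counts = Counter()
    exits = []
    for line in lines:
        m = PREFIX.match(line)
        if not m:
            continue  # skip lines without a syslog timestamp
        stamp, hour = m.group(1), m.group(2)
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[(hour, label)] += 1
                if label == "pve_cluster_exit":
                    exits.append(stamp)
    return counts, exits


# A few lines from the excerpt above, used as demo input.
sample = [
    "Mar 25 00:23:30 leia kernel: [8681148.729074] drbd .drbdctrl luke: PingAck did not arrive in time.",
    "Mar 25 00:25:13 leia pvestatd[3631]: got timeout",
    "Mar 25 01:24:33 leia pvestatd[3631]: got timeout",
    "Mar 25 01:24:43 leia pvestatd[3631]: got timeout",
    "Mar 25 02:31:41 leia systemd[1]: pve-cluster.service: main process exited, code=killed, status=6/ABRT",
]

counts, exits = summarize(sample)
for (hour, label), n in sorted(counts.items()):
    print(f"{hour}:00  {label:<18} {n}")
for stamp in exits:
    print(f"pve-cluster exited at {stamp}")
```

For a real check, the lines of the full log file (for example an open file handle on the syslog) can be passed to `summarize` instead of `sample`. If the timeouts and DRBD ping failures pile up in the hours when vzdump is writing to the NAS, that would be consistent with the shared-network explanation.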
 
Hi,

thank you, that is probably the case. Then I will have to adjust the interface.

Regards
 
