Hello,
I have a strange problem since upgrading from 2.1 to 2.3.
My setup is a cluster of two servers. One of them
works without problems, but several times a day the other
server shows a really high load (3-5) and all clients on this host
freeze.
The only way to get the server working again is to restart it.
This server has a RAID 5 on SAS HDDs; after I added "elevator=deadline" to
the grub configuration, the problem does not show up as often.
In syslog I found the following log entries, and I think the problems start right after them.
Code:
May 1 04:34:14 desokvm1 pvestatd[1934]: WARNING: closeing with write buffer at /usr/share/perl5/IO/Multiplex.pm line 913.
May 1 05:13:21 desokvm1 corosync[1565]: [TOTEM ] A processor failed, forming new configuration.
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] CLM CONFIGURATION CHANGE
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] New Configuration:
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] #011r(0) ip(10.0.3.1)
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] #011r(0) ip(10.0.3.2)
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] Members Left:
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] Members Joined:
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] CLM CONFIGURATION CHANGE
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] New Configuration:
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] #011r(0) ip(10.0.3.1)
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] #011r(0) ip(10.0.3.2)
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] Members Left:
May 1 05:13:30 desokvm1 corosync[1565]: [CLM ] Members Joined:
May 1 05:13:30 desokvm1 corosync[1565]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
May 1 05:13:32 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 10
May 1 05:13:33 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 20
May 1 05:13:34 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 30
May 1 05:13:35 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 40
May 1 05:13:36 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 50
May 1 05:13:37 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 60
May 1 05:13:38 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 70
May 1 05:13:39 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 80
May 1 05:13:40 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 90
May 1 05:13:41 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 100
May 1 05:13:41 desokvm1 pmxcfs[1434]: [dcdb] notice: cpg_send_message retried 100 times
May 1 05:13:41 desokvm1 pmxcfs[1434]: [status] crit: cpg_send_message failed: 6
May 1 05:13:42 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 10
May 1 05:13:43 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 20
May 1 05:13:44 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 30
May 1 05:13:45 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 40
May 1 05:13:46 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 50
May 1 05:13:47 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 60
May 1 05:13:48 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 70
May 1 05:13:49 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 80
May 1 05:13:50 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 90
May 1 05:13:51 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 100
May 1 05:13:51 desokvm1 pmxcfs[1434]: [dcdb] notice: cpg_send_message retried 100 times
May 1 05:13:51 desokvm1 pmxcfs[1434]: [status] crit: cpg_send_message failed: 6
May 1 05:13:52 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 10
May 1 05:13:53 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 20
May 1 05:13:54 desokvm1 corosync[1565]: [CPG ] chosen downlist: sender r(0) ip(10.0.3.1) ; members(old:2 left:0)
May 1 05:13:54 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 30
May 1 05:13:55 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 40
May 1 05:13:56 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 50
May 1 05:13:57 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 60
May 1 05:13:58 desokvm1 pmxcfs[1434]: [status] notice: cpg_send_message retry 70
May 1 05:13:59 desokvm1 corosync[1565]: [MAIN ] Completed service synchronization, ready to provide service.
May 1 05:13:59 desokvm1 pmxcfs[1434]: [dcdb] notice: cpg_send_message retried 75 times
Does anyone know what I can do to solve the problem?
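To put a number on the log excerpt above, a small script can measure how long each corosync membership change takes and how many pmxcfs sends fail during it. This is only a rough sketch: the function names are made up for illustration and are not part of any Proxmox or corosync tooling, the year is an assumption because syslog timestamps omit it, and the matched strings are simply copied from the log lines above.

```python
import re
from datetime import datetime

# Matches the leading syslog timestamp, e.g. "May 1 05:13:21".
STAMP = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s")

# Four lines copied from the excerpt above, used as sample input.
SAMPLE = [
    "May 1 05:13:21 desokvm1 corosync[1565]: [TOTEM ] A processor failed, forming new configuration.",
    "May 1 05:13:41 desokvm1 pmxcfs[1434]: [status] crit: cpg_send_message failed: 6",
    "May 1 05:13:51 desokvm1 pmxcfs[1434]: [status] crit: cpg_send_message failed: 6",
    "May 1 05:13:59 desokvm1 corosync[1565]: [MAIN ] Completed service synchronization, ready to provide service.",
]


def parse_stamp(line, year=2000):
    """Return the timestamp of a syslog line, or None if it has none.

    The year is an assumption: classic syslog lines do not record it.
    """
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")


def outage_windows(lines):
    """Yield (start, end, failed_sends) for each membership change.

    A window opens at "A processor failed" and closes at
    "Completed service synchronization".
    """
    start = None
    failed = 0
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue
        if start is None:
            if "A processor failed" in line:
                start, failed = stamp, 0
        elif "cpg_send_message failed" in line:
            failed += 1
        elif "Completed service synchronization" in line:
            yield start, stamp, failed
            start = None


for begin, end, failed in outage_windows(SAMPLE):
    seconds = int((end - begin).total_seconds())
    print(f"{begin:%H:%M:%S} -> {end:%H:%M:%S} ({seconds} s), {failed} failed sends")
# prints: 05:13:21 -> 05:13:59 (38 s), 2 failed sends
```

To run it against a real log, replace SAMPLE with the lines of the syslog file, for example `open("/var/log/syslog")` on a Debian-style system. Comparing the windows against the times the load rises would show whether the two really coincide.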