Hello,
We have a 3-node cluster set up with DRBD9. Proxmox VE 4 in combination with DRBD9 really is very nice!
One node crashed and we need to bring it back into the cluster, but we cannot get this done.
We reinstalled it in the same way as the others, following https://pve.proxmox.com/wiki/DRBD9 and https://pve.proxmox.com/wiki/Proxmox_VE_4.x_Cluster. Now we want to remove the failed node from the cluster on the primary node. Unfortunately, we cannot delete an offline node from the cluster; the attempt produces these messages:
"drbd .drbdctrl: Auto-promote failed: Multiple primaries not allowed by config"
We think the easiest way would be to remove the node from the cluster and add it again, so that all configuration and existing volumes are synchronised automatically.
Here are some details:
drbdmanage list-nodes
+----------------------------------------------------------------------+
| Name | Pool Size | Pool Free | Site | | State |
|----------------------------------------------------------------------|
| hyp10 | 16777216 | 0 | N/A | | ok |
| hyp20 | 16777216 | 16521161 | N/A | | OFFLINE |
| hyp30 | 16777216 | 9690519 | N/A | | ok |
+----------------------------------------------------------------------+
drbdmanage remove-node -f hyp20
You are going to remove the node 'hyp20' from the cluster. This will
remove all resources from the node.
Please confirm:
yes/no: yes
Jan 5 23:09:43 hyp30 kernel: [717721.879728] drbd .drbdctrl: Auto-promote failed: Multiple primaries not allowed by config
Jan 5 23:09:48 hyp30 kernel: [717726.880759] drbd .drbdctrl: Auto-promote failed: Multiple primaries not allowed by config
Jan 5 23:09:52 hyp30 kernel: [717731.278239] drbd .drbdctrl: Auto-promote failed: Multiple primaries not allowed by config
Jan 5 23:09:56 hyp30 kernel: [717735.163982] drbd .drbdctrl: Auto-promote failed: Multiple primaries not allowed by config
Jan 5 23:10:00 hyp30 kernel: [717738.849674] drbd .drbdctrl: Auto-promote failed: Multiple primaries not allowed by config
Jan 5 23:10:05 hyp30 kernel: [717743.614836] drbd .drbdctrl: Auto-promote failed: Multiple primaries not allowed by config
Traceback (most recent call last):
  File "/usr/bin/drbdmanage", line 30, in <module>
    drbdmanage_client.main()
  File "/usr/lib/python2.7/dist-packages/drbdmanage_client.py", line 3520, in main
    client.run()
  File "/usr/lib/python2.7/dist-packages/drbdmanage_client.py", line 1130, in run
    self.parse(sys.argv[1:])
  File "/usr/lib/python2.7/dist-packages/drbdmanage_client.py", line 991, in parse
    args.func(args)
  File "/usr/lib/python2.7/dist-packages/drbdmanage_client.py", line 1301, in cmd_remove_node
    dbus.String(node_name), dbus.Boolean(force)
  File "/usr/lib/python2.7/dist-packages/dbus/proxies.py", line 70, in __call__
    return self._proxy_method(*args, **keywords)
  File "/usr/lib/python2.7/dist-packages/dbus/proxies.py", line 145, in __call__
    **keywords)
  File "/usr/lib/python2.7/dist-packages/dbus/connection.py", line 651, in call_blocking
    message, timeout)
dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
We also followed the permanent-node-failure section of the DRBD user's guide, http://drbd.linbit.com/users-guide-9.0/s-node-failure.html#s-perm-node-failure, but without any success. As far as we can tell, the D-Bus NoReply above is only a follow-on effect: the drbdmanage server seems to block on the failed auto-promote until the call times out. How can we reconnect the resource .drbdctrl?
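The commands we would expect to apply here, although this is only a guess on our side since the control volume is managed by drbdmanage itself, are:

drbdadm adjust .drbdctrl
drbdadm connect .drbdctrl

Is that the right approach for the control volume, or does drbdmanage have to handle this itself?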
Does anybody have advice on how to restore this cluster?
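For reference, once the stuck removal goes through, the re-add procedure we intend to follow, pieced together from the DRBD9 wiki page linked above (the IP placeholder is ours), is roughly:

# on a healthy node, e.g. hyp30:
drbdmanage remove-node hyp20
drbdmanage add-node hyp20 <IP of hyp20>
# print the join command for the new node:
drbdmanage howto-join hyp20
# then run the printed 'drbdmanage join ...' command on the reinstalled hyp20

Please tell us if that is the wrong way to go about it.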
It would be great if these steps were added to the Proxmox wiki; they will probably be needed quite a few times in the future.