Hi all,
I have a four-node cluster up and running.
Now I tried to add a qdevice that runs inside a Docker container on a Synology NAS.
The containerized qdevice was added to the cluster successfully, but it does not provide a vote.
I can ssh into the device from any node and vice versa without any certificate issues.
So the problem does not seem to be with the cluster itself.
Code:
root@pveNode0:~# pvecm status
Cluster information
-------------------
Name: BSB-Datacenter
Config Version: 31
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Fri Feb 17 01:48:17 2023
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000001
Ring ID: 1.335
Quorate: Yes
Votequorum information
----------------------
Expected votes: 5
Highest expected: 5
Total votes: 4
Quorum: 3
Flags: Quorate Qdevice
Membership information
----------------------
Nodeid Votes Qdevice Name
0x00000001 1 A,NV,NMW x.x.x.20 (local)
0x00000002 1 A,NV,NMW x.x.x.21
0x00000003 1 A,NV,NMW x.x.x.22
0x00000004 1 A,NV,NMW x.x.x.23
0x00000000 0 Qdevice (votes 1)
root@pveNode0:~#
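If I read the vote counts correctly, they are consistent with the qdevice being registered but not voting. A small sanity check of the arithmetic, assuming quorum is a simple majority of the expected votes (the numbers are copied from the output above; the variable names are only illustrative):

```shell
#!/bin/sh
# Values copied from the pvecm status output above.
expected_votes=5   # 4 nodes + 1 configured qdevice vote
total_votes=4      # only the four nodes currently vote

# Assumption: quorum is a simple majority of the expected votes.
quorum=$(( expected_votes / 2 + 1 ))
# Votes that are configured but not currently present.
missing=$(( expected_votes - total_votes ))

echo "quorum=$quorum missing=$missing"
```

The one missing vote matches the `0x00000000 0 Qdevice (votes 1)` line, so the cluster stays quorate but the qdevice contributes nothing.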
The corosync configs are in sync across all nodes.
Code:
root@pveNode0:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pveNode0
    nodeid: 1
    quorum_votes: 1
    ring0_addr: x.x.x.20
  }
  node {
    name: pveNode1
    nodeid: 2
    quorum_votes: 1
    ring0_addr: x.x.x.21
  }
  node {
    name: pveNode2
    nodeid: 3
    quorum_votes: 1
    ring0_addr: x.x.x.22
  }
  node {
    name: pveNode3
    nodeid: 4
    quorum_votes: 1
    ring0_addr: x.x.x.23
  }
}

quorum {
  device {
    model: net
    net {
      algorithm: ffsplit
      host: x.x.x.30
      tls: on
    }
    votes: 1
  }
  provider: corosync_votequorum
}

totem {
  cluster_name: BSB-Datacenter
  config_version: 31
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
After a bit of investigation I found that the qnetd service in the qdevice container does not start:
Code:
root@qdevice:~# systemctl start corosync-qnetd
Job for corosync-qnetd.service failed because the control process exited with error code.
See "systemctl status corosync-qnetd.service" and "journalctl -xe" for details.
root@qdevice:~# systemctl status corosync-qnetd
● corosync-qnetd.service - Corosync Qdevice Network daemon
Loaded: loaded (/lib/systemd/system/corosync-qnetd.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2023-02-17 00:40:04 UTC; 11s ago
Docs: man:corosync-qnetd
Process: 59 ExecStart=/usr/bin/corosync-qnetd -f $COROSYNC_QNETD_OPTIONS (code=exited, status=1/FAILURE)
Main PID: 59 (code=exited, status=1/FAILURE)
root@qdevice:~# journalctl -xe
-- No entries --
The Dockerfile the container is built from contains:
Code:
ARG TAG=latest
FROM debian:${TAG}
RUN echo 'debconf debconf/frontend select teletype' | debconf-set-selections
RUN apt-get update
RUN apt-get dist-upgrade -y
RUN apt-get install -y --no-install-recommends \
    systemd \
    systemd-sysv \
    cron \
    anacron \
    corosync-qnetd \
    openssh-server \
    mc
RUN sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config
RUN echo 'root:i229l1NF!proxmox' | chpasswd
RUN chown -R coroqnetd:coroqnetd /etc/corosync/
RUN apt-get clean
RUN rm -rf \
    /var/lib/apt/lists/* \
    /var/log/alternatives.log \
    /var/log/apt/history.log \
    /var/log/apt/term.log \
    /var/log/dpkg.log
RUN systemctl mask -- \
    dev-hugepages.mount \
    sys-fs-fuse-connections.mount
RUN rm -f \
    /etc/machine-id \
    /var/lib/dbus/machine-id
FROM debian:${TAG}
COPY --from=0 / /
ENV container docker
STOPSIGNAL SIGRTMIN+3
VOLUME [ "/sys/fs/cgroup", "/run", "/run/lock", "/tmp" ]
CMD [ "/sbin/init" ]
And the docker compose file is this one:
YAML:
version: "3.5"
services:
  qdevice:
    container_name: qdevice
    image: 'bsb/qdevice'
    build:
      context: ./context
      dockerfile: ./Dockerfile
    hostname: qdevice
    restart: unless-stopped
    volumes:
      - /volume1/docker/qnetd/corosync-data:/etc/corosync
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    ports:
      - '22:22'
      - '5403-5412:5403-5412/udp'
    networks:
      - qdevice-net
networks:
  qdevice-net:
    name: qdevice-net
    driver: bridge
My assumption now is that something is wrong with my container build or compose.yml.
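One thing I could check to narrow this down: the compose file bind-mounts a host directory over /etc/corosync, which would hide anything the package placed there while the image was built. Below is a minimal sketch of such a check, to be run inside the container. `check_dir` is a hypothetical helper of my own, and the path /etc/corosync/qnetd/nssdb is my assumption about where qnetd keeps its certificate database; I have not verified it on this container.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory exists and contains entries.
check_dir() {
    if [ ! -d "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ -z "$(ls -A "$1")" ]; then
        echo "empty: $1"
        return 1
    fi
    echo "ok: $1"
}

# Assumed location of the qnetd certificate database (not verified).
check_dir /etc/corosync/qnetd/nssdb || echo "the bind mount may be hiding files from the image"
```

Since journalctl shows no entries, starting the ExecStart command from the unit by hand (`/usr/bin/corosync-qnetd -f`, optionally with `-d` for debug logging) should also print the actual error to the terminal.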
Any idea why the qnetd service does not start at all?
Cheers
Stephan