pmxcfs - dcdb failure

Angelo

Hi all,

Hope everyone is keeping safe and sane :)

We run a fairly busy production Proxmox cluster of 12 nodes, each running 40-70 VMs and configured with SSD RAIDZ2 pools, 72-80 cores (Scalable), and 640-768GB of RAM.

We are in the process of migrating another 12 nodes across to Proxmox from SmartOS.

Last week, we suffered the following failure on one of our nodes (see below) - it hasn't happened before - which resulted in us restarting the cluster service on the affected node (which appears to have fixed the issue). However, the initial dcdb failure also resulted in what appeared to be IO errors associated with a number of hung VMs, which had to be restarted:

May 07 07:44:11 pve-11 pmxcfs[38522]: [dcdb] notice: data verification successful
May 07 08:12:14 pve-11 pmxcfs[38522]: [dcdb] crit: serious internal error - stop cluster connection
May 07 08:12:16 pve-11 pmxcfs[38522]: [dcdb] crit: can't initialize service
May 07 08:28:38 pve-11 corosync[39311]: [CPG ] *** 0x562f56194180 can't mcast to group pve_dcdb_v1 state:1, error:12
May 07 08:35:23 pve-11 pmxcfs[68736]: [dcdb] crit: cpg_initialize failed: 2
May 07 08:35:23 pve-11 pmxcfs[68736]: [dcdb] crit: can't initialize service
May 07 08:35:29 pve-11 pmxcfs[68736]: [dcdb] crit: cpg_initialize failed: 2
May 07 08:35:30 pve-11 pmxcfs[69920]: [dcdb] crit: cpg_initialize failed: 2
May 07 08:35:30 pve-11 pmxcfs[69920]: [dcdb] crit: can't initialize service
May 07 08:35:36 pve-11 pmxcfs[69920]: [dcdb] crit: cpg_initialize failed: 2
May 07 08:35:37 pve-11 pmxcfs[71126]: [dcdb] crit: cpg_initialize failed: 2
May 07 08:35:37 pve-11 pmxcfs[71126]: [dcdb] crit: can't initialize service
May 07 08:35:43 pve-11 pmxcfs[71126]: [dcdb] notice: members: 11/71126
May 07 08:35:43 pve-11 pmxcfs[71126]: [dcdb] notice: all data is up to date
May 07 08:35:46 pve-11 pmxcfs[71126]: [dcdb] notice: members: 1/348122, 2/340314, 3/2738130, 4/2814404, 5/4166389, 6/1180260, 7/3551791, 8/1501512, 9/1288849, 10/7438, 11/71126, 12/39377
May 07 08:35:46 pve-11 pmxcfs[71126]: [dcdb] notice: starting data syncronisation
May 07 08:35:48 pve-11 pmxcfs[71126]: [dcdb] notice: received sync request (epoch 1/348122/00000003)
May 07 08:35:48 pve-11 pmxcfs[71126]: [dcdb] notice: received all states
May 07 08:35:48 pve-11 pmxcfs[71126]: [dcdb] notice: leader is 1/348122
May 07 08:35:48 pve-11 pmxcfs[71126]: [dcdb] notice: synced members: 1/348122, 2/340314, 3/2738130, 4/2814404, 5/4166389, 6/1180260, 7/3551791, 8/1501512, 9/1288849, 10/7438, 12/39377
May 07 08:35:48 pve-11 pmxcfs[71126]: [dcdb] notice: waiting for updates from leader
May 07 08:35:48 pve-11 pmxcfs[71126]: [dcdb] notice: dfsm_deliver_queue: queue length 12
May 07 08:35:48 pve-11 pmxcfs[71126]: [dcdb] notice: update complete - trying to commit (got 19 inode updates)
May 07 08:35:48 pve-11 pmxcfs[71126]: [dcdb] notice: all data is up to date
May 07 08:35:48 pve-11-dc5-r08-vox-teraco-jb1 pmxcfs[71126]: [dcdb] notice: dfsm_deliver_sync_queue: queue length 15
May 07 08:35:48 pve-11-dc5-r08-vox-teraco-jb1 pmxcfs[71126]: [dcdb] notice: dfsm_deliver_queue: queue length 1

Also, I have attached additional logs from when the issue started (08h12) to when the cluster was restarted (08h35).

As mentioned, we've not had this issue before, but it is concerning.

In addition:

# pveversion --verbose
proxmox-ve: 6.1-2 (running kernel: 5.3.18-2-pve)
pve-manager: 6.1-8 (running version: 6.1-8/806edfe1)
pve-kernel-helper: 6.1-7
pve-kernel-5.3: 6.1-5
pve-kernel-5.3.18-2-pve: 5.3.18-2
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 2.0.1-1+pve8
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libpve-access-control: 6.0-6
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.0-17
libpve-guest-common-perl: 3.0-5
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-5
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-3
pve-cluster: 6.1-4
pve-container: 3.0-22
pve-docs: 6.1-6
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.0-10
pve-firmware: 3.0-6
pve-ha-manager: 3.0-9
pve-i18n: 2.0-4
pve-qemu-kvm: 4.1.1-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.1-7
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1

Can this be entirely attributed to a cluster service failure on the affected node, or should we be looking for anything else?

Thanks and regards,

Angelo.
 

Attachments

  • New Text Document.txt (38.1 KB)
hi,

Can this be entirely attributed to a cluster service failure on the affected node, or should we be looking for anything else?
based on this log entry:

Code:
May 07 08:28:38 pve-11 corosync[39311]:   [CPG   ] *** 0x562f56194180 can't mcast to group pve_dcdb_v1 state:1, error:12

i'd guess you have a problem with multicast. check here[0] and here[1]

[0]: https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_cluster_network
[1]: https://pve.proxmox.com/wiki/Multicast_notes
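
for a quick check there is the omping test from [1] - a sketch only, assuming omping is installed on the nodes; run it at roughly the same time on each node you include (the hostnames below are just examples from your cluster, adjust as needed):

Code:
# run simultaneously on every node listed; sends a short burst of multicast test traffic
omping -c 10000 -i 0.001 -F -q pve-10 pve-11 pve-12

if multicast is being dropped (e.g. by IGMP snooping on the switches), you'll see it in the multicast loss figures.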

edit:

also i just saw this in the attached log:

Code:
May  7 08:12:14 pve-11 pmxcfs[38522]: [database] crit: commit transaction failed: database or disk is full#010
May  7 08:12:14 pve-11 pmxcfs[38522]: [database] crit: rollback transaction failed: cannot rollback - no transaction is active#010
May  7 08:12:14 pve-11 pmxcfs[38522]: [dcdb] crit: serious internal error - stop cluster connection
May  7 08:12:14 pve-11 pve-firewall[13203]: status update error: close (rename) atomic file '/var/lib/pve-firewall/ipsetcmdlist1' failed: No space left on device

so it may be a disk issue as well..
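
if it happens again, it would be worth checking the affected filesystems straight away - a minimal sketch (the paths are taken from the log entries above, / and /etc/pve added for completeness):

Code:
# free space on the root filesystem, the pmxcfs mount and the paths from the errors
df -h / /etc/pve /var/lib/pve-cluster /var/lib/pve-firewall
# "No space left on device" can also mean inode exhaustion rather than blocks
df -i /

if any of those are at 100% when the errors hit, that would explain the 'database or disk is full' message.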
 
Hi Oguz,

Thanks for the response.

1. My understanding is that multicast (or the lack thereof) shouldn't be an issue, because the latest version of Corosync is unicast/UDP-based?

2. I did see the log entry ('database or disk is full#010') but pmxcfs is the shared cluster file-system - there are no 'user' files in there and it's not something we manage.
  • is there a reason that this would report as being full?

Kind regards,

Angelo.
 
2. I did see the log entry ('database or disk is full#010') pmxcfs is the shared cluster file-system - there are no 'user' files in there and it's not something we manage.
  • is there a reason that this would report as being full?

the next entry is for /var/lib, which isn't contained in pmxcfs, so it may be that your node is running out of space?
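
note that on ZFS a single dataset can hit a 'database or disk is full' style error even when the pool has plenty of free space, e.g. if a quota or reservation is set. a rough check, assuming the default PVE-on-ZFS layout with the root filesystem on rpool/ROOT/pve-1:

Code:
# pool-level vs dataset-level view of free space
zpool list rpool
zfs list -o space rpool rpool/ROOT/pve-1
# any quotas or reservations limiting the root dataset?
zfs get quota,refquota,reservation,refreservation rpool/ROOT/pve-1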

are things working normally in the cluster after the restart of the service?
you can post your /etc/pve/corosync.conf (please remove any public IPs) and output of pvecm status
 
Hi :)

Not sure what you are referring to, but it's clear from the logs that the disk-full error is in relation to pmxcfs - and the node zpool has about 1.2TB available (as it did last week):

May 7 08:12:14 pve-11 pmxcfs[38522]: [database] crit: commit transaction failed: database or disk is full#010
May 7 08:12:14 pve-11 pmxcfs[38522]: [database] crit: rollback transaction failed: cannot rollback - no transaction is active#010
May 7 08:12:14 pve-11 pmxcfs[38522]: [dcdb] crit: serious internal error - stop cluster connection

pvecm status:

Code:
Cluster information
-------------------
Name:             pve-clst-01
Config Version:   40
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Mon May 11 17:08:10 2020
Quorum provider:  corosync_votequorum
Nodes:            12
Node ID:          0x0000000b
Ring ID:          1.157d56
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   12
Highest expected: 12
Total votes:      12
Quorum:           7
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.6.245
0x00000002          1 192.168.6.17
0x00000003          1 192.168.6.149
0x00000004          1 192.168.6.98
0x00000005          1 192.168.6.171
0x00000006          1 192.168.6.174
0x00000007          1 192.168.6.173
0x00000008          1 192.168.6.140
0x00000009          1 192.168.6.41
0x0000000a          1 192.168.6.132
0x0000000b          1 192.168.6.133 (local)
0x0000000c          1 192.168.6.134

corosync.conf:

Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve-02
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.6.245
    ring1_addr: 192.168.3.58
  }
  node {
    name: pve-03
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.6.149
    ring1_addr: 192.168.3.55
  }
  node {
    name: pve-04
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 192.168.6.98
    ring1_addr: 192.168.3.98
  }
  node {
    name: pve-05
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 192.168.6.171
    ring1_addr: 192.168.3.64
  }
  node {
    name: pve-06
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.6.17
    ring1_addr: 192.168.3.191
  }
  node {
    name: pve-07
    nodeid: 9
    quorum_votes: 1
    ring0_addr: 192.168.6.41
    ring1_addr: 192.168.3.88
  }
  node {
    name: pve-10
    nodeid: 10
    quorum_votes: 1
    ring0_addr: 192.168.6.132
    ring1_addr: 192.168.3.188
  }
  node {
    name: pve-1001
    nodeid: 8
    quorum_votes: 1
    ring0_addr: 192.168.6.140
    ring1_addr: 192.168.3.90
  }
  node {
    name: pve-11
    nodeid: 11
    quorum_votes: 1
    ring0_addr: 192.168.6.133
    ring1_addr: 192.168.3.189
  }
  node {
    name: pve-12
    nodeid: 12
    quorum_votes: 1
    ring0_addr: 192.168.6.134
    ring1_addr: 192.168.3.155
  }
  node {
    name: pve-2001
    nodeid: 7
    quorum_votes: 1
    ring0_addr: 192.168.6.173
    ring1_addr: 192.168.3.127
  }
  node {
    name: pve-2002
    nodeid: 6
    quorum_votes: 1
    ring0_addr: 192.168.6.174
    ring1_addr: 192.168.3.128
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: pve-clst-01
  config_version: 40
  interface {
    bindnetaddr: 192.168.6.245
    ringnumber: 0
  }

  interface {
    bindnetaddr: 192.168.3.58
    ringnumber: 1
  }
  ip_version: ipv4
  rrp_mode: passive
  secauth: on
  version: 2
}

Hope this helps?

Kind regards,

Angelo.
 
hi,

Any feedback on this issue? Is there anything glaring that is a problem from the logs/configs?

configs look normal...

can you post zfs list && zpool list from the node?
 
Here you go...

Code:
root@pve-11:~# zfs list && zpool list
NAME                              USED  AVAIL     REFER  MOUNTPOINT
rpool                            7.59T  2.02T      222K  /rpool
rpool/ROOT                       3.54G  2.02T      205K  /rpool/ROOT
rpool/ROOT/pve-1                 3.54G  2.02T     3.54G  /
rpool/data                        598G  2.02T      205K  /rpool/data
rpool/data/vm-1281-disk-0        26.6G  2.03T     20.4G  -
rpool/data/vm-2037-disk-0         571G  2.02T      569G  -
rpool/vmdata-16K                 7.00T  2.02T      205K  /rpool/vmdata-16K
rpool/vmdata-16K/vm-1062-disk-0  19.3G  2.02T     19.3G  -
rpool/vmdata-16K/vm-1062-disk-1  15.7G  2.02T     15.7G  -
rpool/vmdata-16K/vm-1084-disk-0  66.4G  2.02T     66.4G  -
rpool/vmdata-16K/vm-1088-disk-0  48.0G  2.02T     39.8G  -
rpool/vmdata-16K/vm-1120-disk-0   392G  2.02T      392G  -
rpool/vmdata-16K/vm-1126-disk-0  11.0G  2.02T     11.0G  -
rpool/vmdata-16K/vm-1202-disk-0  6.64G  2.02T     6.64G  -
rpool/vmdata-16K/vm-1216-disk-0  4.02G  2.02T     4.02G  -
rpool/vmdata-16K/vm-1239-disk-0   437G  2.02T      437G  -
rpool/vmdata-16K/vm-1248-disk-0   278G  2.02T      278G  -
rpool/vmdata-16K/vm-1274-disk-0  81.2G  2.02T     81.2G  -
rpool/vmdata-16K/vm-1284-disk-0   330G  2.02T      330G  -
rpool/vmdata-16K/vm-1306-disk-0  14.2G  2.02T     14.2G  -
rpool/vmdata-16K/vm-1334-disk-0  20.8G  2.02T     20.8G  -
rpool/vmdata-16K/vm-1471-disk-0   371G  2.02T      371G  -
rpool/vmdata-16K/vm-1492-disk-0  3.04G  2.02T     3.04G  -
rpool/vmdata-16K/vm-1492-disk-1   159G  2.02T      159G  -
rpool/vmdata-16K/vm-1531-disk-0  73.9G  2.02T     73.9G  -
rpool/vmdata-16K/vm-1723-disk-0  55.4G  2.02T     51.7G  -
rpool/vmdata-16K/vm-2037-disk-0   382G  2.02T      382G  -
rpool/vmdata-16K/vm-2218-disk-0  29.1G  2.02T     29.1G  -
rpool/vmdata-16K/vm-2220-disk-0  39.9G  2.02T     39.9G  -
rpool/vmdata-16K/vm-2221-disk-0  3.91G  2.02T     3.91G  -
rpool/vmdata-16K/vm-2221-disk-1   119K  2.02T      119K  -
rpool/vmdata-16K/vm-2222-disk-0  5.72G  2.02T     5.72G  -
rpool/vmdata-16K/vm-2222-disk-1  2.29M  2.02T     2.29M  -
rpool/vmdata-16K/vm-2227-disk-0  11.2G  2.02T     11.2G  -
rpool/vmdata-16K/vm-2252-disk-0  2.42G  2.02T     2.42G  -
rpool/vmdata-16K/vm-2257-disk-0  3.10G  2.02T     3.10G  -
rpool/vmdata-16K/vm-2258-disk-0  12.1G  2.02T     12.1G  -
rpool/vmdata-16K/vm-2258-disk-1  3.44M  2.02T     3.44M  -
rpool/vmdata-16K/vm-2753-disk-0  16.7G  2.02T     16.7G  -
rpool/vmdata-16K/vm-2835-disk-0  6.18G  2.02T     6.18G  -
rpool/vmdata-16K/vm-2875-disk-0  39.0G  2.02T     39.0G  -
rpool/vmdata-16K/vm-2875-disk-1  3.84T  2.02T     3.84T  -
rpool/vmdata-16K/vm-2877-disk-0  6.33G  2.02T     6.33G  -
rpool/vmdata-16K/vm-2881-disk-0  1.91G  2.02T     1.91G  -
rpool/vmdata-16K/vm-2882-disk-0  11.2G  2.02T     11.2G  -
rpool/vmdata-16K/vm-2883-disk-0  3.80G  2.02T     3.80G  -
rpool/vmdata-16K/vm-2885-disk-0  3.58G  2.02T     3.58G  -
rpool/vmdata-16K/vm-2886-disk-0  3.56G  2.02T     3.56G  -
rpool/vmdata-16K/vm-2887-disk-0  10.4G  2.02T     10.4G  -
rpool/vmdata-16K/vm-2888-disk-0  26.5G  2.02T     26.5G  -
rpool/vmdata-16K/vm-2891-disk-0  53.5G  2.02T     53.5G  -
rpool/vmdata-16K/vm-2990-disk-0  4.36G  2.02T     4.36G  -
rpool/vmdata-16K/vm-2992-disk-0  4.05G  2.02T     4.05G  -
rpool/vmdata-16K/vm-2993-disk-0  1.96G  2.02T     1.96G  -
rpool/vmdata-16K/vm-2994-disk-0  11.0G  2.02T     11.0G  -
rpool/vmdata-16K/vm-2995-disk-0  7.42G  2.02T     7.42G  -
rpool/vmdata-16K/vm-2996-disk-0  4.56G  2.02T     4.56G  -
rpool/vmdata-16K/vm-3000-disk-0  6.03G  2.02T     6.03G  -
rpool/vmdata-16K/vm-3004-disk-0  2.16G  2.02T     2.16G  -
rpool/vmdata-16K/vm-3008-disk-0  2.01G  2.02T     2.01G  -
rpool/vmdata-16K/vm-3009-disk-0  28.4G  2.02T     28.4G  -
rpool/vmdata-16K/vm-3010-disk-0  29.0G  2.02T     29.0G  -
rpool/vmdata-16K/vm-3011-disk-0  29.6G  2.02T     29.6G  -
rpool/vmdata-16K/vm-3014-disk-0  2.08G  2.02T     2.08G  -
rpool/vmdata-16K/vm-3015-disk-0  2.10G  2.02T     2.10G  -
rpool/vmdata-16K/vm-3019-disk-0  2.11G  2.02T     2.11G  -
rpool/vmdata-16K/vm-3019-disk-1   119K  2.02T      119K  -
rpool/vmdata-16K/vm-3019-disk-2   119K  2.02T      119K  -
rpool/vmdata-16K/vm-3020-disk-0  2.89G  2.02T     2.89G  -
rpool/vmdata-16K/vm-3023-disk-0  1.96G  2.02T     1.96G  -
rpool/vmdata-16K/vm-3024-disk-0  24.6G  2.02T     24.6G  -
rpool/vmdata-16K/vm-3025-disk-0  1.96G  2.02T     1.96G  -
rpool/vmdata-16K/vm-3025-disk-1   119K  2.02T      119K  -
rpool/vmdata-16K/vm-3026-disk-0  2.11G  2.02T     2.11G  -
rpool/vmdata-16K/vm-3027-disk-0  2.15G  2.02T     2.15G  -
rpool/vmdata-32K                  205K  2.02T      205K  /rpool/vmdata-32K
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
rpool  14.0T  10.7T  3.29T        -         -    31%    76%  1.00x    ONLINE  -
root@pve-11:~#
 
okay,

can you also post ls -ali /var/lib/pve-cluster to see the size of the db
 
Hi Oguz,

Here you go:

Code:
root@pve-11:~# ls -ali /var/lib/pve-cluster
total 976
 21984 drwxr-xr-x  3 root root       7 May  8 07:54 .
     5 drwxr-xr-x 38 root root      38 May  7 10:49 ..
 66979 drwxr-xr-x  2 root root       3 Apr  1 06:28 backup
 66182 -rw-------  1 root root  962560 May 14 13:55 config.db
175077 -rw-------  1 root root   32768 May 14 13:55 config.db-shm
175076 -rw-------  1 root root 4148872 May 14 13:56 config.db-wal
 66566 -rw-------  1 root root       0 Mar 31 09:11 .pmxcfs.lockfile
root@pve-11:~#

Kind regards,

Angelo.
 
everything looks okay, i'm really not sure why this is happening...

if this happens again, please open a bug report on https://bugzilla.proxmox.com

to move things forward, you could try reinstalling that server

if you get the issue again, you can also try starting pmxcfs in local mode (stop the service and start pmxcfs manually with the '-l' flag) - that way we can figure out whether it's a corosync issue
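
roughly like this - a sketch based on the standard pmxcfs recovery steps, to be run only on the affected node:

Code:
# stop the cluster filesystem and corosync on the affected node
systemctl stop pve-cluster
systemctl stop corosync
# start pmxcfs in local mode (ignores corosync, forces quorum locally)
pmxcfs -l
# afterwards, stop the local instance and bring the normal services back up
killall pmxcfs
systemctl start corosync
systemctl start pve-cluster

if pmxcfs behaves fine in local mode, the problem is more likely on the corosync side.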
 
