A very curious situation: it occurred before on another node of mine, which I've been slowly unloading so I can reinstall it completely, but it has now happened on a more recently installed node as well.
In short, MariaDB would not restart inside a container, and ps auxww failed from the shell I had opened with pct enter $ID, so I had to ssh into the container instead. Details and output follow.
MariaDB crashed for some reason (it does this much more frequently under PVE 5.3 and 5.4 with MariaDB 10.2 and 10.3; it never used to), and I could not restart it:
Code:
# service mysql start
Job for mariadb.service failed because the control process exited with error code.
See "systemctl status mariadb.service" and "journalctl -xe" for details.
# systemctl status mariadb.service
* mariadb.service - MariaDB 10.3.18 database server
Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2019-12-04 16:09:27 EST; 41s ago
Docs: man:mysqld(8)
https://mariadb.com/kb/en/library/systemd/
Process: 11181 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
Process: 11184 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=1/FAILURE)
Dec 04 16:09:26 test systemd[1]: Starting MariaDB 10.3.18 database server...
Dec 04 16:09:27 test sh[11184]: System has not been booted with systemd as init system (PID 1). Can't operate.
Dec 04 16:09:27 test sh[11184]: Failed to connect to bus: Host is down
Dec 04 16:09:27 test systemd[1]: mariadb.service: Control process exited, code=exited, status=1/FAILURE
Dec 04 16:09:27 test systemd[1]: mariadb.service: Failed with result 'exit-code'.
Dec 04 16:09:27 test systemd[1]: Failed to start MariaDB 10.3.18 database server.
# ps auxww | grep systemd
root 1 0.0 0.3 170504 7940 ? Ss Oct02 2:48 /lib/systemd/systemd --system --deserialize 18
root 83 0.0 0.3 19508 6740 ? Ss Oct02 0:12 /lib/systemd/systemd-logind
message+ 90 0.0 0.1 9424 3556 ? Ss Oct02 8:39 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root 5293 0.0 3.9 124144 83100 ? Ss Nov18 4:56 /lib/systemd/systemd-journald
gizmo 5508 0.0 0.2 21024 4332 ? Ss 14:38 0:00 /lib/systemd/systemd --user
root 11225 0.0 0.0 3084 880 pts/2 S+ 16:10 0:00 grep --color systemd
Code:
# ps auxww | grep systemd
Error: /proc must be mounted
To mount /proc at boot you need an /etc/fstab line like:
proc /proc proc defaults
In the meantime, run "mount proc /proc -t proc"
The odd thing is that the ps error above came from a shell I had opened with pct enter $ID; I had to ssh into the container instead. This was happening frequently on other CTs on that other host, which I figure is broken somehow. I think it has to do with some sort of systemd restart or re-init that isn't working in containers under newer versions of PVE.
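A minimal sketch for narrowing this down the next time it happens, run inside the affected shell. It rests on two points: as far as I know, systemctl prints the "System has not been booted with systemd as init system" message when it cannot find the directory /run/systemd/system, even when PID 1 really is systemd (as in the ps output above), and ps needs a readable /proc. The function name check_ct_state and its optional alternate-root argument are made up for this example; it is read-only and only reports what it sees.

```shell
#!/bin/sh
# Hypothetical helper: report whether this shell can see the two things
# that systemctl and ps depend on. It only reads; it changes nothing.
check_ct_state() {
    # Optional alternate root directory, an assumption of this sketch
    # that exists only so the check can be tried against a test tree.
    root="${1:-}"

    # systemctl treats the system as "not booted with systemd" when this
    # directory is missing, regardless of what PID 1 actually is.
    if [ -d "$root/run/systemd/system" ]; then
        echo "systemd-runtime-dir: present"
    else
        echo "systemd-runtime-dir: MISSING"
    fi

    # ps reads process data from /proc; without it ps prints the
    # "/proc must be mounted" error.
    if [ -r "$root/proc/self/stat" ]; then
        echo "procfs: mounted"
    else
        echo "procfs: MISSING"
    fi
}
```

Calling check_ct_state with no argument in both the pct enter shell and an ssh session, and comparing the two results, would show whether the two shells see different mounts or whether /run itself has been emptied inside the container.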
Ultimately, to fix MariaDB, I had to restart the container entirely. I can now pct enter and run ps auxwwf again, until whenever this happens next.
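For reference, the container-only restart can be sketched as below, run on the PVE host rather than inside the CT. The function name restart_ct, the 60-second timeout, and the PCT override variable are assumptions of this example (PCT is not a pct feature; it only allows a dry run with PCT=echo). Only pct shutdown with --timeout and pct start are used.

```shell
#!/bin/sh
# Sketch: restart one container from the PVE host, leaving the host up.
# PCT can be overridden (for example PCT=echo) to print the commands
# instead of running them; the variable is an assumption of this sketch.
PCT="${PCT:-pct}"

restart_ct() {
    ctid="$1"
    # Ask the container to shut down cleanly, waiting up to 60 seconds.
    # If that fails, stop here rather than starting a half-stopped CT.
    "$PCT" shutdown "$ctid" --timeout 60 || return 1
    # Bring the container back up.
    "$PCT" start "$ctid"
}
```

Usage would be restart_ct followed by the container ID. If the clean shutdown times out, pct stop on the same ID may be needed before starting it again.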
I suppose I should upgrade the PVE host, but upgrading and rebooting entire hosts to get systemd working is tricky with production sites on them.
Code:
# pveversion -V
proxmox-ve: 5.3-1 (running kernel: 4.15.18-11-pve)
pve-manager: 5.3-12 (running version: 5.3-12/5fbbbaf6)
pve-kernel-4.15: 5.3-3
pve-kernel-4.15.18-12-pve: 4.15.18-35
pve-kernel-4.15.18-11-pve: 4.15.18-34
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-3
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-48
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-12
libpve-storage-perl: 5.0-39
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-24
pve-cluster: 5.0-34
pve-container: 2.0-35
pve-docs: 5.3-3
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-18
pve-firmware: 2.0-6
pve-ha-manager: 2.0-8
pve-i18n: 1.0-9
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 2.12.1-2
pve-xtermjs: 3.10.1-2
qemu-server: 5.0-47
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2
The container is Debian 10.2.
This problem is similar to or the same as the one I mentioned here: https://forum.proxmox.com/threads/pct-list-not-working.59820/#post-276406