Ceph cluster broke after updating Proxmox to the latest 6.4, please help

Xenocit

Jul 7, 2021
I am in serious trouble. I am running a 4-node PVE cluster in the office.
Today I started updating the nodes one by one to the latest 6.4 version to prepare for the Proxmox 7 upgrade.
After I updated and restarted 2 of the nodes, Ceph degraded and started complaining that the other 2 nodes were running older versions of Ceph.
At this point everything went south - the VMs hung. I rushed to upgrade and restart the remaining 2 nodes.

The PVE cluster is now up - all nodes are green - but the Ceph cluster is not. I get a timeout (500) in the web interface.

Every PVE node has 4 SSD OSDs in the cluster, and all but one of the VMs use Ceph. I have no idea what to do at this point, and I am in very big trouble if I don't recover the cluster. I am positive that the drives are healthy.

Unfortunately I have no subscription, but I am ready to get one ASAP if anyone can please help me!
 
The following might help to get an overview of the situation:
pveversion -v on all nodes
ceph -s
 
ceph -s just hangs on all 4 nodes - no output at all.
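
The ceph client can block for a long time when it cannot reach a monitor quorum, which would match the hang described here. A minimal sketch, assuming GNU coreutils `timeout` is available on the nodes: wrap the call so it gives up after a fixed number of seconds instead of tying up the shell. The helper name `bounded` is made up for this example and is not part of Ceph or Proxmox.

```shell
# bounded SECONDS COMMAND [ARGS...]
# Runs COMMAND, but stops it after SECONDS so a hung client cannot block the shell.
# "bounded" is a hypothetical helper name, not a Ceph or Proxmox tool.
bounded() {
    secs=$1
    shift
    timeout "$secs" "$@"
    rc=$?
    # GNU timeout exits with 124 when it had to stop the command.
    if [ "$rc" -eq 124 ]; then
        echo "gave up after ${secs}s: $*" >&2
    fi
    return "$rc"
}

# Example on a node (needs the ceph CLI):
#   bounded 15 ceph -s
#   bounded 15 ceph daemon mon.pve-n1 mon_status
```

An exit status of 124 means the command timed out rather than failed outright. The second example queries the local monitor through its admin socket, which may still answer when there is no quorum; the monitor name `pve-n1` is taken from the journal output posted in this thread.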

We are ready to provide remote access (SSH or IPMI KVM to the nodes) as we are getting desperate here. It was a huge mistake to approach this update without making backups, and all this data is critical to our operation. We are an ISP and all our customers are without service at the moment :(
I am available by phone at +359885511000 (Ivan). Of course, subscriptions/payments will be made as necessary.

pveversion follows:

### NODE 1 ###
root@pve-n1:~# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.124-1-pve)
pve-manager: 6.4-11 (running version: 6.4-11/28d576c2)
pve-kernel-5.4: 6.4-4
pve-kernel-helper: 6.4-4
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.18-1-pve: 5.0.18-3
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 14.2.20-pve1
ceph-fuse: 14.2.20-pve1
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-3
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.3-1
proxmox-backup-client: 1.1.10-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-1
pve-cluster: 6.4-1
pve-container: 3.3-5
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.2-4
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1

### NODE 2 ###
root@pve-n2:~# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.124-1-pve)
pve-manager: 6.4-11 (running version: 6.4-11/28d576c2)
pve-kernel-5.4: 6.4-4
pve-kernel-helper: 6.4-4
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.18-1-pve: 5.0.18-3
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 14.2.20-pve1
ceph-fuse: 14.2.20-pve1
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-3
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.3-1
proxmox-backup-client: 1.1.10-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-1
pve-cluster: 6.4-1
pve-container: 3.3-5
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.2-4
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1

### NODE 3 ###
root@pve-n3:~# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.124-1-pve)
pve-manager: 6.4-11 (running version: 6.4-11/28d576c2)
pve-kernel-5.4: 6.4-4
pve-kernel-helper: 6.4-4
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 14.2.20-pve1
ceph-fuse: 14.2.20-pve1
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-3
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.3-1
proxmox-backup-client: 1.1.10-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-1
pve-cluster: 6.4-1
pve-container: 3.3-5
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.2-4
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1

### NODE 4 ###
root@pve-n4:~# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.124-1-pve)
pve-manager: 6.4-11 (running version: 6.4-11/28d576c2)
pve-kernel-5.4: 6.4-4
pve-kernel-helper: 6.4-4
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-5.3.10-1-pve: 5.3.10-1
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.18-1-pve: 5.0.18-3
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 14.2.20-pve1
ceph-fuse: 14.2.20-pve1
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-3
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.3-1
proxmox-backup-client: 1.1.10-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-1
pve-cluster: 6.4-1
pve-container: 3.3-5
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.2-4
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1
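
Comparing four long package lists by eye is error-prone. A rough sketch, assuming each node's `pveversion -v` output has been saved to its own file (the file names `n1` to `n4` are placeholders, and `diff_versions` is a made-up helper name): print only the packages whose reported version differs between nodes or that are missing on some node.

```shell
# diff_versions FILE...
# Each FILE holds one node's `pveversion -v` output.
# Prints one line per package whose version is not identical in all files.
diff_versions() {
    awk -F': ' '
        # Package lines look like "name: version"; the shell prompt line does not match.
        /^[a-z0-9.+-]+: / {
            pkg = $1
            seen[pkg] = 1
            vers[pkg, FILENAME] = substr($0, length($1) + 3)
        }
        END {
            for (pkg in seen) {
                differs = 0
                line = pkg ":"
                for (i = 1; i < ARGC; i++) {
                    v = ((pkg, ARGV[i]) in vers) ? vers[pkg, ARGV[i]] : "(absent)"
                    if (i == 1) first = v
                    else if (v != first) differs = 1
                    line = line " " ARGV[i] "=" v
                }
                if (differs) print line
            }
        }' "$@" | LC_ALL=C sort
}

# Example, with one saved output per node:
#   diff_versions n1 n2 n3 n4
```

In the lists above the Ceph and PVE package versions are the same on all four nodes. Differences are limited to the set of installed old kernels and to node 3 reporting `ifupdown: 0.8.35+pve1` where the other nodes report `ifupdown: residual config` plus `ifupdown2`. Whether that matters for this outage is not established in the thread.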
 
Please provide your Ceph config (cat /etc/ceph/ceph.conf) and the output of ip -details a in addition to the journal @fabian asked for.
 
Please recommend which subscription and how many licenses we should purchase so that work on this can start ASAP.

Here is the output of:
journalctl -b -u "ceph*"
from NODE-1:

-- Logs begin at Wed 2021-07-07 12:31:40 EEST, end at Wed 2021-07-07 13:37:17 EEST. --
Jul 07 12:31:42 pve-n1 systemd[1]: Starting Ceph Volume activation: lvm-0-3de32be9-7235-412f-99b1-b039ec2fac6c...
Jul 07 12:31:42 pve-n1 systemd[1]: Starting Ceph Volume activation: lvm-1-06e845e6-8803-4077-b833-a4bb0a238d2f...
Jul 07 12:31:42 pve-n1 systemd[1]: Starting Ceph Volume activation: lvm-2-14a10a3b-24b7-4ad1-b774-8786a9a49242...
Jul 07 12:31:43 pve-n1 systemd[1]: Started Ceph crash dump collector.
Jul 07 12:31:43 pve-n1 systemd[1]: Starting Ceph Volume activation: lvm-3-c68e6a1e-5a7c-493a-a0ed-388571440041...
Jul 07 12:31:43 pve-n1 ceph-crash[1176]: INFO:__main__:monitoring path /var/lib/ceph/crash, delay 600s
Jul 07 12:31:43 pve-n1 sh[1169]: Running command: /usr/sbin/ceph-volume lvm trigger 2-14a10a3b-24b7-4ad1-b774-8786a9a49242
Jul 07 12:31:43 pve-n1 sh[1186]: Running command: /usr/sbin/ceph-volume lvm trigger 3-c68e6a1e-5a7c-493a-a0ed-388571440041
Jul 07 12:31:43 pve-n1 sh[1160]: Running command: /usr/sbin/ceph-volume lvm trigger 0-3de32be9-7235-412f-99b1-b039ec2fac6c
Jul 07 12:31:43 pve-n1 sh[1163]: Running command: /usr/sbin/ceph-volume lvm trigger 1-06e845e6-8803-4077-b833-a4bb0a238d2f
Jul 07 12:31:47 pve-n1 systemd[1]: Started Ceph cluster manager daemon.
Jul 07 12:31:47 pve-n1 systemd[1]: Started Ceph cluster monitor daemon.
Jul 07 12:31:47 pve-n1 systemd[1]: Reached target ceph target allowing to start/stop all ceph-mon@.service instances at once.
Jul 07 12:31:47 pve-n1 systemd[1]: Reached target ceph target allowing to start/stop all ceph-mgr@.service instances at once.
Jul 07 12:31:47 pve-n1 systemd[1]: Reached target ceph target allowing to start/stop all ceph-osd@.service instances at once.
Jul 07 12:31:47 pve-n1 systemd[1]: Reached target ceph target allowing to start/stop all ceph-mds@.service instances at once.
Jul 07 12:31:47 pve-n1 systemd[1]: Reached target ceph target allowing to start/stop all ceph*@.service instances at once.
Jul 07 12:31:48 pve-n1 sh[1169]: Running command: /usr/sbin/ceph-volume lvm trigger 2-14a10a3b-24b7-4ad1-b774-8786a9a49242
Jul 07 12:31:48 pve-n1 sh[1163]: Running command: /usr/sbin/ceph-volume lvm trigger 1-06e845e6-8803-4077-b833-a4bb0a238d2f
Jul 07 12:31:48 pve-n1 sh[1186]: Running command: /usr/sbin/ceph-volume lvm trigger 3-c68e6a1e-5a7c-493a-a0ed-388571440041
Jul 07 12:31:48 pve-n1 sh[1160]: Running command: /usr/sbin/ceph-volume lvm trigger 0-3de32be9-7235-412f-99b1-b039ec2fac6c
Jul 07 12:32:22 pve-n1 ceph-mon[1891]: 2021-07-07 12:32:22.379 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 12:32:27 pve-n1 ceph-mon[1891]: 2021-07-07 12:32:27.383 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 12:32:32 pve-n1 ceph-mon[1891]: 2021-07-07 12:32:32.383 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 12:32:37 pve-n1 ceph-mon[1891]: 2021-07-07 12:32:37.383 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 12:32:42 pve-n1 ceph-mon[1891]: 2021-07-07 12:32:42.383 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 12:32:47 pve-n1 ceph-mon[1891]: 2021-07-07 12:32:47.379 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)

#### THIS LINE REPEATS A LOT UNTIL:
Jul 07 13:21:42 pve-n1 ceph-mon[1891]: 2021-07-07 13:21:42.496 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:21:47 pve-n1 ceph-mgr[1890]: failed to fetch mon config (--no-mon-config to skip)
Jul 07 13:21:47 pve-n1 systemd[1]: ceph-mgr@pve-n1.service: Main process exited, code=exited, status=1/FAILURE
Jul 07 13:21:47 pve-n1 systemd[1]: ceph-mgr@pve-n1.service: Failed with result 'exit-code'.
Jul 07 13:21:47 pve-n1 ceph-mon[1891]: 2021-07-07 13:21:47.496 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:21:52 pve-n1 ceph-mon[1891]: 2021-07-07 13:21:52.496 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:21:53 pve-n1 sh[1169]: Running command: /usr/sbin/ceph-volume lvm trigger 2-14a10a3b-24b7-4ad1-b774-8786a9a49242
Jul 07 13:21:54 pve-n1 sh[1186]: Running command: /usr/sbin/ceph-volume lvm trigger 3-c68e6a1e-5a7c-493a-a0ed-388571440041
Jul 07 13:21:54 pve-n1 sh[1163]: Running command: /usr/sbin/ceph-volume lvm trigger 1-06e845e6-8803-4077-b833-a4bb0a238d2f
Jul 07 13:21:54 pve-n1 sh[1160]: Running command: /usr/sbin/ceph-volume lvm trigger 0-3de32be9-7235-412f-99b1-b039ec2fac6c
Jul 07 13:21:57 pve-n1 ceph-mon[1891]: 2021-07-07 13:21:57.496 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:21:57 pve-n1 systemd[1]: ceph-mgr@pve-n1.service: Service RestartSec=10s expired, scheduling restart.
Jul 07 13:21:57 pve-n1 systemd[1]: ceph-mgr@pve-n1.service: Scheduled restart job, restart counter is at 1.
Jul 07 13:21:57 pve-n1 systemd[1]: Stopped Ceph cluster manager daemon.
Jul 07 13:21:57 pve-n1 systemd[1]: Started Ceph cluster manager daemon.
Jul 07 13:22:02 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:02.496 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)

### THEN AGAIN A LOT OF THIS LINE:
Jul 07 13:22:07 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:07.496 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:22:12 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:12.496 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:22:17 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:17.496 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:22:22 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:22.496 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:22:27 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:27.500 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:22:32 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:32.500 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:22:37 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:37.500 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:22:42 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:42.500 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:22:47 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:47.500 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:22:52 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:52.500 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
Jul 07 13:22:57 pve-n1 ceph-mon[1891]: 2021-07-07 13:22:57.500 7efc93c2b700 -1 mon.pve-n1@0(probing) e4 get_health_metrics reporting 1 slow ops, oldest is auth(proto 0 73 bytes epoch 0)
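
Since the same monitor message repeats every five seconds, the interesting lines are easy to miss. A small sketch that collapses consecutive journal lines which differ only in their timestamps, keeping the first occurrence plus a repeat count. The helper name `collapse_journal` is made up, and the timestamp patterns are assumptions based on the log format shown above.

```shell
# collapse_journal: reads journalctl output on stdin and collapses runs of
# lines that are identical apart from their timestamps.
collapse_journal() {
    awk '
        function emit() {
            if (n == 1) print first_line
            else if (n > 1) printf "%s  [x%d, last at %s]\n", first_line, n, last_ts
        }
        NF == 0 { next }                      # skip blank lines
        {
            ts = $1 " " $2 " " $3             # journal timestamp, e.g. "Jul 07 12:32:22"
            key = $0
            # Drop the journal timestamp at the start of the line.
            sub(/^[A-Z][a-z][a-z] +[0-9]+ [0-9:]+ /, "", key)
            # Drop the timestamp that ceph daemons embed in their own message.
            gsub(/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9:.]+ /, "", key)
            if (key == prev) {
                n++
                last_ts = ts
            } else {
                emit()
                first_line = $0
                prev = key
                n = 1
                last_ts = ts
            }
        }
        END { emit() }
    '
}

# Example on a node:
#   journalctl -b -u "ceph*" | collapse_journal
```

With the log above, the long runs of "get_health_metrics reporting 1 slow ops" would shrink to single lines, leaving the ceph-mgr failure and the restart messages easier to spot.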
 
We sent all details in reply to your request to office@proxmox.com, but your email system is not up and running.

=> Host or domain name not found. Name service error for name=_____ type=MX: Host not found, try again

Please use a working email for ordering your support package via https://shop.maurer-it.com
 
Here is the complete output of journalctl -b -u "ceph*" from all 4 nodes.
A little background on how things went:
1. As usual I started updating from node-4, after first live-migrating all VMs from node-4 to node-3. I upgraded and restarted node-4.
2. After node-4 was up, I live-migrated all VMs from node-3 to node-4, then upgraded and restarted node-3.
3. After node-3 came up, Ceph degraded and started complaining about older versions running on node-1 and node-2.
4. All VMs started slowing down and freezing.
5. At this point I guess I panicked and decided to quickly upgrade and restart both node-1 and node-2 at the same time. The Ceph dashboard was still showing the degraded health and the monitors in the web UI at this point, but after node-1 and node-2 were upgraded and restarted it only shows timeout (500).
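
One common precaution in a rolling upgrade like the one in the steps above is to wait for Ceph to report healthy before touching the next node. The thread does not establish that this would have prevented the outage, so take the following only as a rough sketch. It relies on `ceph health` printing a line that starts with HEALTH_OK, HEALTH_WARN or HEALTH_ERR. The `HEALTH_CMD` variable and the function name are placeholders introduced for illustration.

```shell
# HEALTH_CMD is a placeholder so the sketch can be tried without a cluster;
# on a real node it defaults to "ceph health".
HEALTH_CMD=${HEALTH_CMD:-ceph health}

# wait_for_health_ok TRIES DELAY_SECONDS
# Returns 0 as soon as the health command prints HEALTH_OK, 1 if it never does.
wait_for_health_ok() {
    tries=$1
    delay=$2
    while [ "$tries" -gt 0 ]; do
        # Bound each call so a hung client does not stall the loop.
        status=$(timeout 15 $HEALTH_CMD 2>/dev/null)
        case $status in
            HEALTH_OK*) return 0 ;;
        esac
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Intended use between nodes (hypothetical workflow):
#   wait_for_health_ok 60 10 || { echo "Ceph not healthy, not touching the next node" >&2; exit 1; }
```

The point of the gate is that a second node is only restarted once the cluster has recovered from the first restart, so that two nodes' worth of OSDs and monitors are never down at the same time.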

I know how stupid it was of me not to check for fresh backups before attempting all this... But now the only thing that matters is that I can get out of this without losing all company data for a veeeeery long time :|
 


I've changed my account e-mail, as the company one is not working for the same reason - our DNS VMs are down, of course.
Can you please send the email to the address I'm using now?
 
I've just sent a message to office@proxmox.com from my personal Gmail account. Please resend your response.
 
As you have four nodes (with 2 CPUs each), you need a subscription for every node, not just one.

Details on https://www.proxmox.com/en/downloads/item/proxmox-ve-subscription-agreement
I've submitted a ticket and activated the license on NODE-1. I am currently working on getting the money for the other 3 licenses. My humble request is to please get someone to take a look at it - I've provided login details for the WebUI and SSH on all 4 nodes in the ticket. Please guys, I am dying here...
 
Oh man. Can we have an update on what happened next? It was like that show "24", where all the action happens in one day.

I'm afraid of upgrading now.
 
I upgraded and everything broke too... been down for 4 months... just too busy to try to think about it much anymore. Come back every week to see if anyone else had similar issues I can learn from without being called names... if this got resolved, would love to hear how.
 
