Not sure what happened when I updated things a while back, but I lost 5 of the 9 nodes... nothing special on any of them. I managed to get the VMs live again from replicated data on other nodes and they are all working...
I let it be for a couple of months and hadn't looked at it since everything was working, but I saw PVE 7 came out, so I went to node1 to do the update. I could not get the GUI console to load, so I had to SSH to it (probably a cert issue?). SSH from my desktop worked fine, and apt update and apt upgrade went fine.
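If it really is just stale node certificates, I'm assuming something like this from the node's shell would regenerate them and bring the GUI back (haven't tried it yet, and not sure it even works while the cluster filesystem is read-only):
Code:
# regenerate this node's SSL certs used by the web GUI, then restart the proxy
pvecm updatecerts --force
systemctl restart pveproxy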
Looking at the pve6to7 output, I see some issues...
Code:
root@stack1:~# pve6to7
= CHECKING VERSION INFORMATION FOR PVE PACKAGES =
Checking for package updates..
PASS: all packages uptodate
Checking proxmox-ve package version..
PASS: proxmox-ve package has version >= 6.4-1
Checking running kernel version..
PASS: expected running kernel '5.4.124-1-pve'.
= CHECKING CLUSTER HEALTH/SETTINGS =
PASS: systemd unit 'pve-cluster.service' is in state 'active'
PASS: systemd unit 'corosync.service' is in state 'active'
FAIL: Cluster Filesystem readonly, lost quorum?!
Analzying quorum settings and state..
FAIL: 4 nodes are offline!
INFO: configured votes - nodes: 8
INFO: configured votes - qdevice: 0
INFO: current expected votes: 8
INFO: current total votes: 4
WARN: total votes < expected votes: 4/8!
Checking nodelist entries..
PASS: nodelist settings OK
Checking totem settings..
PASS: totem settings OK
INFO: run 'pvecm status' to get detailed cluster status..
= CHECKING HYPER-CONVERGED CEPH STATUS =
INFO: hyper-converged ceph setup detected!
INFO: getting Ceph status/health information..
WARN: Ceph health reported as 'HEALTH_WARN'.
Use the PVE dashboard or 'ceph -s' to determine the specific issues and try to resolve them.
INFO: getting Ceph daemon versions..
PASS: single running version detected for daemon type monitor.
PASS: single running version detected for daemon type manager.
PASS: single running version detected for daemon type MDS.
PASS: single running version detected for daemon type OSD.
PASS: single running overall version detected for all Ceph daemon types.
WARN: 'noout' flag not set - recommended to prevent rebalancing during cluster-wide upgrades.
INFO: checking Ceph config..
= CHECKING CONFIGURED STORAGES =
PASS: storage 'CPool1' enabled and active.
PASS: storage 'Ceph-USDivide200' enabled and active.
PASS: storage 'ISO_store1' enabled and active.
PASS: storage 'MinecraftCephFS1' enabled and active.
PASS: storage 'ceph-lxc' enabled and active.
PASS: storage 'ceph-vm1' enabled and active.
PASS: storage 'local' enabled and active.
= MISCELLANEOUS CHECKS =
INFO: Checking common daemon services..
PASS: systemd unit 'pveproxy.service' is in state 'active'
PASS: systemd unit 'pvedaemon.service' is in state 'active'
PASS: systemd unit 'pvestatd.service' is in state 'active'
INFO: Checking for running guests..
PASS: no running guest detected.
INFO: Checking if the local node's hostname 'stack1' is resolvable..
INFO: Checking if resolved IP is configured on local node..
PASS: Resolved node IP '10.0.1.1' configured and active on single interface.
INFO: Checking backup retention settings..
PASS: no problems found.
INFO: checking CIFS credential location..
PASS: no CIFS credentials at outdated location found.
INFO: Checking custom roles for pool permissions..
INFO: Checking node and guest description/note legnth..
PASS: All node config descriptions fit in the new limit of 64 KiB
PASS: All guest config descriptions fit in the new limit of 8 KiB
INFO: Checking container configs for deprecated lxc.cgroup entries
PASS: No legacy 'lxc.cgroup' keys found.
INFO: Checking storage content type configuration..
PASS: no problems found
INFO: Checking if the suite for the Debian security repository is correct..
INFO: Make sure to change the suite of the Debian security repository from 'buster/updates' to 'bullseye-security' - in /etc/apt/sources.list:6
SKIP: NOTE: Expensive checks, like CT cgroupv2 compat, not performed without '--full' parameter
= SUMMARY =
TOTAL: 36
PASSED: 30
SKIPPED: 1
WARNINGS: 3
FAILURES: 2
ATTENTION: Please check the output for detailed information!
Try to solve the problems one at a time and then run this checklist tool again.
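The two WARN items look straightforward to me - something along these lines, assuming the standard bullseye-security line for the repo (and unsetting noout again once every node is upgraded):
Code:
# keep Ceph from rebalancing while nodes reboot during the upgrade
ceph osd set noout

# point the Debian security suite at bullseye-security instead of buster/updates
# (line 6 of /etc/apt/sources.list on this node)
sed -i 's|buster/updates|bullseye-security|' /etc/apt/sources.list

# after all nodes are on 7.x:
# ceph osd unset noout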
The FAILs are another story - I am guessing the fact that I have 5 nodes not working is what's freaking it out... I also deleted a couple of LXCs and VMs that I didn't need anymore, like an old Minecraft server instance...
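My rough plan for the quorum FAIL, assuming the dead nodes are never coming back (the node names below are just placeholders for my dead ones), is to get the surviving nodes quorate again and then drop the dead ones from the cluster:
Code:
# check what corosync currently thinks is online / expected
pvecm status

# drop expected votes so the surviving nodes are quorate again (/etc/pve writable)
pvecm expected 4

# then remove each dead node from the cluster config (placeholder names)
pvecm delnode stack3
pvecm delnode stack4
Does that sound like the right order - cluster/quorum first, then the Ceph flag and repo change, then the actual 6-to-7 upgrade?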