Hello everybody,
today my PVE host received an additional RAID1 storage upgrade.
Before the update, I had 2 x 240 GB SSD as Hardware-RAID1 (Adaptec controller).
To that controller, another 2 x 480 GB SSD were added as a second Hardware-RAID1. Ever since the upgrade, pve-cluster will not start.
Events:
* Before the downtime, I shut down the node using the web GUI - I confirmed the power-off via IPMI.
* The data center provider physically added the disks and created another RAID 1 array on the controller.
* PVE node was booted up again.
After this, I noticed I couldn't access the machine via Web GUI.
* I logged on as root via IPMI/KVM and assessed the situation. The network was fine, so I simply ran "reboot".
* After the reboot, the logon prompt came up and I was able to ping the host from the internet. So I tried to log in and saw these messages on the console (pve1.png).
* So I rebooted again, and got this:
* Then I ran fsck -n /dev/mapper/pve-root to see the errors:
* Then I ran the same command without "-n" to actually apply the changes, and it corrected all the errors. In the end, the filesystem was clean.
* I used the command "exit" to boot again and got the logon prompt.
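For anyone wanting to try the same dry-run-then-repair fsck pattern safely first, it can be rehearsed on a throwaway ext4 image file instead of a real volume (the image path and size below are made up for illustration; never run the repair pass on a mounted filesystem):

```shell
# Demo of the fsck workflow on a disposable ext4 image file,
# standing in for /dev/mapper/pve-root.
export PATH="$PATH:/sbin:/usr/sbin"

dd if=/dev/zero of=/tmp/fsck-demo.img bs=1M count=8 status=none
mkfs.ext4 -q /tmp/fsck-demo.img

fsck.ext4 -n /tmp/fsck-demo.img   # dry run: report problems, change nothing
fsck.ext4 -p /tmp/fsck-demo.img   # repair pass (like fsck without -n on pve-root)
```

On a healthy image both passes report the filesystem as clean.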
During login I noticed some errors (which were most probably present since the upgrade - I just hadn't noticed them):
* So I checked out systemctl status pve-cluster:
It seems my /var/lib/pve-cluster/config.db has a problem - although I am not really sure.
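Since config.db is just an SQLite database, its integrity can presumably be checked (with pve-cluster stopped) using SQLite's built-in PRAGMA. A minimal sketch of that check - run here against a scratch database path rather than the real /var/lib/pve-cluster/config.db:

```python
# Sketch: run SQLite's integrity check the way one might against
# /var/lib/pve-cluster/config.db (scratch path used here for safety).
import sqlite3

def integrity_check(db_path):
    """Return the result rows of PRAGMA integrity_check ('ok' means intact)."""
    con = sqlite3.connect(db_path)
    try:
        return [row[0] for row in con.execute("PRAGMA integrity_check;")]
    finally:
        con.close()

# Demo on a freshly created scratch database:
result = integrity_check("/tmp/scratch.db")
print(result)  # a healthy database reports ['ok']
```

Anything other than `ok` in the result would point at real corruption in the database file rather than a service-level problem.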
The node is standalone and I'm quite new to Proxmox.
Are my VMs still ok and is there any way to recover from this error?
* ls -l /var/lib/pve-cluster (notice the config.db.bak which I created manually after receiving these errors):
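For the record, the backup was a plain file copy. If restoring it turns out to be the right fix, I assume the sequence would look roughly like the sketch below - it uses a scratch directory so it is safe to run as-is; on the real host the directory would be /var/lib/pve-cluster and pve-cluster would have to be stopped first:

```shell
# Hypothetical restore sketch using a scratch directory instead of the
# real /var/lib/pve-cluster (on a real host: systemctl stop pve-cluster first).
DIR=/tmp/pve-cluster-demo
mkdir -p "$DIR"
echo "old-good-config" > "$DIR/config.db.bak"   # stand-in for my manual backup
echo "corrupt-config"  > "$DIR/config.db"       # stand-in for the damaged file

cp -a "$DIR/config.db.bak" "$DIR/config.db"     # restore the backup over the bad file
cat "$DIR/config.db"                            # now matches the backup again
```

Whether that is actually the correct recovery path for pve-cluster is exactly what I am hoping someone can confirm.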
Any help is deeply appreciated!
Kind regards and thanks
JS1