Hi All.
Normally I just read these forums, and that mostly covers my needs, but I'm having trouble finding info about my issue, so I figured I would post.
I have a 10-node cluster with about 50 VMs running, all on Ubuntu 20.04 LTS. I recently upgraded from 8.3.5(?) to 9.1.4 using the directions from Proxmox, including the pve8to9 script. The script output looked clean. The cluster uses local-zfs for storage on each VM, and the first thing I noticed was that with the SCSI Controller set to the MegaRAID SAS option, the VMs could no longer see their hard disk (scsi0). Moving them to the default controller seemed to fix that. Shortly after, the VMs started freezing/crashing: the console locks up and the VM stops responding. A reboot resolves it for a short time, until the next freeze less than 24 hours later. The VMs are running 5.15.0-138-generic kernels. Reviewing the logs, nothing stands out before a crash, so the VM does not seem to know it is coming.
I could really use some help isolating the issue. I have tried rebooting the PVE nodes, making sure they are up to date, and rebuilding the VMs. What logs or configs can I provide that might help figure out what I have messed up? On PVE 8 this cluster and all VMs ran for over a year with no issues at all, so I am a bit lost, because for me Proxmox has always just worked. Upgrading from 7 to 8 was a breeze back when I did it.
Thank you in advance for any help I can get.