I've had PBS running and backing up for over a year now: 399 backups and only 1 failure. For the last week, however, I've been getting the following message in my emails. Note that ALL backups are failing, on all server nodes. My PVE environment is 4 server nodes with multiple VMs on each, and I keep all of the nodes updated with the same command mentioned below.
vzdump backup status (pve.mydomain.com) : backup failed: could not activate storage 'Luke': Luke: error fetching datastores - 500 Can't connect to 10.5.1.7:8007 (Connection timed out)
The only change was my normal "sudo apt update && sudo apt dist-upgrade", which has been my update process for over a year, again with NO problems at all until last week. There was an update, but I didn't check its timestamp to see whether it lines up with when the failures started. That update only had a few packages, and nothing that looked critical enough to cause this issue. Note: NO hardware changes, and no server power cycles around the time frame in question. The server was power cycled a few weeks ago because of weather-related power outages, but there were many successful backups after power was restored and before THIS issue started.
I cannot rule out hardware yet. I am ordering new drives and, if I cannot find some other solution, will completely destroy the RAID and start fresh. It could be hardware related: this is an older Dell server that is due for replacement in the next year, but it has been running fine for over a year without any issues until now. It could also be an OS/software bug, which makes sense to me too. I'm still actively troubleshooting this, just wondering if anyone else has seen this or has suggestions.
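Since the error is a connection timeout to 10.5.1.7:8007 rather than a storage or disk error, one thing that may be worth checking before rebuilding the RAID is whether the PBS port is reachable from the PVE nodes at all. Below is a minimal sketch of such a check, assuming bash and the coreutils `timeout` command are available on the node. The function names `check_tcp` and `describe_result` are made up for illustration and are not part of any Proxmox tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Proxmox tool): try to open a TCP connection
# to host:port, giving up after a number of seconds (default 5).
# Exit status: 0 = connected, 124 = timed out, anything else = refused/error.
check_tcp() {
    local host="$1" port="$2" secs="${3:-5}"
    # bash's /dev/tcp/<host>/<port> redirection opens a TCP socket;
    # timeout(1) bounds how long we wait for the connection.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Hypothetical helper: turn the exit status of check_tcp into a short label.
describe_result() {
    case "$1" in
        0)   echo "open" ;;
        124) echo "timeout" ;;
        *)   echo "refused-or-error" ;;
    esac
}
```

Run from each PVE node, something like `check_tcp 10.5.1.7 8007; describe_result $?` would show which case applies. As a rough guide, "timeout" usually points at the network path (host down, changed IP, firewall dropping packets), while a refused connection usually means the host is up but nothing is listening on the port, in which case `systemctl status proxmox-backup-proxy` on the PBS host would be the next thing to look at.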
Here's a copy of my datastore.cfg. The drives are 1TB SSDs in a RAID5 configuration on a PERC controller.
root@pbs-luke:~# cat /etc/proxmox-backup/datastore.cfg
datastore: Backups
comment
gc-schedule daily
path /Backups