Recent content by Zephrant

  1. Backup server eating drives

    smartctl was not showing errors; only zpool status was (read and write). The above errors looked suspicious, so I cleared them and let the drive resilver back into use. Then I wrote 74T to the pool and scrubbed it. Still no errors. So I've now gone almost a week without any errors at all...
  2. Backup server eating drives

    I took a working FreeNAS system and reformatted it for Proxmox Backup Server. It contains 36 Seagate 4T SAS drives and has been in use for almost three years. After I started using it, it started getting errors on the drives and failing them out of the zpool. Recently, it was failing a drive...
  3. Occasional backup failures

    I thought that setting a BW limit of 1G on the backup server helped, but it's still doing it after another day or two. Oddly, when I back up to an NFS mount (NetApp) I get no errors. The same VMs backed up to a Proxmox Backup Server get timeouts. Both the NFS and the Proxmox Backup Server have...
  4. Occasional backup failures

    Email: 430 test1 FAILED 00:02:33 VM 430 qmp command 'backup' failed - got timeout From the backup server: 2022-01-21T01:05:38-08:00: starting new backup on datastore 'store1': "vm/430/2022-01-21T09:07:33Z" 2022-01-21T01:05:38-08:00: download 'index.json.blob' from previous backup...
  5. Occasional backup failures

    The cluster nodes each have dual 40G links to dual switches, in a trunk. The backup server has dual 10G links, so it could be buried by 12 high-end nodes doing backups simultaneously. Ceph runs on a VLAN on the same trunk as the backup, which is on another VLAN. Any tips on how to slow down...
  6. Occasional backup failures

    It looks like all nodes back up simultaneously. Is there any way to spread out the backups, maybe have the nodes go sequentially? It's not a race; I don't care how long it takes as long as it is less than a few hours.
  7. Occasional backup failures

    I've not found how to increase the timeout. This is becoming very concerning though. Almost every night I have a few VMs that fail to back up. 420 VM 420 FAILED 00:00:00 unable to open file '/etc/pve/nodes/test-prox-n101/qemu-server/420.conf.tmp.729468' - Device or resource busy 902...
  8. Occasional backup failures

    Sometimes the VM is shut down, so there is nothing in the logs. Had one failure this weekend of a VM that has been off for a week. 6 failures last night. 17 failures out of 2,977 backups so far. Worth noting, I'm backing up to an NFS mount twice a day too (offset by six hours), and no failures occurred on...
  9. Occasional backup failures

    Just got a new failure: 118: 2021-11-19 12:32:06 INFO: Starting Backup of VM 118 (qemu) 118: 2021-11-19 12:32:06 INFO: status = running 118: 2021-11-19 12:32:06 INFO: VM Name: spk-ubuntu-test2 118: 2021-11-19 12:32:06 INFO: include disk 'scsi0' 'spk-ceph-pool1:vm-118-disk-0' 32G 118: 2021-11-19...
  10. Occasional backup failures

    My test bed backs up 4 times a day: twice to an NFS mount, and twice to the Proxmox Backup Server. No additional failures since the above, and no network or other changes since then either. This was not the first time backups have failed. Out of 2713 backups to the Proxmox Backup Server, I have 15...
  11. Occasional backup failures

    Sorry, was reporting the Proxmox Backup version. My Proxmox cluster was updated to the latest a few weeks ago. It's on 7.0-13.
  12. Occasional backup failures

    I was running 2.0-13; I just triggered an update to 2.0-14. No failures in backups last night though.
  13. Occasional backup failures

    The VMs that fail appear to be random, so I wouldn't want to disable the guest agent on all of my VMs. They are all upgraded to the latest version AFAIK; this is a test bed, so all-new.
  14. Occasional backup failures

    I have a cluster backing up to a dedicated Proxmox Backup Server, which is normally working great. Out of 54 VMs, three, all from the same node, failed to back up last night: 313: 2021-11-15 01:04:04 INFO: Starting Backup of VM 313 (qemu) 313: 2021-11-15 01:04:04 INFO: status = running 313...
  15. Backup on Supermicro, Broadcom SAS3008

    I've read that as well, and sounds like good advice, but I couldn't get it to work. I tried to remove/replace a drive by ID, and it wouldn't take it, but maybe that's because I didn't create the pool that way. Managing 36 disk-name IDs on one command line when they are each 22+ chars would be...
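
The failure counts quoted in the "Occasional backup failures" posts above (17 out of 2,977, and what appears to be 15 out of 2713, since that snippet is cut off) can be put in perspective with a quick calculation. This is only a rough sketch: the function name and labels are made up for illustration, and reading the truncated "15..." as 15 failures is an assumption.

```python
def failure_rate(failed: int, total: int) -> float:
    """Return the fraction of backups that failed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return failed / total


# Counts taken from the post snippets above (backups to the Proxmox Backup
# server). Treating the truncated "I have 15..." as 15 failures is an assumption.
samples = [
    ("earlier count", 15, 2713),
    ("later count", 17, 2977),
]

for label, failed, total in samples:
    rate = failure_rate(failed, total)
    # Print the rate as a percentage with three decimal places.
    print(f"{label}: {failed}/{total} failed = {rate:.3%}")
```

Both counts work out to a bit over half a percent of backups failing, which fits the description of occasional rather than systematic failures.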
