For several days we struggled with a VM that crashed during the night.
Unfortunately this VM is currently a SPOF, so this caused a lot of problems.
The PVE dashboard showed a yellow triangle with the text "IO error".
The VM was in a frozen state: no access was possible via ssh or console until we powered it off and on again.
On the VM itself there was no clue; it just froze.
On the PVE side we noticed this at the end of /var/log/vzdump/qemu-110.log:
110: 2020-02-14 03:38:16 INFO: status: 71% (924994764800/1299227607040), sparse 5% (66562551808), duration 9288, read/write 13/12 MB/s
110: 2020-02-14 03:38:16 ERROR: vma_queue_write: write error - Broken pipe
110: 2020-02-14 03:38:16 INFO: aborting backup job
110: 2020-02-14 03:38:23 ERROR: Backup of VM 110 failed - vma_queue_write: write error - Broken pipe
From that moment on, the VM was frozen and not accessible via ssh or console.
Of course we need to expand the storage pool for backups...
The workaround for now is to reduce the number of backups kept from 3 to 2.
The VM has a 10GB and a 1200GB disk, and the BACKUP pool simply did not have enough space.
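For anyone who wants to catch this before the nightly job runs, here is a rough sketch of a free-space check in Python. It is not part of PVE or vzdump. The function names, the 10% margin and the path handling are my own assumptions, and it pessimistically assumes the backup can grow as large as the full disk size (ignoring compression and sparse blocks). The total from the log above, 1299227607040 bytes, is 1210 GiB, which matches the 10GB + 1200GB disks.

```python
import shutil
import sys

GIB = 1024 ** 3


def backup_fits(required_bytes, free_bytes, margin=0.10):
    """Return True if free_bytes covers required_bytes plus a safety margin.

    The margin is a hypothetical safety factor, not a PVE setting.
    """
    return free_bytes >= required_bytes * (1 + margin)


def check_backup_target(path, required_bytes, margin=0.10):
    """Check the filesystem that holds `path` for enough free space."""
    free = shutil.disk_usage(path).free
    return backup_fits(required_bytes, free, margin)


if __name__ == "__main__":
    # Total disk size as reported in the vzdump status line above.
    vm_disk_bytes = 1299227607040
    # Path of the backup storage; pass your own mount point as an argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    free = shutil.disk_usage(target).free
    print(f"need up to {vm_disk_bytes / GIB:.0f} GiB, "
          f"free on {target}: {free / GIB:.0f} GiB")
    if not check_backup_target(target, vm_disk_bytes):
        print("WARNING: backup target may run out of space")
        sys.exit(1)
```

Run it from cron shortly before the backup window with the backup mount point as the argument; a non-zero exit code can then trigger a mail or alert.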
Just wanted to share this, as we really had no clue why the VM was frozen.