Random RBD Snapshot Rollback Failures on Ceph Storage

Timothy1056

Aug 18, 2022
We are experiencing intermittent VM restore and rollback failures on our Proxmox cluster. The failures occur at random when a user restores or rolls back 5 to 10 VMs at the same time.

Error logs:

Timed out while waiting for udev queue to empty.
VM quit/powerdown failed - terminating now with SIGTERM
VM still running - terminating now with SIGKILL
Rolling back to snapshot: 0% complete...failed.
TASK ERROR: rbd snapshot vm-206-disk-3 to 'nigel' error: Rolling back to snapshot: 0% complete...failed.

Environment:
  • Proxmox VE version: 9.0.6
  • Ceph version: 19.2.3
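
Since the failures only show up when several VMs are rolled back at once, one way to check whether concurrency is the trigger is to throttle the rollbacks and retry the ones that fail. The following is a minimal sketch under stated assumptions, not a fix for the root cause: it shells out to `qm rollback <vmid> <snapname>`, the function names `rollback_vm` and `rollback_many` are hypothetical, and the values for `retries`, `delay`, and `max_parallel` are illustrative rather than tuned.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def rollback_vm(vmid, snapname, retries=3, delay=10.0, run=subprocess.run):
    """Roll back one VM with `qm rollback`, retrying on a non-zero exit code.

    `run` is injectable so the function can be exercised without a cluster.
    Returns True if any attempt succeeded, False otherwise.
    """
    for attempt in range(1, retries + 1):
        result = run(
            ["qm", "rollback", str(vmid), snapname],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            return True
        if attempt < retries:
            # Wait before retrying; the delay value is an assumption.
            time.sleep(delay)
    return False


def rollback_many(jobs, max_parallel=2, **kwargs):
    """Roll back several VMs with at most `max_parallel` running at once.

    `jobs` is a list of (vmid, snapname) tuples.
    Returns a dict mapping each vmid to True (success) or False (failure).
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {
            vmid: pool.submit(rollback_vm, vmid, snap, **kwargs)
            for vmid, snap in jobs
        }
        return {vmid: future.result() for vmid, future in futures.items()}
```

For example, `rollback_many([(206, "nigel"), (207, "nigel")], max_parallel=1)` would run the two rollbacks one after the other. If the errors disappear at `max_parallel=1` and return at higher values, that points toward contention during parallel rollbacks rather than a problem with individual images.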