[SOLVED] vm got stuck?

killmasta93

Active Member
Aug 13, 2017
580
31
33
26
Hi,
I was wondering if this has happened to anyone before? I'm currently running Proxmox 5.3-8. It was doing a vzdump backup last night and it stopped on the last VM, 106.
Any ideas?

Thank you

This was the error, followed by the syslog from later that morning:
Code:
106: 2019-07-17 02:28:52 ERROR: vma_queue_write: write error - Broken pipe
106: 2019-07-17 02:28:52 INFO: aborting backup job
106: 2019-07-17 02:28:54 INFO: unable to open file '/etc/pve/nodes/prometheus/qemu-server/106.conf.tmp.14455' - Input/output error
106: 2019-07-17 02:29:17 ERROR: Backup of VM 106 failed - vma_queue_write: write error - Broken pipe
Code:
Jul 17 06:25:07 prometheus pvesr[26885]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:25:08 prometheus pvesr[26885]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:25:09 prometheus pvesr[26885]: error with cfs lock 'file-replication_cfg': got lock request timeout
Jul 17 06:25:09 prometheus systemd[1]: pvesr.service: Main process exited, code=exited, status=5/NOTINSTALLED
Jul 17 06:25:09 prometheus systemd[1]: Failed to start Proxmox VE replication runner.
Jul 17 06:25:09 prometheus systemd[1]: pvesr.service: Unit entered failed state.
Jul 17 06:25:09 prometheus systemd[1]: pvesr.service: Failed with result 'exit-code'.
Jul 17 06:25:10 prometheus spiceproxy[8577]: worker exit
Jul 17 06:25:10 prometheus spiceproxy[3119]: worker 8577 finished
Jul 17 06:25:10 prometheus pveproxy[24370]: worker exit
Jul 17 06:25:10 prometheus pveproxy[8812]: worker exit
Jul 17 06:25:10 prometheus pveproxy[8813]: worker exit
Jul 17 06:25:10 prometheus pveproxy[3088]: worker 24370 finished
Jul 17 06:25:10 prometheus pveproxy[3088]: worker 8812 finished
Jul 17 06:25:10 prometheus pveproxy[3088]: worker 8813 finished
Jul 17 06:25:12 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:17 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:22 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:27 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:32 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:37 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:42 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:47 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:52 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:57 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:26:00 prometheus systemd[1]: Starting Proxmox VE replication runner...
Jul 17 06:26:00 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:01 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:02 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:26:02 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:03 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:04 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:05 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:06 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:07 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
 

Alwin

Proxmox Staff Member
Staff member
Aug 1, 2017
3,477
299
88
The cluster file system (/etc/pve/) was not available while the backup was running. If it's a cluster, then check your network.
 

Alwin

Does the server have enough resources during the backup? Are those messages showing up in the journal/syslog more frequently too?
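One way to answer the frequency question is to count the matching lines per hour. The following is only a minimal sketch: it assumes the classic syslog line format shown in the excerpt above, and the function name `count_errors_per_hour` and the default search string are illustrative choices, not part of any Proxmox tool.

```python
import re
from collections import Counter

# Matches the classic syslog prefix seen in the excerpt above, e.g.
# "Jul 17 06:25:12 prometheus pve-ha-lrm[3110]: message"
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"\S+\s+(?P<unit>[^\[:\s]+)(?:\[\d+\])?:\s+(?P<msg>.*)$"
)


def count_errors_per_hour(lines, needle="Input/output error"):
    """Count lines whose message contains `needle`.

    Returns a Counter keyed by (month, day, hour, unit).
    Lines that do not look like syslog entries are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and needle in match.group("msg"):
            key = (
                match.group("month"),
                int(match.group("day")),
                int(match.group("hour")),
                match.group("unit"),
            )
            counts[key] += 1
    return counts
```

To use it, pass any iterable of log lines, for example an open file handle on the syslog file (the exact path depends on the system). If the counts are non-zero only during the backup window, that points at the backup itself rather than a permanent problem.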
 

killmasta93

Thanks for the reply. Correct, there are enough resources: around 30 GB of RAM. It only happened when the backup got stuck. I haven't done a backup since that error. Should I try it one last time?
 

Alwin

killmasta93 said: "Thanks for the reply. Correct, there are enough resources: around 30 GB of RAM. It only happened when the backup got stuck."
I don't understand. How much RAM is available when the backups are running?

killmasta93 said: "I haven't done a backup since that error. Should I try it one last time?"
Up to you. I don't believe you want to keep it in that state. ;)
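For the RAM question, the number that matters is what is available while vzdump is running, not the installed total. A minimal sketch, assuming a Linux host where `/proc/meminfo` exposes a `MemAvailable` field; the helper names `parse_meminfo` and `available_gib` are made up for this example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Values are the raw numbers from the file (reported in kB for memory fields).
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_gib(text):
    """Return MemAvailable converted to GiB, or None if the field is missing."""
    kib = parse_meminfo(text).get("MemAvailable")
    if kib is None:
        return None
    return kib / (1024 * 1024)


def read_available_gib(path="/proc/meminfo"):
    """Read the given meminfo file and return available memory in GiB."""
    with open(path) as fh:
        return available_gib(fh.read())
```

Calling `read_available_gib()` every few seconds during a backup run and logging the result would show whether memory actually gets tight at the moment the job fails.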
 

killmasta93

I will post back. I'm going to run it this week; just in case it gets stuck, I won't get hammered the next day.
 

killmasta93

So I reran it and it did not back up. I changed the USB disk and it worked. I think it had something to do with the USB disk. Very odd, but thank you.
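For anyone landing here with the same symptom, a quick write test against the backup target before rerunning the job may save a night. This is only a rough sketch: the function name `check_target_writable` is hypothetical, the directory you pass in is whatever your backup storage is mounted on, and the read-back may be served from the page cache, so it only catches obvious failures such as a disk that drops off the bus mid-write.

```python
import os
import tempfile


def check_target_writable(directory, size_mib=16):
    """Write, fsync, read back and delete a test file in `directory`.

    Returns True if everything succeeded, False on any OSError
    (missing mount, I/O error, disk full) or on a data mismatch.
    """
    chunk = os.urandom(1024 * 1024)  # 1 MiB of random data, reused per write
    path = None
    try:
        fd, path = tempfile.mkstemp(prefix="write-test-", dir=directory)
        with os.fdopen(fd, "wb") as fh:
            for _ in range(size_mib):
                fh.write(chunk)
            fh.flush()
            os.fsync(fh.fileno())  # force the data out to the device
        with open(path, "rb") as fh:
            for _ in range(size_mib):
                if fh.read(len(chunk)) != chunk:
                    return False
        return True
    except OSError:
        return False
    finally:
        # Always try to clean up the test file, even after a failure.
        if path is not None:
            try:
                os.remove(path)
            except OSError:
                pass
```

A larger `size_mib` gives a flaky disk more opportunity to fail, at the cost of a longer test.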
 
