[SOLVED] vm got stuck?

killmasta93 · Jul 17, 2019

Hi,
I was wondering if this has happened to anyone before? I'm currently running Proxmox 5.3-8. A vzdump backup was running last night and it stopped on the last VM, 106.
Any ideas?

Thank you

This was the error:

Code:

106: 2019-07-17 02:28:52 ERROR: vma_queue_write: write error - Broken pipe
106: 2019-07-17 02:28:52 INFO: aborting backup job
106: 2019-07-17 02:28:54 INFO: unable to open file '/etc/pve/nodes/prometheus/qemu-server/106.conf.tmp.14455' - Input/output error
106: 2019-07-17 02:29:17 ERROR: Backup of VM 106 failed - vma_queue_write: write error - Broken pipe

Code:

Jul 17 06:25:07 prometheus pvesr[26885]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:25:08 prometheus pvesr[26885]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:25:09 prometheus pvesr[26885]: error with cfs lock 'file-replication_cfg': got lock request timeout
Jul 17 06:25:09 prometheus systemd[1]: pvesr.service: Main process exited, code=exited, status=5/NOTINSTALLED
Jul 17 06:25:09 prometheus systemd[1]: Failed to start Proxmox VE replication runner.
Jul 17 06:25:09 prometheus systemd[1]: pvesr.service: Unit entered failed state.
Jul 17 06:25:09 prometheus systemd[1]: pvesr.service: Failed with result 'exit-code'.
Jul 17 06:25:10 prometheus spiceproxy[8577]: worker exit
Jul 17 06:25:10 prometheus spiceproxy[3119]: worker 8577 finished
Jul 17 06:25:10 prometheus pveproxy[24370]: worker exit
Jul 17 06:25:10 prometheus pveproxy[8812]: worker exit
Jul 17 06:25:10 prometheus pveproxy[8813]: worker exit
Jul 17 06:25:10 prometheus pveproxy[3088]: worker 24370 finished
Jul 17 06:25:10 prometheus pveproxy[3088]: worker 8812 finished
Jul 17 06:25:10 prometheus pveproxy[3088]: worker 8813 finished
Jul 17 06:25:12 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:17 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:22 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:27 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:32 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:37 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:42 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:47 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:52 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:25:57 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:26:00 prometheus systemd[1]: Starting Proxmox VE replication runner...
Jul 17 06:26:00 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:01 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:02 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error
Jul 17 06:26:02 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:03 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:04 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:05 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:06 prometheus pvesr[28995]: trying to acquire cfs lock 'file-replication_cfg' ...
Jul 17 06:26:07 prometheus pve-ha-lrm[3110]: unable to write lrm status file - unable to delete old temp file: Input/output error

Alwin · Jul 18, 2019

The cluster file system (/etc/pve/) was not available while the backup was running. If it's a cluster, then check your network.

killmasta93 · Jul 18, 2019

Thanks for the reply. The server is not in a cluster.

Alwin · Jul 19, 2019

Does the server have enough resources during the backup? Are those messages showing up in the journal/syslog more frequently too?
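One rough way to answer the second question is to count how often a given message appears per service in the syslog. The sketch below is only an illustration: the function name `count_messages` and the sample list are made up for this example, the sample lines are copied from the log excerpt earlier in the thread, and the regular expression assumes the classic `Mon DD HH:MM:SS host unit[pid]: message` layout shown there.

```python
import re
from collections import Counter

# Assumed line layout (taken from the excerpt in this thread):
# "Jul 17 06:25:12 prometheus pve-ha-lrm[3110]: unable to write ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<unit>[^\[\s:]+)(?:\[\d+\])?:\s(?P<msg>.*)$"
)


def count_messages(lines, needle="Input/output error"):
    """Count lines whose message contains `needle`, grouped by service name."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and needle in match.group("msg"):
            counts[match.group("unit")] += 1
    return counts


# Three lines copied from the log posted above.
SAMPLE_LINES = [
    "Jul 17 06:25:09 prometheus pvesr[26885]: error with cfs lock "
    "'file-replication_cfg': got lock request timeout",
    "Jul 17 06:25:12 prometheus pve-ha-lrm[3110]: unable to write lrm status "
    "file - unable to delete old temp file: Input/output error",
    "Jul 17 06:25:17 prometheus pve-ha-lrm[3110]: unable to write lrm status "
    "file - unable to delete old temp file: Input/output error",
]

for unit, hits in count_messages(SAMPLE_LINES).most_common():
    print(f"{unit}: {hits}")
```

To use it on a real host, pass an open log file instead of the sample list, for example `count_messages(open("/var/log/syslog", errors="replace"))`. The path is an assumption and may differ on your system.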

killmasta93 · Jul 20, 2019

Thanks for the reply. Correct, there are enough resources, around 30 GB of RAM. It only happened when the backup got stuck. I haven't done a backup since that error. Should I try it one last time?

Alwin · Jul 22, 2019

killmasta93 said:
Thanks for the reply. Correct, there are enough resources, around 30 GB of RAM. It only happened when the backup got stuck.

I don't understand. How much RAM is available when the backups are running?
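If it helps, available memory can be sampled while the backup runs by reading the `MemAvailable` field of `/proc/meminfo`. The following is a minimal sketch, not a Proxmox tool: the function names are hypothetical, and the sample text uses invented values purely to show the parsing.

```python
def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict of integers (kB for sized fields)."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Key: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def available_gib(text):
    """Convert the MemAvailable field (reported in kB) to GiB."""
    return parse_meminfo(text)["MemAvailable"] / (1024 * 1024)


def read_available_gib(path="/proc/meminfo"):
    """Read the live value; only works on a Linux host."""
    with open(path) as handle:
        return available_gib(handle.read())


# Invented sample values for illustration only, not taken from this server.
SAMPLE_MEMINFO = """MemTotal:       32768000 kB
MemFree:         1048576 kB
MemAvailable:   31457280 kB
"""

print(f"available: {available_gib(SAMPLE_MEMINFO):.1f} GiB")
```

Calling `read_available_gib()` every few seconds during the vzdump run would show whether memory actually gets tight at the time of the failure.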

killmasta93 said:
I haven't done a backup since that error. Should I try it one last time?

Up to you. I don't believe you want to keep it in that state.

killmasta93 · Jul 23, 2019

I will post back. I'm going to run it this week, so just in case it gets stuck I won't get hammered the next day.

killmasta93 · Jul 25, 2019

So I reran it and it did not back up. I changed the USB disk and it worked. I think it had something to do with the USB disk. Very odd, but thank you.
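For anyone who lands here with the same symptoms, a quick write probe on the backup target before the nightly job might catch a failing disk earlier. This is just a sketch under assumptions: `probe_writable` is a made-up helper, the mount point in the usage note is hypothetical, and a successful small write does not prove the disk is healthy.

```python
import errno
import os
import tempfile


def probe_writable(directory):
    """Try to create and remove a small temp file in `directory`.

    Returns (True, None) on success, or (False, errno_name) on failure,
    e.g. (False, "EIO") for an Input/output error.
    """
    try:
        fd, path = tempfile.mkstemp(prefix=".probe-", dir=directory)
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        os.write(fd, b"probe")
        os.fsync(fd)  # force the data to the device, not just the page cache
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
        os.unlink(path)
    return True, None
```

Usage would look like `ok, err = probe_writable("/mnt/usb-backup")`, where `/mnt/usb-backup` stands in for wherever your backup storage is mounted.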
