Web interface login not working after failed backup

Aug 30, 2021
Hello

We have a backup job that runs every Saturday on our servers, writing to local storage.

On one of them, the backup failed:
Code:
VMID    NAME    STATUS    TIME    SIZE    FILENAME
103    APP2    FAILED    00:24:33    vma_queue_write: write error - Broken pipe
104    VM 104    FAILED    00:00:00    unable to open file '/etc/pve/nodes/XXX/qemu-server/104.conf.tmp.9793' - Input/output error
106    VM 106    FAILED    00:00:00    unable to open file '/etc/pve/nodes/XXX/qemu-server/106.conf.tmp.9793' - Input/output error
107    VM 107    FAILED    00:00:00    unable to open file '/etc/pve/nodes/XXX/qemu-server/107.conf.tmp.9793' - Input/output error
108    VM 108    FAILED    00:00:00    unable to open file '/etc/pve/nodes/XXX/qemu-server/108.conf.tmp.9793' - Input/output error
TOTAL    00:24:33    0KB

Looking at the detailed log, the backup of VM 103 crashed at 41%:
Code:
103: 2021-08-28 05:24:33 INFO:  41% (211.7 GiB of 505.0 GiB) in 24m 27s, read: 107.0 MiB/s, write: 107.0 MiB/s
103: 2021-08-28 05:24:33 ERROR: vma_queue_write: write error - Broken pipe
103: 2021-08-28 05:24:33 INFO: aborting backup job
103: 2021-08-28 05:24:33 INFO: resuming VM again
103: 2021-08-28 05:24:35 ERROR: Backup of VM 103 failed - vma_queue_write: write error - Broken pipe

After that, I can only log in to the server over SSH. When I try to log in to the web interface with either realm (Linux PAM or Active Directory), it says the login failed.

When I check the server with journalctl, I see many of the following lines:
Code:
 unable to write lrm status file - unable to open file '/etc/pve/nodes/nsXXX/lrm_status.tmp.4177' - Input/output error
 authkey rotation error: cfs-lock 'authkey' error: got lock request timeout
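To get a rough idea of how often each of these two messages occurs, the journal output can be piped through a small filter. This is only a sketch: `count_pve_errors` is a hypothetical helper name (not a Proxmox tool), and it matches only the two message fragments quoted above.

```shell
# count_pve_errors: read log text on stdin and count the two error
# patterns quoted above. Hypothetical helper, not part of Proxmox.
count_pve_errors() {
    awk '
        /Input\/output error/      { io++ }     # failed writes under /etc/pve
        /got lock request timeout/ { lock++ }   # cfs-lock timeouts
        END { printf "io_errors=%d lock_timeouts=%d\n", io, lock }
    '
}
```

It could be used as `journalctl -b --no-pager | count_pve_errors` to count occurrences since the last boot.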

The only way to deal with it is to restart the service. It's not very convenient as it is a PROD/LIVE server.

I have already tried several commands:
Bash:
service pvedaemon restart
service pveproxy restart
service pvestatd restart

Here is my available disk space:

Bash:
root@ns3181572:/mnt# df -h
Filesystem        Size  Used Avail Use% Mounted on
udev               63G     0   63G   0% /dev
tmpfs              13G  498M   13G   4% /run
rpool/ROOT/pve-1  1.1T  820G  211G  80% /
tmpfs              63G   43M   63G   1% /dev/shm
tmpfs             5.0M     0  5.0M   0% /run/lock
tmpfs              63G     0   63G   0% /sys/fs/cgroup
rpool             211G  128K  211G   1% /rpool
rpool/ROOT        211G  128K  211G   1% /rpool/ROOT
rpool/data        211G  128K  211G   1% /rpool/data
/dev/fuse          30M   32K   30M   1% /etc/pve
tmpfs              13G     0   13G   0% /run/user/0
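The df output shows /etc/pve as a /dev/fuse mount that is almost empty, so the Input/output errors are probably not caused by a full filesystem. A simple write probe can show whether a directory accepts writes at all. This is a minimal sketch: the function name `check_writable` and the probe file name are made up for illustration, and it assumes that briefly creating a small file in the target directory is acceptable.

```shell
# check_writable DIR: try to create and delete a small probe file in DIR.
# Hypothetical helper; the probe file name is an arbitrary choice.
check_writable() {
    dir="$1"
    probe="$dir/write_probe.$$"
    # Run the redirection in a subshell so its error message can be silenced.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        return 1
    fi
}
```

Running `check_writable /etc/pve` as root would then report whether the mount currently accepts writes.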

Is there any way to resolve the issue without restarting the server?

Thanks in advance
 
