Backup crashes WebUI / Proxmox Environment

Gamienator

Hello everyone,
for the past two days I have had a really weird issue.

I have a daily backup routine to an external hard drive. Today I noticed that I hadn't received a success mail. Logging in to my Proxmox server showed me this screen:

1673855966282.png

I was shocked, but every service was still reachable. SSHing into the Proxmox host was possible, so I tried
Code:
pct list
with the following result:

1673856031126.png

A reboot helped. I saw that LXC 107 was still locked, so I unlocked it and started the backup job again. Soon afterwards the same thing happened: 107 was locked and the WebUI showed the same behaviour. I'm now restarting again and will exclude 107 from the backup.

Can anybody give me a hint as to which logs I could check to find out what is going on there?
Thanks!
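
For anyone hitting the same stuck lock, here is a minimal sketch of checking for it from the shell. It assumes the lock shows up as a `lock:` line in the container's config file under /etc/pve/lxc/. The helper name `ct_lock` is made up for illustration; `pct unlock` is the stock command.

```shell
#!/bin/sh
# Hypothetical helper: read a container config on stdin and print the
# value of its "lock:" line (for example "backup"), or nothing if the
# container is not locked.
ct_lock() {
    sed -n 's/^lock:[[:space:]]*//p'
}

# Example for container 107 (run on the Proxmox host):
#   ct_lock < /etc/pve/lxc/107.conf   # prints the lock type if one is set
#   pct unlock 107                    # clears a stale lock
```

Clearing the lock only removes the symptom; if the backup hangs again, the lock will come back.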
 
Quite some versions ago, with slower drives (very few IOPS) and before PBS, I had similar issues where the backup I/O swamped the system. I separated the (logical) disk for the Proxmox host installation from the disk for VMs (and from the backup storage), which prevented Proxmox from becoming unresponsive during heavy I/O (from backups or VMs).
Maybe a different I/O scheduler that balances the workload from different sources could help? Or maybe your issue is totally different? Check journalctl for entries from around the time of the issue.
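
A rough sketch of both checks, in case it helps. The kernel lists the available schedulers per block device in /sys/block/<dev>/queue/scheduler and marks the active one with square brackets. The function names `active_scheduler` and `journal_window` are only illustrative, and the timestamps in the usage comment are placeholders, not the actual time of the hang.

```shell
#!/bin/sh
# Read a scheduler line such as "mq-deadline kyber [bfq] none" on stdin
# and print only the bracketed (active) entry.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Print the active scheduler for every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    dev=${dev%%/*}
    printf '%s: %s\n' "$dev" "$(active_scheduler < "$f")"
done

# Show journal entries between two timestamps.
# Usage: journal_window "2023-01-01 00:00" "2023-01-01 03:00"
journal_window() {
    journalctl --since "$1" --until "$2" --no-pager
}
```

Pick the window so that it covers the backup start time and the moment the WebUI stopped responding.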
 
Hey again,

I found the reason for the crashes. On my Proxmox node I'm running several unprivileged LXC containers that run Docker. In my discussion here we figured out that the storage driver was vfs. After digging in, I changed everything to
Code:
fuse-overlayfs
as the storage driver.

I don't know why, but on that container the storage driver was
Code:
overlay2
which caused a lot of trouble! After setting the storage driver to
Code:
fuse-overlayfs
the backups are running fine again and the node doesn't crash anymore :)
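
In case someone wants to check their own containers, a small sketch of how the driver can be inspected and set. The function names are made up for illustration; `docker info --format '{{.Driver}}'` and the `storage-driver` key in daemon.json are standard Docker, and the driver Docker itself calls this is `fuse-overlayfs`. The sketch assumes you have no other settings in daemon.json, because the helper overwrites the target file.

```shell
#!/bin/sh
# Print the storage driver the running Docker daemon is using.
current_driver() {
    docker info --format '{{.Driver}}'
}

# Write a minimal daemon.json that selects the given storage driver.
# Caution: this overwrites the target file; merge by hand if the file
# already contains other settings.
# Usage: write_driver_config <driver> <path>
write_driver_config() {
    driver=$1
    target=$2
    printf '{\n  "storage-driver": "%s"\n}\n' "$driver" > "$target"
}

# Example (inside the LXC container that runs Docker):
#   current_driver
#   write_driver_config fuse-overlayfs /etc/docker/daemon.json
#   systemctl restart docker
```

Keep in mind that images and containers created under the old driver are not visible after switching, so they have to be pulled or recreated.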
 
Hey everyone,
sadly the issue is not solved! My backup has now crashed two more times, locking up my node. Luckily I didn't have to reboot; I only had to kill -9 a start process of that container. Now the following issue is reported in my backup log:

Code:
short read on command socket (16 != 0)

Are there other logs as well, or should I investigate with journalctl?
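
A few places that might be worth a look, sketched under the assumption that the lockup comes from processes stuck on I/O; that is a guess, not something the log line above confirms. The helper name `d_state` is made up. It filters `ps` output down to processes in uninterruptible sleep (STAT starting with "D"), which are the ones that cannot be killed normally.

```shell
#!/bin/sh
# Filter `ps` output (on stdin) down to the header line plus processes
# whose STAT column (second field) starts with "D".
d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Example (run on the Proxmox host while the backup hangs):
#   ps -eo pid,stat,wchan:32,args | d_state
#   journalctl -k -b --no-pager | grep -i 'blocked for more than'
#   journalctl -u pvedaemon -u pveproxy -u pvestatd --since today --no-pager
```

The second command searches the kernel log of the current boot for hung-task warnings, and the third shows the journal of the Proxmox services behind the WebUI.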
 
