Critical server issue - After kernel update today LXC containers deleted by system and restored backups unreachable

Ordent

Today I did an update of two servers using the Proxmox web interface, which included a kernel update. After rebooting, on both servers several LXC container files had disappeared (only the config file remained in /etc/pve/lxc). This is a serious problem!

Code:
TASK ERROR: volume 'local2:510/vm-510-disk-0.raw' does not exist

After restoring backups of the missing containers, none of the services is reachable except for ping and ssh.

Code:
# wget lxccontainer.domain.name/index.html
--2020-09-06 09:29:49--  http://lxccontainer.domain.name/index.html
Resolving lxccontainer.domain.name (lxccontainer.domain.name)... 5.200.23.121
Connecting to lxccontainer.domain.name (lxccontainer.domain.name)|5.200.23.121|:80... failed: No route to host.


- Has anyone else had such a problem with LXC files disappearing / being deleted by the system?
- Any suggestions on what could fix the unreachable services in the containers? The restored backups are from working containers, so in principle the only thing that changed is the PVE system itself.

No firewalls are active. PVE is at the latest version, upgraded from the latest previous version (the server was installed recently). The LXC containers run CentOS 8 64-bit, fully updated.

Your thoughts are highly appreciated!

Kind regards,

Ordent
 
Since Proxmox IT support does not seem to be available on Sundays to offer suggestions, I had to install new servers to replace all the disappeared servers and failed restores (every restore was unreachable) and copy the user files over via the host.

Most of the users are online again but it is a serious concern that a backup that I made 5 minutes before restarting one of the servers does not function when it is restored (the container is running but no services can be reached aside from ping and ssh).
 
neither upgrades nor reboots delete containers.. it sounds like your storage has/had some kind of problem, or you have a race on bootup when mounting storages. without details and logs, it's impossible to tell what happened though.
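
as a rough sketch of the kind of details that would help here, something like the following could be run on the host and the output posted. the `collect` and `host_report` helpers are made up for illustration; the storage name `local2` and container ID `510` are taken from the error message above, everything else is an assumption about a typical setup.

```shell
#!/bin/sh
# hypothetical helper: run a command only if it is installed, and label its
# output so the report stays readable.
collect() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $* =="
        "$@" 2>&1 || echo "-- exited with status $? --"
    else
        echo "== skipped (not installed): $1 =="
    fi
}

# hypothetical wrapper: the details most relevant to a missing volume.
host_report() {
    collect pveversion -v                    # installed package versions
    collect pvesm status                     # is every storage active?
    collect pvesm list local2                # volumes PVE sees on that storage
    collect pct config 510                   # what the container config expects
    collect findmnt                          # what is actually mounted right now
    collect journalctl -b -p err --no-pager  # errors logged since this boot
}

host_report
```

if `local2` shows up as inactive, or the path behind it is not listed by `findmnt`, the volumes may still be on disk and simply not mounted, which would fit a mount race at boot rather than a deletion.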

as for your backup/restore issue - that looks like something specific to your network and/or setup, but again, no details given so not possible to tell exactly. but this is the reason why backups alone are never sufficient, you also need to test your restore routine regularly and verify it actually works!
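
one thing that could be checked inside a restored container (for example after `pct enter 510` on the host) is whether the web service is listening at all and whether a firewall is running in the container. "No route to host" while ping and ssh still work is often what a client sees when a firewall rejects the connection with an ICMP "prohibited" reply, but that is only a guess for this case. the `listening_on` helper below is hypothetical, and port 80 is taken from the wget output above.

```shell
#!/bin/sh
# hypothetical helper: read `ss -tln` output on stdin and succeed if a TCP
# socket is listening on port $1 (matches 0.0.0.0:80, [::]:80, *:80, ...).
listening_on() {
    awk -v suffix=":$1\$" '$1 == "LISTEN" && $4 ~ suffix { found = 1 } END { exit !found }'
}

# is anything listening on the port that wget could not reach?
if command -v ss >/dev/null 2>&1; then
    if ss -tln | listening_on 80; then
        echo "port 80: a service is listening"
    else
        echo "port 80: nothing is listening"
    fi
fi

# firewalld is the default firewall on CentOS 8; --state reports whether it
# is running, --list-all shows the services and ports it allows.
if command -v firewall-cmd >/dev/null 2>&1; then
    firewall-cmd --state || true
    firewall-cmd --list-all || true
fi
```

if a service is listening and firewalld turns out to be running without http in its allowed services, that would point at the container rather than at PVE; if nothing is listening, the service itself did not start after the restore.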
 
Thank you for your response. Since the problem occurred on two separate servers (with the same Proxmox version and hardware) and directly after an upgrade of the Proxmox packages (through the web interface), I am concerned that the cause is something other than my system.

I have rebooted the server quite often over the past 4 weeks (fresh install) without any issues and I have successfully restored backups at least 12 times over the same period (migrating users).

It may still be an issue with those two servers, but the fact that it occurred on both servers immediately after an upgrade at least merits taking note. I submitted a ticket to technical support yesterday with server credentials, so that all the details can be reviewed and perhaps a cause can be found.
 
