[SOLVED] My data storage got wiped and I have no idea why

baretomas

New Member
Aug 13, 2019
So I got home from work on Friday and noticed that all my containers on a node had been shut down, with a lot of error messages saying my containers couldn't start.
I researched it for a bit, and after a while understood that all my .raw files for the containers had been wiped. In fact, the entire storage had been wiped and all that was left was an empty folder hierarchy.

I checked the server logs and all I could see was that some of the containers running tasks had a lot of SIGKILL shutdowns due to out-of-memory errors. This was somewhat expected, as I had been tuning the containers for optimal RAM use over the last few days. I wasn't too worried about it affecting the node, since they are in containers anyway, right?

Well... I'm not too familiar with container tech, but I couldn't see any reason why this would be able to wipe my storage completely.

I checked the node's syslog and found something which *looks* to me like some kind of recovery process, but I'm not sure. It's clear that there has been a restart. Anyway, after that block in the logs, I can see that it can't start the containers, with messages about the missing .raw files. So this happened right before the files disappeared.

So I am wondering what could have caused all this?

- Have I had some kind of security breach in one of my containers, where the intruder somehow managed to wipe the storage drive?
This theory is only based on the fact that I was running a Cardano stake node which had internet access. The other containers had no internet access. It doesn't make much sense though, as wiping the container like that would make the intruder lose access to it, and thus they wouldn't be able to do anything else. Also, why leave the folder structure?

- Is it possible that the repeated out-of-memory kills (about once every 2 hours) from some of the containers somehow kicked off a reset of the storage?

- Is it because of something else entirely?
What could in theory have caused such an incident?

I can post the part of the syslog that I believe shows some form of recovery before it failed to start the containers, but if anyone has any idea where else to look for an explanation, please feel free to suggest anything! Because I am at a loss here!

/Thanks
 
hi,

there's no known bug that causes the entire storage to be wiped.

what storage were the .raw files on? maybe it's just not mounted?
 
I was afraid of that...

A directory. I can still open the location where it is mounted, and browse the folders, but there are no files in there.
 
what about the disk itself?

when you run fdisk -l or lsblk -f you should see the info for the disk and the partitions.

is the partition empty? if not can you mount it?
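
A minimal sketch of that check in Python, for anyone who hits the same symptom: when a disk isn't mounted, the mount point directory can still show whatever folders exist on the underlying filesystem, so an empty hierarchy alone doesn't prove the data is gone. The path `/mnt/pve-storage` is a made-up example and the function name is only for illustration.

```python
import os


def storage_status(path):
    """Classify a storage directory as 'missing', 'not-mounted' or 'mounted'."""
    if not os.path.isdir(path):
        return "missing"
    # os.path.ismount() is True only when the path is the root of a mounted filesystem
    if not os.path.ismount(path):
        return "not-mounted"
    return "mounted"


if __name__ == "__main__":
    # Hypothetical mount point; replace it with the directory your storage uses.
    print(storage_status("/mnt/pve-storage"))
```

`lsblk -f` is still the quicker way to see whether the partition itself holds a filesystem; this only tells you whether the directory is currently backed by a mount.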
 
*embarrassed*

The drive wasn't mounted! I got confused because the mount point still had the correct folder structure, just with no data in it. I mounted it and presto, it's all back. But what could make it drop the mount on a standard boot?

Also... the containers still don't start. Is this odd?

conf - conf.c:run_buffer:340 - Script exec /usr/share/lxc/hooks/lxc-pve-prestart-hook 102 lxc pre-start produced output: unable to detect OS distribution

I'm running CentOS 8.

Should I open another thread on this btw? As my first problem was solved.
 
glad you didn't lose your data!

> But what could make it drop the mount on standard boot?
how did you create the storage in the first place? was it through GUI or did you make it manually? maybe you forgot to add it into /etc/fstab
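
A rough sketch of the fstab check, assuming the storage is expected to be mounted through `/etc/fstab` (a systemd mount unit could also do the job, in which case a missing fstab entry proves nothing). The sample text and the `/mnt/pve-storage` path are invented for illustration and are not taken from this node.

```python
# Invented sample; on a real node you would read the text from /etc/fstab.
SAMPLE = """\
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/pve/root / ext4 errors=remount-ro 0 1
/dev/pve/swap none swap sw 0 0
"""


def fstab_mount_points(fstab_text):
    """Return the mount points (second field) of all active fstab entries."""
    points = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2:
            points.append(fields[1])
    return points


def is_in_fstab(mount_point, fstab_text):
    """True if mount_point has an entry, ignoring a trailing slash."""
    wanted = mount_point.rstrip("/") or "/"
    return any((p.rstrip("/") or "/") == wanted
               for p in fstab_mount_points(fstab_text))


if __name__ == "__main__":
    # Hypothetical storage path; the sample above has no entry for it.
    print(is_in_fstab("/mnt/pve-storage", SAMPLE))
```

If this reports no entry for the storage directory, nothing in fstab will bring the disk back after a reboot, which would match the behaviour described above.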

> Also... the containers still don't start. Is this odd?
>
> conf - conf.c:run_buffer:340 - Script exec /usr/share/lxc/hooks/lxc-pve-prestart-hook 102 lxc pre-start produced output: unable to detect OS distribution
>
> I'm running CentOS 8.
if none of them are starting with that error, it's usually an indicator that the volumes aren't mounted. are you using ZFS by chance? maybe your pool isn't mounted either
 
Couldn't remember how I mounted it, but it was missing from fstab, so I added it there.

It's an ext4 mount, btw.
I must've boinked the config here, because the node won't come back up after I modified fstab and rebooted. I won't have physical access for a few hours, so I have to wait until then.

But, before I rebooted, I checked the Proxmox GUI, and I saw the .raw files in the content list on the storage. So shouldn't that indicate that it's a-ok?
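
The thread never says what was wrong with the fstab edit, so the following is only a sketch of the kind of sanity check one could run on a new entry before rebooting a remote node. The checks (field count, absolute mount point, the `nofail` option, numeric dump/pass fields) are my assumptions about common mistakes, and the example entry is invented.

```python
def check_fstab_line(line):
    """Return a list of warnings for one fstab entry (empty list = nothing obvious)."""
    fields = line.split()
    if not 4 <= len(fields) <= 6:
        return [f"expected 4 to 6 fields, found {len(fields)}"]
    device, mount_point, fstype, options = fields[:4]
    warnings = []
    if fstype != "swap" and not mount_point.startswith("/"):
        warnings.append("mount point is not an absolute path")
    if "nofail" not in options.split(","):
        warnings.append("no 'nofail' option: boot may stop if the device is missing")
    # The optional fifth and sixth fields (dump, pass) should be numbers.
    for name, value in zip(("dump", "pass"), fields[4:]):
        if not value.isdigit():
            warnings.append(f"{name} field should be a number")
    return warnings


if __name__ == "__main__":
    # Invented entry: UUID and path are placeholders, not values from this thread.
    print(check_fstab_line("UUID=abcd /mnt/pve-storage ext4 defaults 0 2"))
```

A clean result here doesn't guarantee the node will boot; it only catches the obvious slips before you lose remote access.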
 
Yeah. Turns out only some of the containers have that issue. There are 4 others that work fine. Not sure if there is much difference between them either.
edit: yep. The ones that run are actually just backups of the ones that don't, restored into new images.
 
Thanks. I resolved all issues by doing a backup and restore on the images.

Thanks for the help with my apparently very n00b problem!

edit: ooh... doesn't work for my CentOS 8 images.
 