[SOLVED] LXC File system issues. Proxmox showing >1000% disk utilisation. df -h showing negative usage. Cannot figure out how to fix this.

trotski94

New Member
Jun 9, 2020
First post - apologies if I'm in the wrong place or doing something wrong. I'm starting to pull my hair out over this one.
This is my first time doing anything with LXCs and I'm feeling way out of my depth right now. The summary page in Proxmox for my LXC is showing "Bootdisk size 1120.99% (16.15 TiB of 1.44 TiB)", which is obviously wrong. I tried to run fsck, but I must be doing something wrong and no amount of google-fu is putting me on the right track. I've tried using "shutdown -Fr now", but that seems to have done nothing. I've also tried to mount root as read-only and run fsck on it, but "mount -o remount,ro /" tells me it cannot mount /dev/loop1 read-only. Doing "umount /" tells me /: not mounted, and "umount /dev/loop1" gives the same message.
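
For reference, the steps described above boil down to roughly the following (a sketch reconstructed from the description; the results are paraphrased from the post):

Code:
# inside the container: request a filesystem check on the next boot, then reboot
shutdown -Fr now                 # seemed to have no effect

# try to get the root filesystem read-only so fsck can be run against it
mount -o remount,ro /            # fails: cannot mount /dev/loop1 read-only
umount /                         # fails: /: not mounted
umount /dev/loop1                # fails with the same message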

Ultimately I know the cause of this - I only had a single storage set up in Proxmox, and it was used for everything, including backups. A weekly backup ran and used up all the available disk space, starving all the LXCs and VMs of any ability to write to disk and screwing up a bunch of them. I restored most of them from backups, but unfortunately this is the only LXC that wasn't backed up (because I don't have the space).

I'd just rebuild it from scratch, but it has 1.0TB of files on it and my total server space is only 1.8TB, so I can't have two instances side by side... Any advice or guidance would be appreciated, because I'm completely lost here.
 
Please post the config of the container (`pct config $VMID`) and your storage configuration (`cat /etc/pve/storage.cfg`) - is there anything in the journal from the time the disk was full?

I'd just rebuild it from scratch, but it has 1.0TB of files on it and my total server space is only 1.8TB, so I can't have two instances side by side... Any advice or guidance would be appreciated, because I'm completely lost here.
any chance to get some additional diskspace there (maybe even in form of an external USB-hdd?) - it's always quite scary to have to work on the last available copy of data
 
any chance to get some additional diskspace there (maybe even in form of an external USB-hdd?) - it's always quite scary to have to work on the last available copy of data

Afraid not, it's rented hardware in another country. Worst comes to worst, I can pull the data off, rebuild, and push it back up... but it's going to take days to weeks with my connection.


pct config output:
Code:
arch: amd64
cores: 2
hostname: Storage
memory: 1024
net0: name=eth0,bridge=vmbr0,hwaddr=6E:7D:8E:25:C8:C2,ip=dhcp,ip6=dhcp,type=veth
ostype: debian
rootfs: local:105/vm-105-disk-0.raw,size=1500G
swap: 1024
unprivileged: 1

storage.cfg:
Code:
dir: local
        path /var/lib/vz
        content snippets,images,vztmpl,iso,backup,rootdir
        maxfiles 0
        shared 0
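
Worth noting: `maxfiles 0` on that storage means there is no limit on the number of backups kept per guest, which is how a weekly backup job can eventually fill the only volume. One possible mitigation would be to cap retention on that storage - a sketch only; the value of 2 is an example, not something from the thread:

Code:
dir: local
        path /var/lib/vz
        content snippets,images,vztmpl,iso,backup,rootdir
        maxfiles 2
        shared 0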

Nothing I can see in the journal... although honestly, this issue happened over a week ago and I thought I'd gotten away with just restoring from backups (some VMs/LXCs wouldn't even boot; I thought I was lucky that this one seemed unaffected). But now, over a week later, I've noticed this.
 
Hmm - I would make a backup in any case, if at all possible, if you value the data...
If you stop the container you can run fsck using `pct fsck VMID`.
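
A minimal sketch of that sequence, assuming the container ID 105 from the config posted above:

Code:
pct stop 105     # the container must be stopped first
pct fsck 105     # runs fsck against the container's root filesystem
pct start 105    # start it again once the check comes back clean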
 
pct fsck fails o_O

Code:
fsck from util-linux 2.29.2
/var/lib/vz/images/105/vm-105-disk-0.raw: Group descriptor 8112 has invalid unused inodes count 8190.  FIXED.
/var/lib/vz/images/105/vm-105-disk-0.raw: Group descriptor 8289 has invalid unused inodes count 8189.  FIXED.
/var/lib/vz/images/105/vm-105-disk-0.raw: Inode 65011796 extent tree (at level 2) could be narrower.  IGNORED.
[REMOVED SIMILAR IGNORED MESSAGES FOR POST LENGTH]
/var/lib/vz/images/105/vm-105-disk-0.raw: Inode 65014591 extent tree (at level 2) could be narrower.  IGNORED.
/var/lib/vz/images/105/vm-105-disk-0.raw: Inode 65014619 extent tree (at level 2) could be narrower.  IGNORED.
/var/lib/vz/images/105/vm-105-disk-0.raw: Inode 65015472 extent tree (at level 2) could be narrower.  IGNORED.
/var/lib/vz/images/105/vm-105-disk-0.raw: Deleted inode 66846724 has zero dtime.  FIXED.
/var/lib/vz/images/105/vm-105-disk-0.raw: Inodes that were part of a corrupted orphan linked list found.

/var/lib/vz/images/105/vm-105-disk-0.raw: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)
command 'fsck -a -l /var/lib/vz/images/105/vm-105-disk-0.raw' failed: exit code 4


Currently running it manually as it suggests
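
The manual run it asks for would be essentially the same command pct invoked, just without -a, so that fsck prompts before each repair (a sketch based on the failing command shown above):

Code:
fsck -l /var/lib/vz/images/105/vm-105-disk-0.raw
# or add -y to answer yes to every repair prompt automatically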
 
Great to hear :)

And yes - the last time I ran into a broken disk was also the time I really started making backups! :)
Please mark the thread as 'SOLVED'.
 
