[SOLVED] rpool/ROOT/pve-1 taking up entire disk

RustySnail

New Member
Sep 10, 2022
Something happened that caused my Proxmox install drive to completely fill up and run out of space, making my server unusable: I can't log in to the web UI, start any CTs or VMs, or do much of anything else. Luckily I can still SSH into the server, so there may be hope yet, but I can't see how to clear up disk space.

Running df -h

Code:
df -h
Filesystem                      Size  Used Avail Use% Mounted on
udev                             95G     0   95G   0% /dev
tmpfs                            19G   19M   19G   1% /run
rpool/ROOT/pve-1                 41G   41G     0 100% /
tmpfs                            95G     0   95G   0% /dev/shm
tmpfs                           5.0M     0  5.0M   0% /run/lock
efivarfs                        304K  151K  149K  51% /sys/firmware/efi/efivars
rpool                           128K  128K     0 100% /rpool
rpool/var-lib-vz                 12G   12G     0 100% /var/lib/vz
ProxmoxFast                     539G  384K  539G   1% /ProxmoxFast
rpool/ROOT                      128K  128K     0 100% /rpool/ROOT
rpool/data                      128K  128K     0 100% /rpool/data
tmpfs                            19G     0   19G   0% /run/user/0

I see rpool/ROOT/pve-1 is using 41G and rpool/var-lib-vz is using 12G, which is a few backups. I could delete the backups, but I'm not sure how: pct and pvesm won't work because the access control lists can't be loaded. Looking at pve-1, there doesn't appear to be anything in there to delete. I'm at a loss on how to repair the install. I know I could just reinstall, but I would need to reconfigure my PCIe passthrough and vGPU, which I'd like to avoid.

Code:
root@pve:~# du -h -x -d1 /
512     /srv
512     /opt
25K     /tmp
3.0K    /mnt
512     /home
512     /media
246M    /var
3.8M    /etc
151M    /boot
1.6G    /usr
54K     /root
1.9G    /
root@pve:~# du -h -x -d1 /var
80M     /var/cache
1.8M    /var/spool
888K    /var/backups
54M     /var/lib
26K     /var/tmp
512     /var/opt
512     /var/mail
512     /var/local
111M    /var/log
246M    /var

I've run various other commands to see what is taking up all the space, but there is nothing I can see that would be the culprit.
 
As a rule of thumb, PVE doesn't fill up your root/local space "on its own". Most of the time it's caused by an incorrect or inaccessible mount, so output intended for that mount gets written to local space instead. Some of the time it's users who are still using the default backup location, which will also fill up over time.

bbgeek17 has kindly pointed you to a post on this subject.
 
Please review https://forum.proxmox.com/threads/no-space-left-on-device.120901/#post-525313


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
That doesn't seem to help the issue; the only devices in /mnt are from Proxmox and they don't show that they contain much data.

What is the best way to back up my VMs and CTs? All of them are installed on a separate drive, but the conf files would be on the OS disk. I do have PBS, but my backups are a week old. Would manually copying the config files for each CT and then restoring them in the new install work?
 
In your particular case, this ^ would be my first place to check.

A poor man's way would be: tar -zcvf /someplacewithspace/myconfig.tgz /etc/pve
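That one-liner can be sanity-checked anywhere; here is a minimal sketch using throw-away temp directories as stand-ins (on the node itself the source would be /etc/pve and the destination any filesystem with free space):

```shell
# Stand-ins for the demo only: $src plays the role of /etc/pve,
# $dst the backup location with free space.
src=$(mktemp -d); dst=$(mktemp -d)
echo "arch: amd64" > "$src/100.conf"       # fake container config for the demo
tar -zcf "$dst/myconfig.tgz" -C "$src" .   # same flags as above, minus -v
tar -ztf "$dst/myconfig.tgz"               # list the archive to verify it
```

Note that /etc/pve only contains files while the pve-cluster service is running, so the tarball has to be made on the live system, not from a rescue environment.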


ProxmoxFast is my NVMe array with all my VMs. I haven't messed with it and don't see how it could have mounted itself inside PVE.

I have physically removed every drive except the boot drives and the NVMe drives to see if that helped, but no luck, so maybe it is an issue with ProxmoxFast or something else entirely.

At this point though, as long as I can back up everything, I can probably get NVIDIA vGPU working again in less time than it takes to sort this mess out.
 
It sounds like you did not fully understand what is the suspected cause of your space issue.

The working theory (based on nothing more than the first output you provided) is that at one point /ProxmoxFast was not mounted, yet you wrote data into it. Then /ProxmoxFast was mounted (automatically or manually). This hid the data that was previously put into a non-mounted /ProxmoxFast. It's occupying the space, yet you can't see it.

To confirm, one way or the other, you need to _unmount_ (umount) /ProxmoxFast.
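The masking effect is easy to reproduce without touching the real pool. Below is a simulation in an unprivileged mount namespace (assumes `unshare` from util-linux and kernel support for user namespaces); on the real host the equivalent check is simply `umount /ProxmoxFast` followed by `du -sh /ProxmoxFast`:

```shell
# Write a file into a plain directory, then mount a filesystem over it:
# the file disappears from view but still occupies space underneath
# until the mount is removed again.
out=$(unshare -rm sh -c '
  d=$(mktemp -d)
  echo "hidden data" > "$d/file"   # written while nothing is mounted here
  mount -t tmpfs none "$d"         # a filesystem is now mounted on top
  ls "$d"                          # prints nothing: the file is masked
  umount "$d"                      # unmount to reveal what is underneath
  cat "$d/file"
')
echo "$out"                        # the original file is still there
```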


P.S. It seems highly unlikely that the disk mounted on /ProxmoxFast has _all_ your VMs on it, given only 384K of 539G is used...


Good luck


 
I see, so what you are saying is that some sort of error occurred where ProxmoxFast was unmounted, a VM or CT attempted to save to disk and wrote to my OS disk instead, and now it is full? Something does appear to have gone seriously wrong at some point, as /etc/pve is empty, so all my config files are gone. I hadn't done anything other than reboot a few times; I hadn't tried to delete any data yet. Looking back, I did have an issue while trying to download a CT template, using a network drive on my NAS to store the templates. An unrelated issue meant the drive was no longer connected, so I unmounted and remounted it and it seemed to work fine.

Live and learn, I guess; I might start taking more frequent backups from now on.
 
If I boot from a live usb they should all be there then?
No, read the link. It's a database and its contents are presented via FUSE as files. Without the PVE services running (and they won't be when booting something other than PVE) there won't be any files.

Find out what's consuming the space, delete that stuff, reboot and see if it is working again. And once you've found out what's consuming the space, make sure it won't happen again, like fixing the problem that is spamming the logs or adding an "is_mountpoint" option to a directory storage so an unmounted filesystem can't be written to by PVE.
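For the "find out what's consuming the space" step, du piped through sort walks down the tree quickly. A small demo on a throw-away tree (on the real host you would start with `du -x -d1 /` as earlier in the thread; `zfs list -o space -r rpool` gives the per-dataset view, including snapshot usage that du can't see):

```shell
# Build a tiny tree with one big and one small directory, then rank them.
top=$(mktemp -d)
mkdir -p "$top/big" "$top/small"
head -c 1048576 /dev/zero > "$top/big/blob"    # 1 MiB file
head -c 1024    /dev/zero > "$top/small/blob"  # 1 KiB file
du -k -d1 "$top" | sort -n                     # ascending: biggest entries last
```

Repeat the same command one level deeper inside whichever directory dominates until the culprit file or directory is found.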
 
Okay, so I'm back to trying to fix the install then. I've unmounted ProxmoxFast along with every other drive on the system, so the only thing left is the OS disks.

Code:
Filesystem        Size  Used Avail Use% Mounted on
udev               95G     0   95G   0% /dev
tmpfs              19G   11M   19G   1% /run
rpool/ROOT/pve-1   41G   41G     0 100% /
tmpfs              95G     0   95G   0% /dev/shm
tmpfs             5.0M     0  5.0M   0% /run/lock
efivarfs          304K  206K   94K  69% /sys/firmware/efi/efivars
rpool             128K  128K     0 100% /rpool
rpool/var-lib-vz   12G   12G     0 100% /var/lib/vz
rpool/ROOT        128K  128K     0 100% /rpool/ROOT
rpool/data        128K  128K     0 100% /rpool/data
tmpfs              19G     0   19G   0% /run/user/0

rpool/ROOT/pve-1 is hogging all the disk space, but I am still not sure how to go about emptying it. rpool/var-lib-vz has 2 LXC backups I can safely delete if I could find them. To me the easiest solution seems to be to delete the backups, free up some disk space, then back everything up manually, reinstall, and restore.
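Even without pvesm working, backups can be found and removed by hand; on the default local storage, vzdump writes them under /var/lib/vz/dump as vzdump-* files. A sketch on a throw-away directory standing in for the real path (the file names are made up for the demo):

```shell
# $dump stands in for /var/lib/vz/dump in this demo.
dump=$(mktemp -d)
touch "$dump/vzdump-lxc-100-2022_09_01.tar.zst" \
      "$dump/vzdump-lxc-101-2022_09_01.tar.zst"
find "$dump" -name 'vzdump-*'          # review the list before deleting
find "$dump" -name 'vzdump-*' -delete  # then reclaim the space
```

On a ZFS root it is also worth checking `zfs list -t snapshot`: deleted files keep consuming space as long as a snapshot still references them.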
 
Wait, I think I found the issue: ProxmoxFast wasn't the problem, but another ZFS array was. Looks like it also had the problem where it was unmounted and data was then written to the underlying directory, totaling 39GB.

You mentioned you can add an is_mountpoint option so no data will be written if the drives are unmounted; how does one go about setting that?

Code:
pvesm set YourStorageID --is_mountpoint yes

I found this, so setting pvesm set ProxmoxFast --is_mountpoint yes and pvesm set downloads --is_mountpoint yes would stop Proxmox and any LXC from writing data if the disks become unmounted for any reason?
 
Yes. Or "pvesm set <YourStorageID> --is_mountpoint <AbsolutePathToMountpoint>" in case the folder you are using as a directory storage isn't the mountpoint itself but a folder below it.
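For reference, pvesm just records the option in /etc/pve/storage.cfg; a sketch of what the resulting entry might look like (the storage ID and path are the OP's, the content line is illustrative):

```
dir: ProxmoxFast
	path /ProxmoxFast
	content images,rootdir
	is_mountpoint yes
```

With this set, PVE refuses to use the storage while nothing is mounted at the given path instead of silently writing into the underlying directory.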
 
