Host Disk full - no access to GUI - How to safely clean up /var/tmp and other affected folders

Evito
Hello Everyone,

I've searched through the forums here for a bit and have determined I am having a space issue on my Proxmox host drive.
This is not an inodes issue as far as I've determined.

These drives are 2x 256 GB SSDs running in a ZFS pool. Ever since I upgraded to a recent Proxmox version to test Win11 and TPM, my Proxmox server has been filling up the host drive. I was able to reboot the host last weekend and get back into the GUI and didn't think much of it - but now this weekend the disk is full, I can't upgrade the packages, and I can't even run pveversion because the drive is full.

I'd really like to clean the host drive and then set some limit so it never goes beyond 90% capacity or something similar, so I can at least still access the GUI and do some housekeeping on the drive. I saw I could set cron jobs to do it, but I feel like setting a limit may just be easier.
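From what I've read in other threads, one way to guarantee some free space on a ZFS root pool is to park a reservation on an otherwise empty dataset, so the root filesystem can never eat the whole pool. I haven't tried this yet, so treat it as a sketch - the dataset name "reserved" and the 8G size are just placeholders:

# create an empty dataset whose only job is to hold space in reserve
zfs create rpool/reserved
# reserve 8G of the pool for it; other datasets (including the root fs) can no longer use that space
zfs set reservation=8G rpool/reserved
# in an emergency the space can be handed back with:
# zfs set reservation=none rpool/reserved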

Here is the output of what I can pull - the system is accessible without any issues over the CLI, but I'm a Linux newb and I'm not sure how to rm files safely without destroying the system.

So far I've determined the /var + /tmp logs are the largest culprits; as I dug into forum posts, everyone is saying to dig in further and then delete. At this point I think I just need to be pointed in a direction on how to safely cd into them and determine what can be removed. Here are my outputs from the server:

I can see that essentially rpool/ROOT/pve-1 is just full, but I am unable to figure out how to get into the rpool and delete the junk, so I need some help.
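From what I gather in other posts, the way to narrow this down is to drill down one level at a time with du and sort until the space hog shows up - something like this, I assume (the paths are just where I'd start based on the outputs below):

# show the size of each top-level directory on the root filesystem, biggest last
du -xh --max-depth=1 / 2>/dev/null | sort -h
# then repeat one level deeper on the biggest entry, e.g.
du -xh --max-depth=1 /var/lib 2>/dev/null | sort -h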

Proxmox CLI outputs quoted below:
$ df -h output

Filesystem                 Size  Used  Avail  Use%  Mounted on
udev                        48G     0    48G    0%  /dev
tmpfs                      9.5G  910M   8.6G   10%  /run
rpool/ROOT/pve-1           229G  229G      0  100%  /
tmpfs                       48G     0    48G    0%  /dev/shm
tmpfs                      5.0M     0   5.0M    0%  /run/lock
SSD_2Tb                    1.4T  128K   1.4T    1%  /SSD_2Tb
rpool                      128K  128K      0  100%  /rpool
SSD_2Tb/subvol-210-disk-0   10G  717M   9.4G    7%  /SSD_2Tb/subvol-210-disk-0
SSD_2Tb/subvol-205-disk-0   25G  2.5G    23G   10%  /SSD_2Tb/subvol-205-disk-0
rpool/data                 128K  128K      0  100%  /rpool/data
rpool/ROOT                 128K  128K      0  100%  /rpool/ROOT
HDD_8Tb                    976G  128K   976G    1%  /HDD_8Tb
tmpfs                      9.5G     0   9.5G    0%  /run/user/0


$ fdisk -l output (I removed the other drives; the concern is the 2x 256 GB SSDs storing the host OS, which are full)

fdisk -l

Disk /dev/sdb: 238.47 GiB, 256060514304 bytes, 500118192 sectors
Disk model: Samsung SSD 850
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 7EC2FABE-DF93-4080-8F08-48FE5DB368BC

Device        Start        End    Sectors   Size  Type
/dev/sdb1        34       2047       2014  1007K  BIOS boot
/dev/sdb2      2048    1050623    1048576   512M  EFI System
/dev/sdb3   1050624  500118158  499067535   238G  Solaris /usr & Apple ZFS


Disk /dev/sda: 238.47 GiB, 256060514304 bytes, 500118192 sectors
Disk model: Samsung SSD 850
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 8BC59758-D389-4496-8FEF-55F18708B0C8

Device        Start        End    Sectors   Size  Type
/dev/sda1        34       2047       2014  1007K  BIOS boot
/dev/sda2      2048    1050623    1048576   512M  EFI System
/dev/sda3   1050624  500118158  499067535   238G  Solaris /usr & Apple ZFS

root@prox:/var# du -shc /var/*

474K /var/backups
360M /var/cache
du: cannot access '/var/lib/lxcfs/cgroup': Input/output error
227G /var/lib
512 /var/local
512 /var/lock
167M /var/log
512 /var/mail
512 /var/opt
512 /var/run
1.4M /var/spool
26K /var/tmp
228G total

root@prox:/var# du -h --max-depth=1 /var
du: cannot access '/var/lib/lxcfs/cgroup': Input/output error
227G /var/lib
360M /var/cache
1.4M /var/spool
474K /var/backups
167M /var/log
512 /var/mail
512 /var/opt
26K /var/tmp
512 /var/local
228G /var


root@prox:~# zfs list
NAME                                       USED  AVAIL  REFER  MOUNTPOINT
HDD_8Tb                                   6.19T   976G    96K  /HDD_8Tb
HDD_8Tb/vm-155-disk-0                     1.03T  1.84T   144G  -
HDD_8Tb/vm-155-disk-1                     1.03T  1.84T   144G  -
HDD_8Tb/vm-155-disk-2                     1.03T  1.98T  30.2M  -
HDD_8Tb/vm-155-disk-3                     1.03T  1.98T  1.20M  -
HDD_8Tb/vm-155-disk-4                     1.03T  1.98T  30.2M  -
HDD_8Tb/vm-155-disk-5                     1.03T  1.98T  1.20M  -
SSD_2Tb                                    511G  1.30T   104K  /SSD_2Tb
SSD_2Tb/base-110-disk-0                   54.2G  1.35T  2.58G  -
SSD_2Tb/base-120-disk-0                   72.1G  1.36T  10.2G  -
SSD_2Tb/subvol-205-disk-0                 2.41G  22.6G  2.41G  /SSD_2Tb/subvol-205-disk-0
SSD_2Tb/subvol-210-disk-0                  716M  9.30G   716M  /SSD_2Tb/subvol-210-disk-0
SSD_2Tb/vm-100-disk-0                      114G  1.40T  14.3G  -
SSD_2Tb/vm-100-state-DeployedMachineName  3.91G  1.31T   930M  -
SSD_2Tb/vm-105-disk-0                     1.08G  1.30T  10.2G  -
SSD_2Tb/vm-115-disk-0                     41.3G  1.33T  11.8G  -
SSD_2Tb/vm-125-disk-0                      627M  1.30T  3.05G  -
SSD_2Tb/vm-130-disk-0                     30.9G  1.33T  1.57G  -
SSD_2Tb/vm-135-disk-0                     66.0G  1.35T  16.2G  -
SSD_2Tb/vm-135-disk-1                        3M  1.30T   112K  -
SSD_2Tb/vm-135-disk-2                        6M  1.30T    68K  -
SSD_2Tb/vm-140-disk-0                     33.0G  1.33T  2.14G  -
SSD_2Tb/vm-145-disk-0                     1.15G  1.30T  10.2G  -
SSD_2Tb/vm-150-disk-0                     36.1G  1.33T  12.2G  -
SSD_2Tb/vm-155-disk-0                     33.0G  1.33T  1.53G  -
SSD_2Tb/vm-200-disk-0                     20.6G  1.32T  1.19G  -
rpool                                      229G     0B    96K  /rpool
rpool/ROOT                                 229G     0B    96K  /rpool/ROOT
rpool/ROOT/pve-1                           229G     0B   229G  /
rpool/data                                  96K     0B    96K  /rpool/data
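One thing I haven't checked yet is whether snapshots on the root pool are holding onto space - I've read that a plain 'zfs list' doesn't always make that obvious. If I understand the docs, something like this should show it (read-only commands, so they should be safe to run):

# list any snapshots under the root pool and how much space each one holds
zfs list -t snapshot -r rpool
# breakdown of where each dataset's usage goes (data vs. snapshots vs. children)
zfs list -o space -r rpool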


Any help is appreciated - I wasn't trying to make a new post, but I have determined this is a little out of the ordinary, as it's not an inode problem and it's not the Samba bug filling things up. Primarily it's just a space issue needing cleanup, but I've exhausted my resources on how to do this safely without damaging the system, VMs, etc.

Thank you for reading and helping
 
The biggest dir you have is seemingly this:
227G /var/lib

This can happen if you save e.g. backups/ISOs on the storage 'local', which lives in '/var/lib/vz'.
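You could check it roughly like this - the paths are the defaults for the 'local' directory storage, adjust them if you moved it:

# where the 'local' storage keeps its content by default
du -h --max-depth=1 /var/lib/vz | sort -h
# backups land in .../dump, ISOs in .../template/iso, CT templates in .../template/cache
ls -lh /var/lib/vz/dump
ls -lh /var/lib/vz/template/iso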
 

Hi, thanks for the reply - I do store some ISOs on the local drive, but this only started happening after upgrading to a new Proxmox version.

I'll check to see if backups are firing off; I had set up a different server to do those, not the host.

I assume I can just go to /var/lib and rm anything I see, within reason?

Basically what I'm asking is: the items stored here are not system-level files and would be safe to delete, right?

Really appreciate your time here
 
That was just a hunch, I cannot see what really takes up space. There are system files in '/var/lib/', but '/var/lib/vz' is used by PVE for all sorts of things: VM images/disks, templates, ISOs, backups, etc.

Whether you can delete something from there depends on whether you still need that data - I cannot tell you that ;)
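If you would rather go through PVE's storage layer than rm files by hand, something like this should work - the volume ID below is just an example, use whatever 'pvesm list' actually shows you:

# list everything PVE knows about on the 'local' storage (backups, ISOs, templates)
pvesm list local
# remove one specific volume by its ID, e.g. an old vzdump backup
pvesm free local:backup/vzdump-qemu-100-2022_01_01-00_00_00.vma.zst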
 
Hello again - writing in to say I was finally able to navigate and clean this up successfully. It looks like I had local backups running, and that's what filled it up - but like a lot of what everyone said, clearing out a few ISOs after doing some digging and rebooting the server allowed me to regain access, and then I pruned all the local backups out.

I did try the script but was unable to get it to execute properly - either way this was a useful learning experience, so thanks to everyone, and I'm glad I had the courage to make an account on the Proxmox forums and ask for help.
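For anyone finding this later: to keep the local backups from filling the root pool again, I'm also looking at adding retention to the 'local' storage so old backups get pruned automatically. If I've read the docs right, on recent PVE versions that's a 'prune-backups' line in /etc/pve/storage.cfg - the keep values here are just an example, check the storage docs for your version:

dir: local
        path /var/lib/vz
        content iso,vztmpl,backup
        prune-backups keep-last=3,keep-weekly=2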
 
