Host Disk full - no access to GUI - How to safely clean up /var/tmp and other affected folders

Evito

New Member
Jan 29, 2022
Hello Everyone,

I've searched through the forums here for a bit and have determined that I am having a space issue on my Proxmox host drive.
This is not an inodes issue, as far as I've determined.

The host drive is 2x 256 GB SSDs running in a ZFS pool. Ever since I upgraded to a recent Proxmox version to test Win11 and TPM, the host drive has been filling up. I was able to reboot the host last weekend, get back into the GUI, and didn't think much of it - but now this weekend the disk is full, I can't upgrade the packages, and I can't even pull pveversion because the drive is full.

I'd really like to clean the host drive and then set some limits so it never goes beyond 90% capacity or something similar, so that I can at least still access the GUI and do some housekeeping on the drive. I saw I could set cron jobs to do it, but I feel like setting a limit may just be easier.
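For reference, one way a cap like this is often approximated on a ZFS root (a sketch based on general ZFS behaviour, not something confirmed in this thread; the dataset name and the 20G figure are placeholders) is to park a reservation in an empty dataset so the root filesystem can never consume the last chunk of the pool:

# Reserve ~20G of rpool in an empty dataset as emergency headroom (placeholder size)
zfs create -o reservation=20G rpool/reserved
# If / ever fills up again, release the headroom instantly to regain working space:
zfs set reservation=none rpool/reserved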

Here is the output of what I can pull - the system is accessible without any issues over the CLI, but I'm a Linux newb and I'm not sure how to properly rm files safely without destroying the system.

So far I've determined that the logs under /var and /tmp are the largest culprits. As I dug into forum posts, everyone is saying to dig in further and then delete. At this point I think I just need to be pointed in a direction on how to safely cd into them and determine what can be removed. Here are my outputs from the server:

I can see that essentially rpool/ROOT/pve-1 is just full, but I've been unable to figure out how to get into the rpool and delete the junk, so I need some help.
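For anyone following along, here is a minimal, read-only way to drill down from / and find the biggest directories (a sketch; nothing below deletes anything, and -x keeps du on the root filesystem so the other pools' mountpoints don't distort the totals):

# Largest first-level directories on the root filesystem, biggest last
du -xh --max-depth=1 / 2>/dev/null | sort -h | tail -n 15
# Repeat one level deeper on whatever dominates, e.g. /var or /var/lib
du -xh --max-depth=1 /var/lib 2>/dev/null | sort -h | tail -n 15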

Proxmox CLI outputs quoted below:
df -h output:

Filesystem                 Size  Used  Avail  Use%  Mounted on
udev                        48G     0    48G    0%  /dev
tmpfs                      9.5G  910M   8.6G   10%  /run
rpool/ROOT/pve-1           229G  229G      0  100%  /
tmpfs                       48G     0    48G    0%  /dev/shm
tmpfs                      5.0M     0   5.0M    0%  /run/lock
SSD_2Tb                    1.4T  128K   1.4T    1%  /SSD_2Tb
rpool                      128K  128K      0  100%  /rpool
SSD_2Tb/subvol-210-disk-0   10G  717M   9.4G    7%  /SSD_2Tb/subvol-210-disk-0
SSD_2Tb/subvol-205-disk-0   25G  2.5G    23G   10%  /SSD_2Tb/subvol-205-disk-0
rpool/data                 128K  128K      0  100%  /rpool/data
rpool/ROOT                 128K  128K      0  100%  /rpool/ROOT
HDD_8Tb                    976G  128K   976G    1%  /HDD_8Tb
tmpfs                      9.5G     0   9.5G    0%  /run/user/0


fdisk -l output (I removed the other drives, as the concern is the 2x 256 GB SSDs storing the host OS, which is full):

Disk /dev/sdb: 238.47 GiB, 256060514304 bytes, 500118192 sectors
Disk model: Samsung SSD 850
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 7EC2FABE-DF93-4080-8F08-48FE5DB368BC

Device       Start        End    Sectors   Size  Type
/dev/sdb1       34       2047       2014  1007K  BIOS boot
/dev/sdb2     2048    1050623    1048576   512M  EFI System
/dev/sdb3  1050624  500118158  499067535   238G  Solaris /usr & Apple ZFS


Disk /dev/sda: 238.47 GiB, 256060514304 bytes, 500118192 sectors
Disk model: Samsung SSD 850
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 8BC59758-D389-4496-8FEF-55F18708B0C8

Device       Start        End    Sectors   Size  Type
/dev/sda1       34       2047       2014  1007K  BIOS boot
/dev/sda2     2048    1050623    1048576   512M  EFI System
/dev/sda3  1050624  500118158  499067535   238G  Solaris /usr & Apple ZFS

root@prox:/var# du -shc /var/*

474K /var/backups
360M /var/cache
du: cannot access '/var/lib/lxcfs/cgroup': Input/output error
227G /var/lib
512 /var/local
512 /var/lock
167M /var/log
512 /var/mail
512 /var/opt
512 /var/run
1.4M /var/spool
26K /var/tmp
228G total

root@prox:/var# du -h --max-depth=1 /var
du: cannot access '/var/lib/lxcfs/cgroup': Input/output error
227G /var/lib
360M /var/cache
1.4M /var/spool
474K /var/backups
167M /var/log
512 /var/mail
512 /var/opt
26K /var/tmp
512 /var/local
228G /var


root@prox:~# zfs list
NAME                                        USED  AVAIL  REFER  MOUNTPOINT
HDD_8Tb                                    6.19T   976G    96K  /HDD_8Tb
HDD_8Tb/vm-155-disk-0                      1.03T  1.84T   144G  -
HDD_8Tb/vm-155-disk-1                      1.03T  1.84T   144G  -
HDD_8Tb/vm-155-disk-2                      1.03T  1.98T  30.2M  -
HDD_8Tb/vm-155-disk-3                      1.03T  1.98T  1.20M  -
HDD_8Tb/vm-155-disk-4                      1.03T  1.98T  30.2M  -
HDD_8Tb/vm-155-disk-5                      1.03T  1.98T  1.20M  -
SSD_2Tb                                     511G  1.30T   104K  /SSD_2Tb
SSD_2Tb/base-110-disk-0                    54.2G  1.35T  2.58G  -
SSD_2Tb/base-120-disk-0                    72.1G  1.36T  10.2G  -
SSD_2Tb/subvol-205-disk-0                  2.41G  22.6G  2.41G  /SSD_2Tb/subvol-205-disk-0
SSD_2Tb/subvol-210-disk-0                   716M  9.30G   716M  /SSD_2Tb/subvol-210-disk-0
SSD_2Tb/vm-100-disk-0                       114G  1.40T  14.3G  -
SSD_2Tb/vm-100-state-DeployedMachineName   3.91G  1.31T   930M  -
SSD_2Tb/vm-105-disk-0                      1.08G  1.30T  10.2G  -
SSD_2Tb/vm-115-disk-0                      41.3G  1.33T  11.8G  -
SSD_2Tb/vm-125-disk-0                       627M  1.30T  3.05G  -
SSD_2Tb/vm-130-disk-0                      30.9G  1.33T  1.57G  -
SSD_2Tb/vm-135-disk-0                      66.0G  1.35T  16.2G  -
SSD_2Tb/vm-135-disk-1                         3M  1.30T   112K  -
SSD_2Tb/vm-135-disk-2                         6M  1.30T    68K  -
SSD_2Tb/vm-140-disk-0                      33.0G  1.33T  2.14G  -
SSD_2Tb/vm-145-disk-0                      1.15G  1.30T  10.2G  -
SSD_2Tb/vm-150-disk-0                      36.1G  1.33T  12.2G  -
SSD_2Tb/vm-155-disk-0                      33.0G  1.33T  1.53G  -
SSD_2Tb/vm-200-disk-0                      20.6G  1.32T  1.19G  -
rpool                                       229G     0B    96K  /rpool
rpool/ROOT                                  229G     0B    96K  /rpool/ROOT
rpool/ROOT/pve-1                            229G     0B   229G  /
rpool/data                                   96K     0B    96K  /rpool/data


Any help is appreciated - I wasn't trying to make a new post, but I've determined this is a little out of the ordinary: it's not an inode problem, and it's not the Samba bug filling the disk. Primarily it's just a space issue needing cleanup, but I've exhausted my resources on how to do this safely without damaging the system, the VMs, etc.

Thank you for reading and helping
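One more read-only check worth running in this situation (a general suggestion, not part of this thread's eventual fix): ZFS snapshots on the root dataset can pin space that df does not attribute to any directory.

# List any snapshots under rpool together with the space each one holds
zfs list -t snapshot -o name,used -r rpool
# A snapshot that is definitely no longer needed could be removed with
# zfs destroy rpool/ROOT/pve-1@<snapshot-name>   (the name is a placeholder)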
 
The biggest directory you have is seemingly this one:
227G /var/lib

This can happen if you save e.g. backups/ISOs on the 'local' storage, which lives in '/var/lib/vz'.
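A sketch of how that hunch could be confirmed (the paths below are the standard layout of the 'local' storage; nothing here deletes anything):

# Break /var/lib/vz down into its usual content directories
du -sh /var/lib/vz/template/iso /var/lib/vz/template/cache /var/lib/vz/dump 2>/dev/null
# List the individual vzdump backup archives with their sizes and dates
ls -lh /var/lib/vz/dump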
 

Hi, thanks for the reply - I do store some ISOs on the local drive, but this only started happening after upgrading to a new Proxmox version.

I'll check to see if backups are firing off; I had set up a different server to do those, not the host.

I assume I can just go to /var/lib and rm anything I see, within reason?

Basically, what I'm asking is: are the items stored there not system-level files, and would it be safe to delete them?

Really appreciate your time here
 
That was just a hunch; I cannot see what really takes up the space. There are system files in '/var/lib/', but '/var/lib/vz' is used by PVE for all kinds of things: VM images/disks, templates, ISOs, backups, etc.

Whether you can delete something from there depends on whether you need that data - I cannot tell you that ;)
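One way to see what Proxmox itself tracks on that storage before removing anything by hand (a sketch; 'local' matches the default storage name, and the listed volume names will differ per setup):

# Show which content types each storage is configured for
cat /etc/pve/storage.cfg
# List the volumes (backups, ISOs, templates) registered on 'local'
pvesm list local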
 
Hello again - writing in to say I was finally able to navigate and clean this up successfully. It looks like I had local backups running, and that's what filled it up. As a lot of people here said, clearing out a few ISOs after some digging and rebooting the server let me regain access, and then I pruned all the local backups out.

I did try the script but was unable to get it to execute properly. Either way, this was a useful learning experience, so thanks to all - and it was worth working up the courage to make an account on the Proxmox forums and ask for help.
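As a follow-up for anyone in the same spot: once the old backups are pruned, a retention limit on the 'local' storage can keep this from recurring. A sketch, assuming a reasonably recent Proxmox VE where the prune-backups storage option is available (the keep-last value is a placeholder):

# Keep only the last 2 backups per guest on the 'local' storage
pvesm set local --prune-backups keep-last=2

The same retention can also be set in the GUI under Datacenter -> Storage -> local.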
 
