[SOLVED] What is suddenly taking up so much space on local?

jsalas424

Five days ago, my local storage started filling up, even though I haven't added any new disks to it. What gives?
 
Do you have any directory storages configured on a manual mountpoint?

If you install the package ncdu and run it on / with ncdu /, it can show you where all that space is used.
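For example (assuming a standard Debian-based PVE install with apt available):

Code:
apt install ncdu
ncdu /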
 
I woke up this morning to a locked-out server. My local storage had gone up to 100% usage, which resulted in I/O errors. I was able to recover by rebooting the server, which freed up a couple of MB and allowed me to move disks off of it.

Here is the output of ncdu /:

Code:
  446.8 GiB [##########] /mnt
   75.4 GiB [#         ] /Nextcloud.Storage
   18.5 GiB [          ] /var
    1.3 GiB [          ] /usr
  165.0 MiB [          ] /boot
   45.7 MiB [          ] /dev
    9.5 MiB [          ] /run
    4.9 MiB [          ] /root
    3.8 MiB [          ] /etc
   44.5 KiB [          ] /home
   28.0 KiB [          ] /tmp
   12.0 KiB [          ] /Storage.1
    2.0 KiB [          ] /rpool

I'm not sure, but I have a feeling it's related to this other problem I posted about: https://forum.proxmox.com/threads/d...b-zfs-disk-only-uses-200gb.78969/#post-349639

This is ncdu / after I panicked and moved things off of the local drive.
 
So about half a TB is located in /mnt.

Can you show your /etc/pve/storage.conf file? My suspicion is that there are directory storages defined on mounts which are not automounted by PVE itself.

Thus it can happen that the mount happens too late and PVE is already using that directory storage. There are two options that can be set for such a storage (via the CLI or directly in the config file):
Code:
is_mountpoint 1
mkdir 0

They tell PVE that it should only start using that directory storage once something is mounted at the defined path, and that it is not responsible for creating the path should it not exist.
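Via the CLI it would look something along these lines (exact pvesm flags from memory, and <storage-id> is a placeholder for the affected directory storage):

Code:
# treat the path as a mountpoint and do not let PVE create the directory itself
pvesm set <storage-id> --is_mountpoint yes --mkdir 0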
 
Code:
root@TracheServ:~# cat /etc/pve/storage.conf
cat: /etc/pve/storage.conf: No such file or directory
root@TracheServ:~#

I have only defined storage via the GUI thus far.
 
Is the pve-cluster service running?
systemctl status pve-cluster
 
Code:
root@TracheServ:~# cat /etc/pve/storage.conf
cat: /etc/pve/storage.conf: No such file or directory
root@TracheServ:~# systemctl status pve-cluster
● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2020-11-13 07:33:46 EST; 34min ago
  Process: 7827 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
 Main PID: 8283 (pmxcfs)
    Tasks: 8 (limit: 7372)
   Memory: 63.6M
   CGroup: /system.slice/pve-cluster.service
           └─8283 /usr/bin/pmxcfs

Nov 13 07:33:45 TracheServ systemd[1]: Starting The Proxmox VE cluster filesystem...
Nov 13 07:33:46 TracheServ systemd[1]: Started The Proxmox VE cluster filesystem.
root@TracheServ:~#
 
Ah my bad, /etc/pve/storage.cfg
 
Code:
root@TracheServ:~# cat /etc/pve/storage.cfg
zfspool: local-zfs
        pool rpool/data
        content rootdir,images
        sparse 1

dir: local
        path /var/lib/vz
        content rootdir,images,vztmpl,backup,snippets,iso
        maxfiles 1
        shared 0

zfspool: Storage.1
        pool Storage.1
        content images,rootdir
        mountpoint /Storage.1
        sparse 1

zfspool: Nextcloud.Storage
        pool Nextcloud.Storage
        content images,rootdir
        mountpoint /Nextcloud.Storage
        sparse 1

dir: spare
        path /mnt/pve/spare
        content backup,rootdir,vztmpl,snippets,iso,images
        is_mountpoint 1
        maxfiles 3
        shared 1

nfs: Proxmox_backups
        export /data/backups/proxmox
        path /mnt/pve/Proxmox_backups
        server 192.168.1.139
        content rootdir,backup,images,vztmpl,iso,snippets
        maxfiles 5
        options vers=4.2

dir: User.data
        path /Nextcloud.Storage
        content images,vztmpl,snippets,iso,rootdir,backup
        maxfiles 10
        shared 1


root@TracheServ:~#
 
Hmm, interesting: the likely suspect "spare" already has is_mountpoint defined. If you run ncdu again, you can enter a directory and dive deeper to see the actual culprit. If you enter /mnt, which directory in there is using up all that space?
 
Code:
ncdu 1.13 ~ Use the arrow keys to navigate, press ? for help                                                         
--- /mnt ------------------------------------------------------------------------------------------------------------
  437.9 GiB [##########] /pve                                                                                       
e 512.0   B [          ] /iso
e 512.0   B [          ] /hostrun

--- /mnt/pve --------------------------------------------------------------------------------------------------------
                         /..                                                                                         
  275.3 GiB [##########] /spare
  162.7 GiB [#####     ] /Proxmox_backups
 
Spare is a single mounted drive separate from the local drive that just has backups of user data.
 
Thanks, can you also please add the output of df -h?
 
I'm more and more confident that it has to do with this error:

https://forum.proxmox.com/threads/d...b-zfs-disk-only-uses-200gb.78969/#post-349639

After waking up to a 100% full root directory, I panicked. The only new thing I have done recently was migrating disks to qcow2 and putting them on User.data, which is mounted at /Nextcloud.Storage. I've since moved them all from /Nextcloud.Storage/User.data to just /Nextcloud.Storage as raw disks.
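For reference, a move like that roughly looks like this on the CLI (a sketch only; the VM ID and disk name are placeholders, User.data is the directory storage from this thread):

Code:
# move a VM disk onto the User.data directory storage, converting it to qcow2
qm move_disk <vmid> scsi0 User.data --format qcow2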

Code:
root@TracheServ:~# df -h
Filesystem                           Size  Used Avail Use% Mounted on
udev                                  24G     0   24G   0% /dev
tmpfs                                4.8G  9.5M  4.7G   1% /run
rpool/ROOT/pve-1                     186G   18G  168G  10% /
tmpfs                                 24G   46M   24G   1% /dev/shm
tmpfs                                5.0M     0  5.0M   0% /run/lock
tmpfs                                 24G     0   24G   0% /sys/fs/cgroup
/dev/sdc1                            458G  276G  159G  64% /mnt/pve/spare
rpool                                168G  128K  168G   1% /rpool
rpool/ROOT                           168G  128K  168G   1% /rpool/ROOT
rpool/data                           168G  128K  168G   1% /rpool/data
/dev/fuse                             30M   24K   30M   1% /etc/pve
Storage.1                            1.8G  128K  1.8G   1% /Storage.1
192.168.1.139:/data/backups/proxmox  870G  493G  378G  57% /mnt/pve/Proxmox_backups
tmpfs                                4.8G     0  4.8G   0% /run/user/0
root@TracheServ:~#
 
Although I did recently implement the user data backups to the spare drive, so there's definitely a correlation there as well. This disk fill-up happened overnight, when a backup would have been running.
 
Okay, /mnt/pve/Proxmox_backups is mounted from 192.168.1.139:/data/backups/proxmox and /mnt/pve/spare is /dev/sdc1. So that should be okay and not on the root disk.

I cannot see /Nextcloud.Storage being mounted anywhere. Since the storage config for it is missing the is_mountpoint 1 option, I suspect that at some point the 2.5TB drive wasn't mounted (in time), and now the data is stored on the root disk of the node, filling it up.

You should add the is_mountpoint option to the User.data storage configuration.
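In the storage.cfg posted above, that would look something like this (the same User.data entry with the option added):

Code:
dir: User.data
        path /Nextcloud.Storage
        content images,vztmpl,snippets,iso,rootdir,backup
        maxfiles 10
        shared 1
        is_mountpoint 1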

Then you have to figure out how to fix the issue. A first step would be to stop all services that want to access it and temporarily disable the storage. Then move the directory to another path so you don't lose data.

Recreate the directory, mount it and then you can enable the storage again.

After that you should have the actual 2.5TB disk mounted to /Nextcloud.Storage and can figure out how you merge the data.
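A rough sketch of those steps on the CLI (assuming the pvesm flags are right; the storage name and paths are the ones from this thread):

Code:
# temporarily disable the storage in PVE (after stopping the services/guests that use it)
pvesm set User.data --disable 1
# move whatever ended up on the root disk out of the way so nothing is lost
mv /Nextcloud.Storage /Nextcloud.Storage.old
# recreate the (empty) mountpoint, mount the real disk there, then re-enable
mkdir /Nextcloud.Storage
pvesm set User.data --is_mountpoint yes --disable 0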
 
Nextcloud.Storage is a ZFS mirrored pool. I definitely have access to the storage as I'm storing plenty of stuff on it. Pics attached.


So where do I go from here?
 
Oh man, I kinda glanced over the ZFS part and went in the wrong direction. Going over the thread with that in mind, I am a bit confused, to be honest.

Can you show the output of zfs list and zpool status?
 
Code:
root@TracheServ:~# zfs list
NAME                                USED  AVAIL     REFER  MOUNTPOINT
Nextcloud.Storage                  1.82T   834G      105G  /Nextcloud.Storage
Nextcloud.Storage/vm-201-disk-0    7.59G   834G     7.59G  -
Nextcloud.Storage/vm-400-disk-0    14.2G   834G     14.2G  -
Nextcloud.Storage/vm-400-disk-1     342M   834G      342M  -
Nextcloud.Storage/vm-42069-disk-0  3.14M   834G     3.14M  -
Nextcloud.Storage/vm-42069-disk-1  11.5G   834G     11.5G  -
Nextcloud.Storage/vm-600-disk-0    77.6G   834G     77.6G  -
Nextcloud.Storage/vm-600-disk-1     101M   834G      101M  -
Nextcloud.Storage/vm-700-disk-0    1.51T  1.99T      345G  -
Nextcloud.Storage/vm-800-disk-0    23.3G   834G     23.3G  -
Nextcloud.Storage/vm-900-disk-0    76.9G   834G     76.9G  -
Storage.1                           897G  1.76G      104K  /Storage.1
Storage.1/vm-700-disk-0             897G  30.9G      868G  -
rpool                              57.5G   168G      104K  /rpool
rpool/ROOT                         17.6G   168G       96K  /rpool/ROOT
rpool/ROOT/pve-1                   17.6G   168G     17.6G  /
rpool/data                         39.9G   168G       96K  /rpool/data
rpool/data/vm-300-disk-0           1.78G   168G     1.78G  -
rpool/data/vm-400-disk-0           7.37G   168G     7.37G  -
rpool/data/vm-500-disk-0           4.62G   168G     4.62G  -
rpool/data/vm-600-disk-0           6.25G   168G     6.25G  -
rpool/data/vm-600-disk-1           6.28G   168G     6.28G  -
rpool/data/vm-700-disk-0           13.6G   168G     13.6G  -
root@TracheServ:~# zpool status
  pool: Nextcloud.Storage
 state: ONLINE
  scan: scrub repaired 0B in 0 days 01:07:51 with 0 errors on Sun Nov  8 01:31:53 2020
config:

        NAME                        STATE     READ WRITE CKSUM
        Nextcloud.Storage           ONLINE       0     0     0
          mirror-0                  ONLINE       0     0     0
            wwn-0x5000c50064941e16  ONLINE       0     0     0
            wwn-0x5000c5006497492d  ONLINE       0     0     0

errors: No known data errors

  pool: Storage.1
 state: ONLINE
  scan: scrub repaired 0B in 0 days 02:42:19 with 0 errors on Sun Nov  8 03:06:22 2020
config:

        NAME                                  STATE     READ WRITE CKSUM
        Storage.1                             ONLINE       0     0     0
          mirror-0                            ONLINE       0     0     0
            wwn-0x500003956b800304            ONLINE       0     0     0
            ata-TOSHIBA_MG03ACA100_44N2KLTFF  ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:06:26 with 0 errors on Sun Nov  8 00:30:31 2020
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          sdb3      ONLINE       0     0     0
          sda2      ONLINE       0     0     0

errors: No known data errors
root@TracheServ:~#
 
Oh, and I haven't taken the time to thank you yet, Aaron! I really appreciate the support.

I'm leaving on vacation today, so this was a terrible way to start the day. Moving everything off of local and off of the User.data mount point seems to have me back up and running again, albeit without yet knowing the cause of the crash or how to prevent it in the future.
 
