[SOLVED] "Error: No space left on device"

Discussion in 'Proxmox VE: Installation and configuration' started by mattlach, Jun 11, 2016.

  1. mattlach

    mattlach Member

    Joined:
    Mar 23, 2016
    Messages:
    154
    Likes Received:
    12
    Hey all,

I am having an odd problem that I'm hoping someone can help me solve.

    I keep randomly getting "Error: No space left on device" error messages in console.

    Examples:
    Code:
    # ifup vmbr3
    
    Waiting for vmbr3 to get ready (MAXWAIT is 2 seconds).
    Error: No space left on device
    Error: No space left on device
This happens with other seemingly random commands as well, like when restarting the NFS server.

    Thing is, I can't seem to find ANY device that is full.

    Code:
    # df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    udev                        10M     0   10M   0% /dev
    tmpfs                       38G   18M   38G   1% /run
    rpool/ROOT/pve-1           435G  7.6G  427G   2% /
    tmpfs                       95G   43M   95G   1% /dev/shm
    tmpfs                      5.0M  4.0K  5.0M   1% /run/lock
    tmpfs                       95G     0   95G   0% /sys/fs/cgroup
    rpool                      427G  128K  427G   1% /rpool
    rpool/ROOT                 427G     0  427G   0% /rpool/ROOT
    rpool/subvol-101-disk-1    8.0G  601M  7.5G   8% /rpool/subvol-101-disk-1
    rpool/subvol-102-disk-1    8.0G  439M  7.6G   6% /rpool/subvol-102-disk-1
    rpool/subvol-110-disk-1     16G  2.2G   14G  14% /rpool/subvol-110-disk-1
    rpool/subvol-111-disk-1    8.0G  368M  7.7G   5% /rpool/subvol-111-disk-1
    rpool/subvol-120-disk-1     16G  422M   16G   3% /rpool/subvol-120-disk-1
    rpool/subvol-125-disk-1    8.0G  355M  7.7G   5% /rpool/subvol-125-disk-1
    rpool/subvol-130-disk-1     16G  4.5G   12G  28% /rpool/subvol-130-disk-1
    rpool/subvol-140-disk-1     16G  1.4G   15G   9% /rpool/subvol-140-disk-1
    zfshome                     18T  2.2T   16T  13% /zfshome
    zfshome/media               24T  7.7T   16T  33% /zfshome/media
    zfshome/mythtv_recordings   19T  2.8T   16T  15% /zfshome/mythtv_recordings
    /dev/sds1                  917G  769G  149G  84% /mnt/mythbuntu/scheduled
    /dev/sdt1                  118G   60M  118G   1% /mnt/mythbuntu/live1
    tmpfs                      100K     0  100K   0% /run/lxcfs/controllers
    cgmfs                      100K     0  100K   0% /run/cgmanager/fs
    /dev/fuse                   30M   16K   30M   1% /etc/pve
    rpool/subvol-150-disk-1     32G  264M   32G   1% /rpool/subvol-150-disk-1
    The host seems to be working just fine, other than this strange error message.

    Does anyone know what might be going on?

    Thanks,
    Matt
     
  2. ned

    ned Member

    Joined:
    Jan 26, 2015
    Messages:
    92
    Likes Received:
    1
Paste the output of:
    Code:
     df -hi
     
  3. mattlach

    mattlach Member

    Joined:
    Mar 23, 2016
    Messages:
    154
    Likes Received:
    12
    Thanks for your help.

    Inodes look good to me too:

    Code:
    ~# df -hi
    Filesystem                Inodes IUsed IFree IUse% Mounted on
    udev                         24M   884   24M    1% /dev
    tmpfs                        24M  1.4K   24M    1% /run
    rpool/ROOT/pve-1            854M   84K  854M    1% /
    tmpfs                        24M    67   24M    1% /dev/shm
    tmpfs                        24M    24   24M    1% /run/lock
    tmpfs                        24M    18   24M    1% /sys/fs/cgroup
    rpool                       854M    16  854M    1% /rpool
    rpool/ROOT                  854M     7  854M    1% /rpool/ROOT
    rpool/subvol-101-disk-1      15M   34K   15M    1% /rpool/subvol-101-disk-1
    rpool/subvol-102-disk-1      16M   25K   16M    1% /rpool/subvol-102-disk-1
    rpool/subvol-110-disk-1      28M   95K   28M    1% /rpool/subvol-110-disk-1
    rpool/subvol-111-disk-1      16M   22K   16M    1% /rpool/subvol-111-disk-1
    rpool/subvol-120-disk-1      32M   25K   32M    1% /rpool/subvol-120-disk-1
    rpool/subvol-125-disk-1      16M   22K   16M    1% /rpool/subvol-125-disk-1
    rpool/subvol-130-disk-1      23M   38K   23M    1% /rpool/subvol-130-disk-1
    rpool/subvol-140-disk-1      30M   24K   30M    1% /rpool/subvol-140-disk-1
    zfshome                      32G  1.3M   32G    1% /zfshome
    zfshome/media                32G   39K   32G    1% /zfshome/media
    zfshome/mythtv_recordings    32G  2.8K   32G    1% /zfshome/mythtv_recordings
    /dev/sds1                    59M   656   59M    1% /mnt/mythbuntu/scheduled
    /dev/sdt1                   7.5M    10  7.5M    1% /mnt/mythbuntu/live1
    tmpfs                        24M    12   24M    1% /run/lxcfs/controllers
    cgmfs                        24M    14   24M    1% /run/cgmanager/fs
    /dev/fuse                   9.8K    35  9.8K    1% /etc/pve
    rpool/subvol-150-disk-1      63M   30K   63M    1% /rpool/subvol-150-disk-1
Any idea what else could be causing this puzzling out-of-space message?
     
  4. mattlach

    mattlach Member

    Joined:
    Mar 23, 2016
    Messages:
    154
    Likes Received:
    12
So, googling around, I found someone else with a problem identical to mine, on a Debian web server.

Both df -h and df -ih showed plenty of free space, but he, like me, was still getting "no space left on device" error messages.

He seems to have solved his issue by changing fs.inotify.max_user_watches. I have no idea what this is or what it does. Can anyone explain it to me? I don't want to touch it before I understand it.

    Does anyone have any thoughts?

Another thought I had was that it might be drive corruption, but I am running off a ZFS mirror, and a scrub shows no issues...
     
  5. mattlach

    mattlach Member

    Joined:
    Mar 23, 2016
    Messages:
    154
    Likes Received:
    12
    Anyone mind if I bump this?

    Could really use some assistance.
     
  6. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,390
    Likes Received:
    523
    you can check whether there are still inotify watches available by trying to "watch" a file ;)

    if the "ENOSPACE" error is reproducible and caused by hitting the inotify limit, you should get an error message when trying "tail -f /var/log/messages": "tail: cannot watch '/var/log/messages': No space left on device". if it is just spurious, you will need to try that test when the issue occurs (or just proactively try with an increased inotify limit).

if you run a lot of (containers with) processes that utilize inotify, you might need to increase the watch limit. the only downside is that inotify watches need some unswappable kernel memory, but the amount is pretty much irrelevant on any modern server (~1 KB for each used watch). you can definitely bump the default 8k limit to at least 64k without any worries, unless your system is really, really memory constrained (in which case you should probably not be using ZFS like you are ;))
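to put rough numbers on it (a worst-case estimate, assuming ~1 KB per watch and every watch actually in use):
Code:
#    8192 watches (default)  ->   ~8 MB
#   65536 watches            ->  ~64 MB
# 1048576 watches            ->   ~1 GB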

unfortunately, a lot of programs that use inotify don't propagate the root cause when it fails, and instead just report "ENOSPC"..
     
  7. mattlach

    mattlach Member

    Joined:
    Mar 23, 2016
    Messages:
    154
    Likes Received:
    12
    Ahh, I think that is the cause.

    I have Crashplan running in a LXC container, and it seems to want to use A LOT of watches.

I'm probably going to boost it up to 1048576 and see if that helps. RAM is really not an issue for me; I have 192GB in my server. I upgraded it from 96GB before I switched from ESXi. Now, with Proxmox and containers, my system is so much more RAM-efficient that I have more RAM than I know what to do with!

    Am I correct in assuming that I only have to do this for the host, and all the LXC containers are automatically included?

Is the appropriate way to do this to edit /etc/sysctl.conf, or is there a built-in way to deal with this in the management interface?

    Since Proxmox is geared towards these types of activities, maybe it makes sense for it to have a higher max_user_watches than the Debian default?

    Thanks,
    Matt
     
  8. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,390
    Likes Received:
    523
yes, just like all other sysctl values, you can set it temporarily by echoing a value to /proc/sys/... and persist it by adding a line to /etc/sysctl.conf. setting it on the host also affects the containers, yes (shared kernel). I am not sure whether you need a (container) reboot or not, but that should be easy enough to find out empirically ;) increasing the default seems like a good idea, I'll file a tracking bug for it.
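for example (a sketch with an illustrative value; sysctl -w is equivalent to echoing into /proc/sys):
Code:
# runtime only, lost on reboot:
sysctl -w fs.inotify.max_user_watches=65536

# persistent: add this line to /etc/sysctl.conf, then run "sysctl -p":
fs.inotify.max_user_watches=65536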
     
  9. mattlach

    mattlach Member

    Joined:
    Mar 23, 2016
    Messages:
    154
    Likes Received:
    12
    Thanks for the help.

If anyone else is trying to solve the same problem, here is how I wound up doing it (from the console on the host):

    Check the current limit:
    Code:
    cat /proc/sys/fs/inotify/max_user_watches
For me, this was 8192.

Temporarily test whether increasing the value fixes things (this will reset to the default of 8192 on the next reboot):
    Code:
    echo 1048576 > /proc/sys/fs/inotify/max_user_watches
Keep in mind that at ~1 KB each, this will increase the maximum RAM consumed by watches from ~8 MB to ~1 GB. You may not need it to be this large; try something smaller to conserve RAM.

If the above makes the problem go away, we can now make it persist across reboots:

    Edit /etc/sysctl.conf
    Code:
    nano /etc/sysctl.conf
Add the following line (or edit it if it is already present):
    Code:
    fs.inotify.max_user_watches=1048576
Edit the 1048576 value to suit your needs, as discussed above.

    Reboot, or do the following:
    Code:
    sysctl -p /etc/sysctl.conf
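Either way, you can verify that the new limit is active:
Code:
sysctl fs.inotify.max_user_watches
# fs.inotify.max_user_watches = 1048576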
     
  10. Llewellyn Pienaar

    Llewellyn Pienaar New Member

    Joined:
    May 9, 2019
    Messages:
    23
    Likes Received:
    1
I am unable to make a new post, so I am bumping this thread.

I have the same issue, but the steps above did not resolve it for me.

After first trying 1048576, I increased the limit to 2048576.

    I get the error

    No space left on device at /usr/share/perl5/PVE/RESTEnvironment.pm

Code:
pveversion -v
    proxmox-ve: 5.2-2 (running kernel: 4.15.18-8-pve)
    pve-manager: 5.2-10 (running version: 5.2-10/6f892b40)
    pve-kernel-4.15: 5.2-11
    pve-kernel-4.15.18-8-pve: 4.15.18-28
    pve-kernel-4.15.18-7-pve: 4.15.18-27
    pve-kernel-4.15.17-2-pve: 4.15.17-10
    corosync: 2.4.2-pve5
    criu: 2.11.1-1~bpo90
    glusterfs-client: 3.8.8-1
    ksm-control-daemon: not correctly installed
    libjs-extjs: 6.0.1-2
    libpve-access-control: 5.0-8
    libpve-apiclient-perl: 2.0-5
    libpve-common-perl: 5.0-41
    libpve-guest-common-perl: 2.0-18
    libpve-http-server-perl: 2.0-11
    libpve-storage-perl: 5.0-30
    libqb0: 1.0.1-1
    lvm2: 2.02.168-pve6
    lxc-pve: 3.0.2+pve1-3
    lxcfs: 3.0.2-2
    novnc-pve: 1.0.0-2
    proxmox-widget-toolkit: 1.0-20
    pve-cluster: 5.0-30
    pve-container: 2.0-29
    pve-docs: 5.2-9
    pve-firewall: 3.0-14
    pve-firmware: 2.0-6
    pve-ha-manager: 2.0-5
    pve-i18n: 1.0-6
    pve-libspice-server1: 0.14.1-2
    pve-qemu-kvm: 2.12.1-1
    pve-xtermjs: 1.0-5
    qemu-server: 5.0-38
    smartmontools: 6.5+svn4324-1
    spiceterm: 3.0-5
    vncterm: 1.5-3

Code:
df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            7.8G     0  7.8G   0% /dev
tmpfs           1.6G   19M  1.6G   2% /run
/dev/md2        3.7T  1.7T  1.8T  49% /
tmpfs           7.9G   43M  7.8G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           7.9G     0  7.9G   0% /sys/fs/cgroup
/dev/md1        488M  173M  289M  38% /boot
/dev/sdc1       954G  538G  417G  57% /var/lib/lxc-ssd
/dev/fuse        30M   24K   30M   1% /etc/pve
tmpfs           1.6G     0  1.6G   0% /run/user/0



    Any idea what could be the issue?

The system seems stable until I make any change to a VM; then the host reports the error. The change can be as simple as adding a user to the VM, for example.
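For reference, one way to see which processes are holding the watches when it happens (a rough sketch, assuming the kernel exposes inotify entries in /proc/*/fdinfo):
Code:
# sum the "inotify wd:..." lines per process and list the top consumers
for p in /proc/[0-9]*; do
    n=$(cat "$p"/fdinfo/* 2>/dev/null | grep -c '^inotify')
    [ "$n" -gt 0 ] && echo "$n $(cat "$p"/comm 2>/dev/null) (pid ${p#/proc/})"
done | sort -rn | head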
     
  11. Llewellyn Pienaar

    Llewellyn Pienaar New Member

    Joined:
    May 9, 2019
    Messages:
    23
    Likes Received:
    1
After a reboot, I was able to run commands from the CLI on the host again.

I ran df -hi:
Code:
Filesystem     Inodes IUsed IFree IUse% Mounted on
udev             2.0M   513  2.0M    1% /dev
tmpfs            2.0M  2.5K  2.0M    1% /run
/dev/md2         233M  193K  233M    1% /
tmpfs            2.0M   103  2.0M    1% /dev/shm
tmpfs            2.0M    15  2.0M    1% /run/lock
tmpfs            2.0M    17  2.0M    1% /sys/fs/cgroup
/dev/md1         128K   368  128K    1% /boot
/dev/sdc1        477M   100  477M    1% /var/lib/lxc-ssd
/dev/fuse        9.8K    55  9.8K    1% /etc/pve
tmpfs            2.0M    10  2.0M    1% /run/user/0
     