No space left on device (500)

SilverApe

Hi,
I am using Proxmox on a laptop to run Home Assistant. It was running smoothly, but now I get an error that there is no space left on the device. In the summary I see that the HD space (root) is at 99.97%.
I have run df -h and the outcome shows: /dev/mapper/pve-root 59G 59G 0 100% /
My total disk space is 250GB. I have only one VM with HA in it.

How do I clean this up? And how do I keep it clean in the future?
I got this far by searching the forums, but I am a total newbie on Proxmox and Linux, so please keep it simple :)

Thanks in advance
 
Hi,
When the Proxmox installer partitions your drives (using lvm-thin [1] by default), it allocates part of your storage to the root partition (in your case 60GB) and creates a separate pool for VM and CT disk storage. This is why your root partition is capped at 59GB. If you look at the output of lsblk, you will see how the rest of your storage has been allocated.

That being said, it's strange that your root partition has filled up if, as you say, you haven't done much on it. You can run du -sh /* /root/ /home/* to see the storage used by the typical "base" directories.

[1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#storage_lvmthin
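If the /proc and /home errors make that output noisy, a variant like this (same idea, just more convenient) hides them and sorts by size:

Code:
# Same check, with error noise sent to /dev/null and results sorted by size:
du -sh /* 2>/dev/null | sort -h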
 
Hi dylanw,
This is what I get:

Code:
0    /bin
72M    /boot
37M    /dev
4.8M    /etc
4.0K    /home
0    /lib
0    /lib32
0    /lib64
0    /libx32
16K    /lost+found
4.0K    /media
25G    /mnt
4.0K    /opt
du: cannot access '/proc/754/task/754/fd/24': No such file or directory
du: cannot access '/proc/754/task/754/fd/25': No such file or directory
du: cannot access '/proc/17111/task/17111/fd/4': No such file or directory
du: cannot access '/proc/17111/task/17111/fdinfo/4': No such file or directory
du: cannot access '/proc/17111/fd/3': No such file or directory
du: cannot access '/proc/17111/fdinfo/3': No such file or directory
0    /proc
60K    /root
78M    /run
0    /sbin
4.0K    /srv
0    /sys
32K    /tmp
1.8G    /usr
57G    /var
du: cannot access '/home/*': No such file or directory
 
Okay, so your /var directory seems to be taking up quite a bit more space than it should. You could try du -h /var | grep -P '^\d+\.?\d*G' to list any directories that take up more than a GB. Maybe there is one major culprit in there ;)
And could you post the output of lsblk and lvs?
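As an aside, if the grep feels fiddly, GNU du (which Debian, and therefore Proxmox, ships) can filter by size itself:

Code:
# Only show entries of 1GB or more, largest last:
du -h --threshold=1G /var | sort -h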
 
I did the
Code:
du -h /var | grep -P '^\d+\.?\d*G'
and this was the result:
Code:
57G    /var/log
57G    /var

The output of
Code:
lsblk
is:
Code:
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                            8:0    0 238.5G  0 disk
├─sda1                         8:1    0  1007K  0 part
├─sda2                         8:2    0   512M  0 part /boot/efi
└─sda3                         8:3    0   238G  0 part
  ├─pve-swap                 253:0    0     7G  0 lvm  [SWAP]
  ├─pve-root                 253:1    0  59.3G  0 lvm  /
  ├─pve-data_tmeta           253:2    0   1.6G  0 lvm 
  │ └─pve-data-tpool         253:4    0 152.6G  0 lvm 
  │   ├─pve-data             253:5    0 152.6G  0 lvm 
  │   ├─pve-vm--100--disk--0 253:6    0     4M  0 lvm 
  │   ├─pve-vm--100--disk--1 253:7    0    32G  0 lvm 
  │   ├─pve-vm--101--disk--0 253:8    0     4M  0 lvm 
  │   ├─pve-vm--101--disk--1 253:9    0    32G  0 lvm 
  │   ├─pve-vm--102--disk--0 253:10   0     4M  0 lvm 
  │   ├─pve-vm--102--disk--1 253:11   0    32G  0 lvm 
  │   ├─pve-vm--103--disk--0 253:12   0     4M  0 lvm 
  │   └─pve-vm--103--disk--1 253:13   0    32G  0 lvm 
  └─pve-data_tdata           253:3    0 152.6G  0 lvm 
    └─pve-data-tpool         253:4    0 152.6G  0 lvm 
      ├─pve-data             253:5    0 152.6G  0 lvm 
      ├─pve-vm--100--disk--0 253:6    0     4M  0 lvm 
      ├─pve-vm--100--disk--1 253:7    0    32G  0 lvm 
      ├─pve-vm--101--disk--0 253:8    0     4M  0 lvm 
      ├─pve-vm--101--disk--1 253:9    0    32G  0 lvm 
      ├─pve-vm--102--disk--0 253:10   0     4M  0 lvm 
      ├─pve-vm--102--disk--1 253:11   0    32G  0 lvm 
      ├─pve-vm--103--disk--0 253:12   0     4M  0 lvm 
      └─pve-vm--103--disk--1 253:13   0    32G  0 lvm

and from
Code:
lvs
it is:
Code:
  LV            VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- <152.61g             29.00  2.46                           
  root          pve -wi-ao----   59.25g                                                   
  swap          pve -wi-ao----    7.00g                                                   
  vm-100-disk-0 pve Vwi-aotz--    4.00m data        0.00                                   
  vm-100-disk-1 pve Vwi-aotz--   32.00g data        49.10                                 
  vm-101-disk-0 pve Vwi-a-tz--    4.00m data        0.00                                   
  vm-101-disk-1 pve Vwi-a-tz--   32.00g data        6.70                                   
  vm-102-disk-0 pve Vwi-a-tz--    4.00m data        0.00                                   
  vm-102-disk-1 pve Vwi-a-tz--   32.00g data        35.39                                 
  vm-103-disk-0 pve Vwi-a-tz--    4.00m data        0.00                                   
  vm-103-disk-1 pve Vwi-a-tz--   32.00g data        47.11

I only have 100 and 101 right now: 100 is the main VM with HA on it, and 101 is a copy of 100 from before I changed anything in HA (a clean install). 102 and 103 are old copies of 100 that I made before updating HA. Those are already deleted, but they seem to be taking up some space, or am I wrong?
 
Wow, okay, that is an absurd amount of data in your log files. How long ago did you install it? Could you post the output of tail -n 1000 /var/log/syslog so we can see if some process is spamming the log files (you might need to attach this one as a file: tail -n 1000 /var/log/syslog > syslog.txt)? The output of cat /etc/logrotate.conf and cat /etc/cron.daily/logrotate could also be helpful.
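If attaching the file is a hassle, a rough one-liner like this can also summarise which messages repeat the most (a sketch; it assumes the standard "Mon DD HH:MM:SS host tag: message" syslog line format):

Code:
# Drop the timestamp and hostname, then count the most frequent messages:
awk '{ $1=$2=$3=$4=""; print }' /var/log/syslog | sort | uniq -c | sort -rn | head -20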
 
I can't save it to a txt file: tail: error writing 'standard output': No space left on device :p I have copied it into a file instead.

Code:
cat /etc/logrotate.conf
gave this:
Code:
# see "man logrotate" for details
# rotate log files weekly
weekly

# keep 4 weeks worth of backlogs
rotate 4

# create new (empty) log files after rotating old ones
create

# use date as a suffix of the rotated file
#dateext

# uncomment this if you want your log files compressed
#compress

# packages drop log rotation information into this directory
include /etc/logrotate.d

# system-specific logs may be also be configured here.

Code:
cat /etc/cron.daily/logrotate

gave this:

Code:
#!/bin/sh

# skip in favour of systemd timer
if [ -d /run/systemd/system ]; then
    exit 0
fi

# this cronjob persists removals (but not purges)
if [ ! -x /usr/sbin/logrotate ]; then
    exit 0
fi

/usr/sbin/logrotate /etc/logrotate.conf
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
exit $EXITVALUE

I installed Proxmox 2 or 3 weeks ago. I had some problems with another laptop; it was hanging all the time and I could not fix it. So I installed a new Proxmox machine and restored the HA VM from backup. It all seemed to work fine until I discovered this.

Thanks for your time.
 


57GB of logs in 2 or 3 weeks... you could browse your log files and look at which error is causing this. At that rate, something must be writing to the log file several times a second.
 
Could it be HA that is logging too much? Is there a way to clean this up for now until I find what is wrong?
When I look in the log I see multiple items per second being added. I have attached a grab from the log.
 


can't save it to a txt file: tail: error writing 'standard output': No space left on device
Ah.. of course.. :p

Could it be HA that is logging too much?
It should be logging to its own disk, but I can't say for certain whether or not it is the problem in some indirect way.


Is there a way to clean this up for now until I find what is wrong?
Could you run logrotate -f /etc/logrotate.conf and see if that clears up space? This forces logrotate to rotate the log files outside of its regular schedule. If it works, it would be great if you could post the syslog output again so we can see what is actually writing to it. The error messages in the log file at the moment are just due to the system lacking the space to store more logs.

When I look in the log I see multiple items per second being added.
This is normal to some extent, but I imagine yours will be a very extreme example!
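One more note, in case logrotate itself runs out of room: compression needs space to write the new .gz files, so on a completely full disk you may have to empty one big file in place first. A sketch, using syslog as an example:

Code:
# Truncating frees the space immediately and keeps the file open for rsyslog,
# unlike rm'ing a file that a daemon still has open:
truncate -s 0 /var/log/syslog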
 
Also, in case you haven't rotated the logs yet, could you post the output of du -sh /var/log/* first? That might give a closer idea of what is doing the damage. If you have already rotated them, wait an hour or two and then post the output. Log files are normally quite small, so it should be easy to spot the culprit.
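If you'd rather watch the growth live, something like this works too:

Code:
# Re-run the size check every 5 seconds to see which file is actively growing:
watch -n 5 'du -sh /var/log/* | sort -h | tail -n 5'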
 
I did the rotate but I got errors:
Code:
error: Compressing program wrote following message to stderr when compressing log /var/log/alternatives.log.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/alternatives.log.1
error: Compressing program wrote following message to stderr when compressing log /var/log/pve-firewall.log.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/pve-firewall.log.1
error: Compressing program wrote following message to stderr when compressing log /var/log/syslog.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/syslog.1
error: Compressing program wrote following message to stderr when compressing log /var/log/mail.info.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/mail.info.1
error: Compressing program wrote following message to stderr when compressing log /var/log/mail.warn.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/mail.warn.1
error: Compressing program wrote following message to stderr when compressing log /var/log/mail.log.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/mail.log.1
error: Compressing program wrote following message to stderr when compressing log /var/log/daemon.log.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/daemon.log.1
error: Compressing program wrote following message to stderr when compressing log /var/log/kern.log.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/kern.log.1
error: Compressing program wrote following message to stderr when compressing log /var/log/auth.log.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/auth.log.1
error: Compressing program wrote following message to stderr when compressing log /var/log/user.log.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/user.log.1
error: Compressing program wrote following message to stderr when compressing log /var/log/debug.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/debug.1
error: Compressing program wrote following message to stderr when compressing log /var/log/messages.1:

gzip: stdout: No space left on device
error: failed to compress log /var/log/messages.1
error: error creating temp state file /var/lib/logrotate/status.tmp: No space left on device

When I did the du -sh /var/log/* it gave me:
Code:
4.0K    /var/log/alternatives.log
20K    /var/log/alternatives.log.1
40K    /var/log/apt
14G    /var/log/auth.log
43G    /var/log/auth.log.1
36M    /var/log/auth.log.2.gz
0    /var/log/btmp
0    /var/log/btmp.1
4.0K    /var/log/ceph
8.0K    /var/log/corosync
756K    /var/log/daemon.log
2.3M    /var/log/daemon.log.1
48K    /var/log/daemon.log.2.gz
12K    /var/log/debug
12K    /var/log/debug.1
4.0K    /var/log/debug.2.gz
0    /var/log/dpkg.log
356K    /var/log/dpkg.log.1
8.0K    /var/log/faillog
4.0K    /var/log/fontconfig.log
4.0K    /var/log/glusterfs
108K    /var/log/kern.log
1.2M    /var/log/kern.log.1
104K    /var/log/kern.log.2.gz
12K    /var/log/lastlog
8.0K    /var/log/lxc
12K    /var/log/mail.info
4.0K    /var/log/mail.info.1
4.0K    /var/log/mail.info.2.gz
12K    /var/log/mail.log
4.0K    /var/log/mail.log.1
4.0K    /var/log/mail.log.2.gz
12K    /var/log/mail.warn
4.0K    /var/log/mail.warn.1
4.0K    /var/log/mail.warn.2.gz
100K    /var/log/messages
292K    /var/log/messages.1
92K    /var/log/messages.2.gz
4.0K    /var/log/private
396K    /var/log/pve
16K    /var/log/pveam.log
4.0K    /var/log/pve-firewall.log
4.0K    /var/log/pve-firewall.log.1
4.0K    /var/log/pve-firewall.log.2.gz
4.0K    /var/log/pve-firewall.log.3.gz
4.0K    /var/log/pve-firewall.log.4.gz
4.0K    /var/log/pve-firewall.log.5.gz
4.0K    /var/log/pve-firewall.log.6.gz
4.0K    /var/log/pve-firewall.log.7.gz
372K    /var/log/pveproxy
4.0K    /var/log/samba
152K    /var/log/syslog
460K    /var/log/syslog.1
24K    /var/log/syslog.2.gz
24K    /var/log/syslog.3.gz
24K    /var/log/syslog.4.gz
96K    /var/log/syslog.5.gz
28K    /var/log/syslog.6.gz
28K    /var/log/syslog.7.gz
4.0K    /var/log/user.log
4.0K    /var/log/user.log.1
4.0K    /var/log/user.log.2.gz
16K    /var/log/vzdump
0    /var/log/wtmp
28K    /var/log/wtmp.1
 
Does your Proxmox VE host have a public (internet-reachable) IP address, or can you only access it from your local network? I can't say I've seen an auth log file this large, but it is common for public IP addresses to see upwards of hundreds of login attempts per day, due to automated scripts that constantly attempt logins on public IPs.
Could you post/look at the output of cat /var/log/auth.log | grep sshd to see if there are many login attempts through ssh?
Note: this shouldn't be an issue on a private network.
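If there do turn out to be many attempts, here is a quick way to count failed logins per source address (a sketch; it relies on the standard sshd "Failed password ... from <ip> port <port> ssh2" line format):

Code:
# Count failed ssh logins per source IP, most frequent first:
grep 'Failed password' /var/log/auth.log | awk '{ print $(NF-3) }' | sort | uniq -c | sort -rn | head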
 
As far as I know it is only local.
I ran cat /var/log/auth.log | grep sshd, and this is the outcome:
Code:
Dec  3 21:18:25 proxserver sshd[857]: Server listening on 0.0.0.0 port 22.
Dec  3 21:18:25 proxserver sshd[857]: Server listening on :: port 22.
Dec  3 22:30:55 proxserver sshd[863]: Server listening on 0.0.0.0 port 22.
Dec  3 22:30:55 proxserver sshd[863]: Server listening on :: port 22.
 
Okay, well that's good at least. Are there any messages in there that appear a lot? Feel free to post a snippet if you're unsure. You can also remove the old log files with rm /var/log/auth.log.1 /var/log/auth.log.2.gz, which will give you back 43GB. I would keep /var/log/auth.log until we figure out what's writing to it.
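A side note on why rm is fine here: nothing holds the rotated .1 and .2.gz files open, so deleting them frees the space immediately. Deleting a log that a daemon still has open would not free anything until the daemon closes it; if space ever fails to come back after an rm, you can check for that:

Code:
# List deleted-but-still-open files; a restart of the writer (or truncating
# the file beforehand) releases the space:
lsof +L1 | grep /var/log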
 
That freed up some space!
Attached is my new syslog. I get a lot of these:

Code:
Dec 04 15:53:34 proxserver systemd-logind[754]: Failed to execute operation: Permission denied
Dec 04 15:53:34 proxserver systemd-logind[754]: Suspending...
Dec 04 15:53:34 proxserver systemd-logind[754]: Unit suspend.target is masked, refusing operation.
Dec 04 15:53:34 proxserver systemd-logind[754]: Failed to execute operation: Permission denied
Dec 04 15:53:34 proxserver systemd-logind[754]: Suspending...
Dec 04 15:53:34 proxserver systemd-logind[754]: Unit suspend.target is masked, refusing operation.
Dec 04 15:53:34 proxserver systemd-logind[754]: Failed to execute operation: Permission denied
Dec 04 15:53:34 proxserver systemd-logind[754]: Suspending...
Dec 04 15:53:34 proxserver systemd-logind[754]: Unit suspend.target is masked, refusing operation.
Dec 04 15:53:34 proxserver systemd-logind[754]: Failed to execute operation: Permission denied
Dec 04 15:53:34 proxserver systemd-logind[754]: Suspending...
Dec 04 15:53:34 proxserver systemd-logind[754]: Unit suspend.target is masked, refusing operation.
Dec 04 15:53:34 proxserver systemd-logind[754]: Failed to execute operation: Permission denied
Dec 04 15:53:34 proxserver systemd-logind[754]: Suspending...
Dec 04 15:53:34 proxserver systemd-logind[754]: Unit suspend.target is masked, refusing operation.
Dec 04 15:53:34 proxserver systemd-logind[754]: Failed to execute operation: Permission denied
Dec 04 15:53:34 proxserver systemd-logind[754]: Suspending...
Dec 04 15:53:34 proxserver systemd-logind[754]: Unit suspend.target is masked, refusing operation.
Dec 04 15:53:34 proxserver systemd-logind[754]: Failed to execute operation: Permission denied
 


Apologies for the delayed response, but I think I have your answer.

I'm guessing some guide you followed told you to mask suspend.target so that your laptop stays active and suspend calls are ignored, which makes sense for your use case. However, there are other system settings that tell the system to suspend the session upon certain events, such as the laptop's lid closing. It seems that when these get triggered, the system repeatedly tries and fails to enter suspend mode, spamming the log files in the process.

To fix it, edit the logind.conf file with nano /etc/systemd/logind.conf (or whatever text editor you're most comfortable with). On any line where you see <option>=suspend, delete the # (comment marker) at the beginning and replace suspend with ignore. Then save the file, exit nano, and run systemctl restart systemd-logind.service.
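For reference, the relevant part of the file might look like this after editing (a sketch; the exact set of Handle* options varies slightly between systemd versions):

Code:
# /etc/systemd/logind.conf
HandleSuspendKey=ignore
HandleLidSwitch=ignore
HandleLidSwitchExternalPower=ignore
HandleLidSwitchDocked=ignore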
 
Hi dylanw,
No problem! Thanks for replying. Yes, I have it on a laptop; that is why I needed it to stay on when the lid was closed. Thanks for the fix, I am going to try it when I have time.
I will let you know if it worked.