pveupload temp file at /var/tmp is causing the OS disk to become full and system crash

Tekuno-Kage

Well-Known Member
Jun 1, 2016
Today we faced an issue where the OS disk got full. Even though we were uploading files to a Proxmox storage outside the OS partition (a CephFS storage), we found that a file was being generated in /var/tmp. At first this was only an assumption based on the procedure used, so I replicated the behavior and, sadly, succeeded in reproducing the undesired situation. To confirm the theory, I deleted the hidden file mid-upload, which produced the following error:
Error 500: temporary file '/var/tmp/pveupload-edac042ecf93413e26ad85a04506305a' does not exist
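For anyone hitting the same symptom, a quick sanity check for whether pveupload spool files are what is filling the root disk (the `pveupload-*` name pattern is the one from the error above):

```shell
# List any leftover upload spool files and how much space /var/tmp holds.
ls -lh /var/tmp/pveupload-* 2>/dev/null || echo "no pveupload spool files"
du -sh /var/tmp
```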


Code:
Package Versions:
proxmox-ve: 7.4-1 (running kernel: 5.15.131-2-pve)
pve-manager: 7.4-17 (running version: 7.4-17/513c62be)
pve-kernel-5.15: 7.4-9
pve-kernel-5.15.131-2-pve: 5.15.131-3
pve-kernel-5.15.126-1-pve: 5.15.126-1
pve-kernel-5.15.39-3-pve: 5.15.39-3
pve-kernel-5.15.35-1-pve: 5.15.35-3
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph: 16.2.13-pve1
ceph-fuse: 16.2.13-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: not correctly installed
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.4-1
proxmox-backup-file-restore: 2.4.4-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.2
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-6
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.14-pve1

I strongly believe that effectively granting a user write access to a partition they should not have access to, simply for the purpose of uploading a file, is not appropriate behavior. It is vital that we ensure the security of our systems.
I'm counting on your help to keep our system running smoothly and without issues.

Thanks
 
It's unfortunate, but uploads are spooled to a local dir and then moved to your destination.

If your root storage is not sufficient, which is often the case, do not use the GUI to upload content; use SSH/SCP directly to your desired CephFS directory.
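A sketch of what that direct upload could look like. The node name and CephFS mount point are hypothetical examples, not taken from this thread; check /etc/pve/storage.cfg for your actual paths. The script only prints the command so you can verify the destination first:

```shell
#!/bin/sh
# Upload an ISO straight to the CephFS-backed template directory over
# SSH, bypassing the GUI's /var/tmp spool. Names below are examples.
ISO="debian-12-netinst.iso"
DEST="root@pve1:/mnt/pve/cephfs/template/iso/"
# rsync can resume partial transfers, which helps with multi-GB images:
echo "would run: rsync --partial --progress $ISO $DEST"
# (drop the echo to actually run it; plain scp works too)
```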
 
I wish Proxmox used /tmp instead of /var/tmp, since uploads don't need to survive a reboot. But /var/tmp is usually bigger than /tmp, which makes sense for large uploads. How small is your root filesystem (or /var/) that you run into issues when uploading several GB?
 
> I wish Proxmox used /tmp instead of /var/tmp, since uploads don't need to survive a reboot. But /var/tmp is usually bigger than /tmp, which makes sense for large uploads. How small is your root filesystem (or /var/) that you run into issues when uploading several GB?

I have been using a small root partition of about 16GB for my deployments since the 5.x versions. Typically the OS occupies around 3.5GB to 6.5GB of space. Recently, though, the system was consuming more than expected; I suspect some old ZFS snapshots I had left behind were holding space. I also did not realize that during an upload the system uses the OS disk, which I consider should serve OS functions only, not user uploads. The uploaded file was around 6GB, and since the OS was already occupying 6.5GB, the system ran out of space and crashed. I had never seen this situation before; my own mistake was leaving a very old snapshot around.
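For reference, these are the kinds of checks that show what is actually holding root space in this scenario (snapshots vs. the upload spool). The `zfs` step applies only to ZFS-root installs and is skipped gracefully elsewhere:

```shell
# Where did the root space go?
df -h /
# Snapshots still pinning space, smallest to largest (ZFS roots only):
command -v zfs >/dev/null 2>&1 \
    && zfs list -t snapshot -o name,used -s used \
    || echo "zfs not available here"
# Space held by the upload spool directory itself:
du -xsh /var/tmp
```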

It's important to note that the current implementation can lead to serious issues. No matter how much free space we have, a malicious actor could start multiple simultaneous uploads and quickly fill the OS partition, with consequences for the system's stability and performance. It's worth addressing this proactively to keep the system healthy.


@tom @dietmar
 
And please add some mechanism that removes orphaned pveupload files left by failed uploads, especially given the large number of complaints about failed uploads.
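Until something like that exists upstream, a cron-able sketch of such a cleanup. The one-day age threshold is my assumption; a file still being written has a recent mtime and is left alone:

```shell
#!/bin/sh
# Sketch: delete orphaned pveupload spool files left by failed uploads.
cleanup_pveupload() {
    spool_dir="$1"   # normally /var/tmp on a PVE node
    find "$spool_dir" -maxdepth 1 -type f -name 'pveupload-*' \
        -mtime +1 -print -delete
}
# usage from e.g. /etc/cron.daily on the node: cleanup_pveupload /var/tmp
```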
 
Hey @t.lamprecht, hope you're doing well! I just wanted to quickly reach out to you regarding an issue that I believe needs to be addressed. While you're currently evaluating Bugzilla bug 5254, I wanted to draw your attention to a related issue that concerns image downloads. I really think that addressing this problem would go a long way in improving the overall user experience. Would you mind taking a look when you have a moment? Thanks so much!
 
> I have been using a small root partition of about 16GB for my deployments since the 5.x versions

Yep, I did the same thing on a recent install of 8.1.4. Decided to bite the bullet and reinstall with an LVM/ext4 root at 50GB. A workaround would be to redirect /var/tmp to a compressed ZFS dataset via a symlink.
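That symlink workaround could look roughly like this. The pool name `rpool` is the PVE installer default and an assumption here, and the script only prints the commands unless `CONFIRM=yes` is set, since relocating /var/tmp on a live node deserves a dry run first:

```shell
#!/bin/sh
# Sketch: move /var/tmp onto a compressed ZFS dataset, symlinked back.
# Dry run by default: prints the commands instead of executing them.
move_var_tmp() {
    pool="${1:-rpool}"   # installer default; adjust for your pool
    cmds="zfs create -o compression=zstd -o mountpoint=/var/tmp.real $pool/vartmp
chmod 1777 /var/tmp.real
mv /var/tmp/* /var/tmp.real/
rmdir /var/tmp
ln -s /var/tmp.real /var/tmp"
    if [ "${CONFIRM:-no}" = "yes" ]; then
        printf '%s\n' "$cmds" | sh -e
    else
        printf 'dry run, would execute:\n%s\n' "$cmds"
    fi
}
move_var_tmp rpool
```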
 
> Yep, I did the same thing on a recent install of 8.1.4. Decided to bite the bullet and reinstall with LVM/ext4 root at 50GB.

My genuine concern is not redeploying; my installation is ZFS, which makes that quite simple (grow the partition and let ZFS grow, or boot the installer to "install" at the new size and then roll back a saved snapshot).
My concern is a malicious attack: someone could exploit this behavior and crash the server very easily, like a time bomb. Start multiple uploads simultaneously and walk away; a few minutes or hours later the disk is full and the system crashes. And because the spool is not in /tmp, the consumed space survives a reboot, so the system won't boot correctly until someone manually gets in and cleans the storage. Doing that from the console is almost impossible, because log messages flood the screen and you can't see what you are typing; the realistic option is to check whether network services are up and get in via SSH.
That is why I am bringing this issue to the attention of the community and the developers. I believe leaving the upload implementation like this leaves the system vulnerable.

As an example of an option:
> A workaround would be to redirect /var/tmp to a compressed zfs dataset with a soft symlink
If the long-term decision is not to change this behavior, I will most probably end up doing exactly that. But it doesn't look clean to me; it is a hardening step that could be avoided, and it will require preparing documentation and procedures to maintain it.

I want to open a respectful discussion on whether this implementation/behavior should be changed, or left as-is with the known risk that the disk can fill up.

Regards,
 
