[SOLVED] pbs backup - stack overflow

kaer

New Member
Jul 5, 2024
hi,
after upgrading from 8.2.2 to 8.2.4, backups of one random LXC container stopped working.
On this node there are 14 LXC containers; backups of 13 work fine, only one has a problem (the problematic LXC is 110 GB, but size should not be the issue, as I also have 200, 320 and 340 GB containers and their backups work fine here)

logs:
Code:
INFO: starting new backup job: vzdump 288 --notification-mode auto --notes-template '{{guestname}}' --remove 0 --storage backupstorage --mode snapshot --node proxmoxhost
INFO: Starting Backup of VM 288 (lxc)
INFO: Backup started at 2024-07-05 10:10:02
INFO: status = running
INFO: CT Name: vm288
INFO: including mount point rootfs ('/') in backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  Logical volume "snap_vm-288-disk-0_vzdump" created.
  WARNING: Sum of all thin volume sizes (2.06 TiB) exceeds the size of thin pool vg/storage and the size of whole volume group (<1.75 TiB).
INFO: creating Proxmox Backup Server archive 'ct/288/2024-07-05T08:10:02Z'
INFO: set max number of entries in memory for file-based backups to 1048576
INFO: run: lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp1536770_288/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 288 --backup-time 1720167002 --entries-max 1048576 --repository backupuser@pbs@192.168.1.231:storage --ns <HOSTNAME>
INFO: Starting backup: [<HOSTNAME>]:ct/288/2024-07-05T08:10:02Z
INFO: Client name: proxmoxhost
INFO: Starting backup protocol: Fri Jul  5 10:10:02 2024
INFO: No previous manifest available.
INFO: Upload config file '/var/tmp/vzdumptmp1536770_288/etc/vzdump/pct.conf' to 'backupuser@pbs@192.168.1.231:8007:storage' as pct.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'backupuser@pbs@192.168.1.231:8007:storage' as root.pxar.didx
INFO: thread 'tokio-runtime-worker' has overflowed its stack
INFO: fatal runtime error: stack overflow
INFO: adding notes to backup
WARN: unable to add notes - proxmox-backup-client failed: Error: unable to update manifest blob - unable to load blob '"/storage/pbs_datastore/ns/<HOSTNAME>/ct/288/2024-07-05T08:10:02Z/index.json.blob"' - No such file or directory (os error 2)
INFO: cleanup temporary 'vzdump' snapshot
  Logical volume "snap_vm-288-disk-0_vzdump" successfully removed.
INFO: Finished Backup of VM 288 (00:06:34)
INFO: Backup finished at 2024-07-05 10:16:36
WARN: uploading backup task log failed: Error: mkstemp "/storage/pbs_datastore/ns/<HOSTNAME>/ct/288/2024-07-05T08:10:02Z/client.log.tmp_XXXXXX" failed: ENOENT: No such file or directory
INFO: Backup job finished successfully
ERROR: could not notify via target `mail-to-root`: could not notify via endpoint(s): mail-to-root: At least one recipient has to be specified!
TASK WARNINGS: 2

pveversion
Code:
pve-manager/8.2.4/faa83925c9641325 (running kernel: 6.8.8-2-pve)

any ideas?
 
Hi,
please post the full output of pveversion -v and the container config (pct config 288).
 
Code:
proxmox-ve: 8.2.0 (running kernel: 6.8.8-2-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
pve-kernel-6.2: 8.0.5
proxmox-kernel-6.8: 6.8.8-2
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
proxmox-kernel-6.2.16-20-pve: 6.2.16-20
proxmox-kernel-6.2: 6.2.16-20
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx8
intel-microcode: 3.20240514.1~deb12u1
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.3
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.4-1
proxmox-backup-file-restore: 3.2.4-1
proxmox-firewall: 0.4.2
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.2
pve-edk2-firmware: not correctly installed
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.12-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.0-3
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1

Code:
arch: amd64
cores: 4
features: nesting=1
hostname: vm288
memory: 5120
net0: name=eth0,bridge=vmbr0,firewall=1,gw=10.0.0.1,hwaddr=BC:24:11:38:39:2B,ip=10.0.0.23/24,type=veth
onboot: 1
ostype: debian
rootfs: storage:vm-288-disk-0,size=110G
swap: 512
unprivileged: 1
 
pve-edk2-firmware: not correctly installed
Hmm, are you sure that the upgrade went through successfully? This might indicate that there is something wrong with your setup.

What is the output of apt update && apt full-upgrade?

Edit: Please also run debsums proxmox-backup-client; you will have to install debsums first via apt install debsums
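That is, in order (both commands as mentioned above):
Code:
apt install debsums
debsums proxmox-backup-client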
 
the upgrade finished with no errors, everything was updated,
but with full-upgrade I see:

Code:
The following packages will be upgraded:
  proxmox-backup-client proxmox-backup-file-restore pve-qemu-kvm shim-signed shim-signed-common

and on the Proxmox Backup Server:

Code:
The following packages will be upgraded:
  proxmox-backup-client proxmox-backup-docs proxmox-backup-server

these packages were not updated via apt upgrade / apt upgrade doesn't show them as upgradable,

I will upgrade everything and retry the backup; I'll be back here with an answer in a few minutes
 
OK, I also installed pve-edk2-firmware.
pveversion -v:
Code:
proxmox-ve: 8.2.0 (running kernel: 6.8.8-2-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
pve-kernel-6.2: 8.0.5
proxmox-kernel-6.8: 6.8.8-2
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
proxmox-kernel-6.2.16-20-pve: 6.2.16-20
proxmox-kernel-6.2: 6.2.16-20
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx8
intel-microcode: 3.20240514.1~deb12u1
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.3
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.7-1
proxmox-backup-file-restore: 3.2.7-1
proxmox-firewall: 0.4.2
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.2
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.12-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.0-5
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1

the bug still exists:
Code:
INFO: thread 'tokio-runtime-worker' has overflowed its stack
INFO: fatal runtime error: stack overflow

PBS version: 3.2-7
 
Please try to see if setting the thread's minimum stack size for proxmox-backup-client helps. You can do so by running
Code:
export RUST_MIN_STACK=8388608; vzdump 288 --storage backupstorage --mode snapshot --node proxmoxhost
This should bump the default 2M to 8M.
 
I'll try it and give feedback soon.
Meanwhile, I found this in the journal on the Proxmox Backup Server directly:

Code:
Jul 05 11:40:40 backupstorage proxmox-backup-proxy[1649]: POST /dynamic_chunk
Jul 05 11:40:40 backupstorage proxmox-backup-proxy[1649]: upload_chunk done: 16777216 bytes, 0ca5a1090277c97bea9a1f39a40f70c2fc70b64dc2d49d09d9b17a9976c1b940
Jul 05 11:40:40 backupstorage proxmox-backup-proxy[1649]: POST /dynamic_chunk
Jul 05 11:40:41 backupstorage proxmox-backup-proxy[1649]: upload_chunk done: 7484535 bytes, a9424d5e97c9186e13af924e39af68f051300f3b7de61210ee6003151fe81e32
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: POST /dynamic_chunk
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: upload_chunk done: 5861272 bytes, 71767934d9be76a7ddad88971e3c99f89b4d92acfc938b0260c49dbff75f4717
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: backup ended and finish failed: backup ended but finished flag is not set.
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: removing unfinished backup
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: removing backup snapshot "/storage/pbs_datastore/ns/<HOSTNAME>/ct/288/2024-07-05T09:37:31Z"
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: TASK ERROR: backup ended but finished flag is not set.
 
I'll try it and give feedback soon.
Meanwhile, I found this in the journal on the Proxmox Backup Server directly:

Code:
Jul 05 11:40:40 backupstorage proxmox-backup-proxy[1649]: POST /dynamic_chunk
Jul 05 11:40:40 backupstorage proxmox-backup-proxy[1649]: upload_chunk done: 16777216 bytes, 0ca5a1090277c97bea9a1f39a40f70c2fc70b64dc2d49d09d9b17a9976c1b940
Jul 05 11:40:40 backupstorage proxmox-backup-proxy[1649]: POST /dynamic_chunk
Jul 05 11:40:41 backupstorage proxmox-backup-proxy[1649]: upload_chunk done: 7484535 bytes, a9424d5e97c9186e13af924e39af68f051300f3b7de61210ee6003151fe81e32
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: POST /dynamic_chunk
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: upload_chunk done: 5861272 bytes, 71767934d9be76a7ddad88971e3c99f89b4d92acfc938b0260c49dbff75f4717
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: backup ended and finish failed: backup ended but finished flag is not set.
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: removing unfinished backup
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: removing backup snapshot "/storage/pbs_datastore/ns/<HOSTNAME>/ct/288/2024-07-05T09:37:31Z"
Jul 05 11:40:42 backupstorage proxmox-backup-proxy[1649]: TASK ERROR: backup ended but finished flag is not set.
This is expected, as the client vanishes without the backup being completed. This is correct behavior for the Proxmox Backup Server; nothing is wrong on its side. The issue here is on the client (so in this case, PVE) side.
 
Additionally, you could also try to invoke the proxmox-backup-client with its debug output enabled (this has to be executed on the host running the container, of course).
Code:
export PBS_LOG=debug; vzdump <VMID> -storage <YOUR-PBS-STORAGE>
That prints at least the current file being processed before the stack overflows, which might be useful to narrow down where the issue lies.
 
Please try to see if setting the thread's minimum stack size for proxmox-backup-client helps. You can do so by running
Code:
export RUST_MIN_STACK=8388608; vzdump 288 --storage backupstorage --mode snapshot --node proxmoxhost
This should bump the default 2M to 8M.
changing the min stack to 8M didn't help :-(
but 32M worked fine :)
now I need to set it globally
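For scheduled jobs I am thinking of a systemd drop-in for pvescheduler.service, something like this untested sketch (that the scheduled vzdump jobs inherit the environment from this service is my assumption):
Code:
# hypothetical drop-in: /etc/systemd/system/pvescheduler.service.d/stack.conf
# set a 32M minimum thread stack for everything spawned by the scheduler
[Service]
Environment="RUST_MIN_STACK=33554432"
followed by systemctl daemon-reload && systemctl restart pvescheduler.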
 
changing the min stack to 8M didn't help :-(
but 32M worked fine :)
now I need to set it globally
Okay, that at least tells us that it is not an infinite recursion. But it would be of interest to understand why your stack gets this big. Could you also run the other command with the PBS_LOG=debug variable set, but without the RUST_MIN_STACK? Maybe this gives us a hint after all?
 
Okay, that at least tells us that it is not an infinite recursion. But it would be of interest to understand why your stack gets this big. Could you also run the other command with the PBS_LOG=debug variable set, but without the RUST_MIN_STACK? Maybe this gives us a hint after all?
Yes, you are right, now we have a reason:


Code:
INFO: "home/<USER>/domains/<DOMAIN>/public_html/cache/smarty/cache/crossselling/shoppingcart/420/446/449/479/480/496/505/586/592/606/607/622/683/871/872/887/898/912/978/959/1046/1048/1053/1056/1064/311/80/947/7/393/78/1461/514/515/516/517/521/523/524/535/587/590/600/605/616/617/634/636/676/679/685/731/758/759/760/764/827/829/830/831/863/568/60/1495/186/1112/977/188/189/1337/1327/1330/1107/1138/744/743/742/1111/1086/729/1117/542/557/859/880/881/1049/1088/349/1209/1133/1383/1455/1067/1062/1014/944/950/693/692/765/936/962/963/1258/971/1260/946/867/1357/1358/1055/1477/1511/119/52/1167/26/175/1126/1496/1109/900/1197/1198/1518/748/1334/77/1435/1096/763/1437/1155/690/1373/984/1221/1222/312/733/628/635/1385/205/965/1007/1010/1083/423/435/454/483/1329/484/952/958/601/602/609/610/624/672/677/869/870/893/894/899/903/579/580/700/306/320/1231/461/407/1331/525/340/1214/519/627/739/752"
INFO: thread 'tokio-runtime-worker' has overflowed its stack
INFO: fatal runtime error: stack overflow
INFO: adding notes to backup

prestashop :D
 
Yes, you are right, now we have a reason:


Code:
INFO: "home/<USER>/domains/<DOMAIN>/public_html/cache/smarty/cache/crossselling/shoppingcart/420/446/449/479/480/496/505/586/592/606/607/622/683/871/872/887/898/912/978/959/1046/1048/1053/1056/1064/311/80/947/7/393/78/1461/514/515/516/517/521/523/524/535/587/590/600/605/616/617/634/636/676/679/685/731/758/759/760/764/827/829/830/831/863/568/60/1495/186/1112/977/188/189/1337/1327/1330/1107/1138/744/743/742/1111/1086/729/1117/542/557/859/880/881/1049/1088/349/1209/1133/1383/1455/1067/1062/1014/944/950/693/692/765/936/962/963/1258/971/1260/946/867/1357/1358/1055/1477/1511/119/52/1167/26/175/1126/1496/1109/900/1197/1198/1518/748/1334/77/1435/1096/763/1437/1155/690/1373/984/1221/1222/312/733/628/635/1385/205/965/1007/1010/1083/423/435/454/483/1329/484/952/958/601/602/609/610/624/672/677/869/870/893/894/899/903/579/580/700/306/320/1231/461/407/1331/525/340/1214/519/627/739/752"
INFO: thread 'tokio-runtime-worker' has overflowed its stack
INFO: fatal runtime error: stack overflow
INFO: adding notes to backup

prestashop :D
Okay, then this is most definitely the deeply nested folder structure causing the issue. I will have a look at how to tackle this; for the time being, I am afraid you will have to back up this particular container via a dedicated cron job that bumps the stack size to the required limit, instead of via the regular pve-scheduler.
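Something along these lines, for example (a sketch only; the schedule and file name are assumptions, the vzdump command and the 32M value are taken from this thread):
Code:
# hypothetical file: /etc/cron.d/vzdump-ct288
# bump the Rust thread stack to 32M, then back up CT 288 nightly at 02:30
30 2 * * * root RUST_MIN_STACK=33554432 vzdump 288 --storage backupstorage --mode snapshot --node proxmoxhost
(and exclude CT 288 from the regular backup job, so it is not attempted twice).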
 
Okay, then this is most definitely the deeply nested folder structure causing the issue. I will have a look at how to tackle this; for the time being, I am afraid you will have to back up this particular container via a dedicated cron job that bumps the stack size to the required limit, instead of via the regular pve-scheduler.
I removed the /public_html/cache/smarty/cache/crossselling/shoppingcart/ content, so we will see if it happens again soon.
This site has existed for a long time; probably the cache grew accidentally, and the PVE upgrade is just a coincidence.
We'll see tomorrow.

thank you for the really fast help locating the problem
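For reference, the cleanup was roughly this (a sketch; the placeholders match the redacted log above, so verify the path before running it):
Code:
# delete everything inside the deeply nested smarty cache, keeping the directory itself
find /home/<USER>/domains/<DOMAIN>/public_html/cache/smarty/cache/crossselling/shoppingcart/ -mindepth 1 -delete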
 
I removed the /public_html/cache/smarty/cache/crossselling/shoppingcart/ content, so we will see if it happens again soon.
This site has existed for a long time; probably the cache grew accidentally, and the PVE upgrade is just a coincidence.
We'll see tomorrow.

thank you for the really fast help locating the problem
No worries, thanks also to @fabian for pinpointing this so quickly.
 
