PBS stops working while backing up VM at the same point

startaq

Member
Aug 5, 2021
Hello,

I'm currently switching to PBS as my backup solution, but I seem to be hitting a bug that causes PBS to hang while backing up a VM. It always seems to happen around the 10 GB mark. Here's the first part of the backup log:

Code:
INFO: starting new backup job: vzdump 500 --node greystones --notification-mode notification-system --notes-template '{{guestname}}' --remove 0 --storage pbs1 --mode snapshot
INFO: Starting Backup of VM 500 (qemu)
INFO: Backup started at 2026-05-20 11:19:06
INFO: status = running
INFO: VM Name: win10-x
INFO: include disk 'scsi0' 'local-zfs:vm-500-disk-0' 36G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/500/2026-05-20T09:19:06Z'
INFO: enabling encryption
INFO: issuing guest-agent 'fs-freeze' command
INFO: starting backup via QMP command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '4845229e-1a03-4603-8585-696bb4b19871'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO:   1% (372.0 MiB of 36.0 GiB) in 3s, read: 124.0 MiB/s, write: 118.7 MiB/s
INFO:   2% (760.0 MiB of 36.0 GiB) in 9s, read: 64.7 MiB/s, write: 64.7 MiB/s
INFO:   3% (1.1 GiB of 36.0 GiB) in 15s, read: 60.0 MiB/s, write: 60.0 MiB/s
INFO:   4% (1.5 GiB of 36.0 GiB) in 21s, read: 65.3 MiB/s, write: 65.3 MiB/s
INFO:   5% (1.8 GiB of 36.0 GiB) in 28s, read: 48.6 MiB/s, write: 48.6 MiB/s
INFO:   6% (2.2 GiB of 36.0 GiB) in 35s, read: 57.7 MiB/s, write: 57.7 MiB/s
INFO:   7% (2.5 GiB of 36.0 GiB) in 41s, read: 55.3 MiB/s, write: 55.3 MiB/s
INFO:   8% (2.9 GiB of 36.0 GiB) in 48s, read: 58.9 MiB/s, write: 58.3 MiB/s
INFO:   9% (3.3 GiB of 36.0 GiB) in 53s, read: 83.2 MiB/s, write: 83.2 MiB/s
INFO:  10% (3.6 GiB of 36.0 GiB) in 58s, read: 61.6 MiB/s, write: 61.6 MiB/s
INFO:  11% (4.0 GiB of 36.0 GiB) in 1m 4s, read: 58.7 MiB/s, write: 58.7 MiB/s
INFO:  12% (4.3 GiB of 36.0 GiB) in 1m 12s, read: 47.0 MiB/s, write: 47.0 MiB/s
INFO:  13% (4.7 GiB of 36.0 GiB) in 1m 20s, read: 49.5 MiB/s, write: 49.5 MiB/s
INFO:  14% (5.1 GiB of 36.0 GiB) in 1m 27s, read: 48.0 MiB/s, write: 48.0 MiB/s
INFO:  15% (5.4 GiB of 36.0 GiB) in 1m 33s, read: 60.0 MiB/s, write: 60.0 MiB/s
INFO:  16% (5.8 GiB of 36.0 GiB) in 1m 40s, read: 50.9 MiB/s, write: 50.9 MiB/s
INFO:  17% (6.1 GiB of 36.0 GiB) in 1m 47s, read: 54.3 MiB/s, write: 54.3 MiB/s
INFO:  18% (6.5 GiB of 36.0 GiB) in 1m 52s, read: 72.8 MiB/s, write: 72.8 MiB/s
INFO:  19% (6.9 GiB of 36.0 GiB) in 1m 59s, read: 57.1 MiB/s, write: 57.1 MiB/s
INFO:  20% (7.2 GiB of 36.0 GiB) in 2m 6s, read: 49.7 MiB/s, write: 49.7 MiB/s
INFO:  21% (7.6 GiB of 36.0 GiB) in 2m 12s, read: 60.0 MiB/s, write: 59.3 MiB/s
INFO:  22% (7.9 GiB of 36.0 GiB) in 2m 18s, read: 64.0 MiB/s, write: 64.0 MiB/s
INFO:  23% (8.3 GiB of 36.0 GiB) in 2m 25s, read: 57.7 MiB/s, write: 57.7 MiB/s
INFO:  24% (8.7 GiB of 36.0 GiB) in 2m 30s, read: 66.4 MiB/s, write: 66.4 MiB/s
INFO:  25% (9.0 GiB of 36.0 GiB) in 2m 36s, read: 64.7 MiB/s, write: 64.7 MiB/s
INFO:  26% (9.4 GiB of 36.0 GiB) in 2m 42s, read: 60.7 MiB/s, write: 60.7 MiB/s
INFO:  27% (9.7 GiB of 36.0 GiB) in 2m 49s, read: 47.4 MiB/s, write: 47.4 MiB/s
INFO:  28% (10.2 GiB of 36.0 GiB) in 2m 56s, read: 71.4 MiB/s, write: 69.7 MiB/s

At this point the PBS user interface stops responding. It can be revived by running the command systemctl restart proxmox-backup-proxy.service

The log then continues like this:

Code:
INFO:  28% (10.2 GiB of 36.0 GiB) in 5m 16s, read: 146.3 KiB/s, write: 146.3 KiB/s
ERROR: backup write data failed: command error: protocol canceled
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 500 failed - backup write data failed: command error: protocol canceled
INFO: Failed at 2026-05-20 11:24:24
INFO: Backup job finished with errors
INFO: notified via target `mail-to-root`
INFO: notified via target `pushover-x`
TASK ERROR: job errors

PBS is using the S3 backend with OVH Object Storage. I'm using the newest (no-subscription) packages on both Proxmox and PBS.

The PBS task log doesn't show any errors; the last lines are:

Code:
2026-05-20T11:22:05+02:00: Upload new chunk bab7f19c57d4d6127a37d6cc3e4ac986a8fdd2fb1a0922f3f4367746968f9d0d
2026-05-20T11:22:05+02:00: Upload new chunk 6c03478d05d2203285d89c1dcb1847edbe5b15d15f028068c0aeccb76617fc2c
2026-05-20T11:22:05+02:00: Upload new chunk b5f3dc27b4193b07f4632c240b3673a031673834c700b0cb993bb92934eb3312
2026-05-20T11:22:05+02:00: Upload new chunk 60b1a0fb0f59278074ddcc41dabd65025635019e051d878026f39085bb8bd805

Backups to other locations (non-PBS) of this same VM work fine.
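
To locate the stall without reading the log by eye, the progress lines can be parsed and the effective rate between consecutive samples computed. This is only a minimal sketch in Python, assuming the vzdump progress format shown in the logs above; the names parse_progress and find_stalls and the 1 MiB/s threshold are made up for illustration and are not part of any Proxmox tooling.

```python
import re

# Matches vzdump progress lines such as:
# INFO:  28% (10.2 GiB of 36.0 GiB) in 2m 56s, read: 71.4 MiB/s, write: 69.7 MiB/s
PROGRESS_RE = re.compile(
    r"INFO:\s+\d+% \((?P<done>[\d.]+) (?P<unit>[KMGT]iB) of [^)]*\) in (?P<elapsed>[^,]+),"
)
UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1, "GiB": 1024, "TiB": 1024 * 1024}
SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_progress(lines):
    """Return a list of (elapsed_seconds, transferred_mib) tuples."""
    samples = []
    for line in lines:
        match = PROGRESS_RE.search(line)
        if not match:
            continue  # not a progress line
        done_mib = float(match["done"]) * UNIT_TO_MIB[match["unit"]]
        # Elapsed time looks like "3s", "1m 4s" or "8m".
        elapsed = sum(
            int(number) * SECONDS[unit]
            for number, unit in re.findall(r"(\d+)([hms])", match["elapsed"])
        )
        samples.append((elapsed, done_mib))
    return samples


def find_stalls(samples, min_rate_mib_s=1.0):
    """Return (start_s, end_s, rate) for intervals slower than the threshold."""
    stalls = []
    for (t1, d1), (t2, d2) in zip(samples, samples[1:]):
        if t2 <= t1:
            continue  # skip duplicate or out-of-order timestamps
        rate = (d2 - d1) / (t2 - t1)
        if rate < min_rate_mib_s:
            stalls.append((t1, t2, rate))
    return stalls
```

Fed the failed log above, this should flag the interval between 2m 56s and 5m 16s, where the transferred amount stays at 10.2 GiB. The reported sizes are rounded, so the computed rates are only approximate.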
 
Anything of interest in the systemd journal around the time the backup hangs? Does a backup to a local (non-S3-backed) datastore work as expected?
 
There is nothing in the systemd journal.

I just created a local datastore for testing and backups to this local datastore finish successfully:

Code:
INFO: starting new backup job: vzdump 500 --notification-mode notification-system --notes-template '{{guestname}}' --node greystones --storage pbs1-test --mode snapshot --remove 0
INFO: Starting Backup of VM 500 (qemu)
INFO: Backup started at 2026-05-20 11:53:37
INFO: status = running
INFO: VM Name: win10-x
INFO: include disk 'scsi0' 'local-zfs:vm-500-disk-0' 36G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/500/2026-05-20T09:53:37Z'
INFO: enabling encryption
INFO: issuing guest-agent 'fs-freeze' command
INFO: starting backup via QMP command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task 'd03e9782-ae79-41c5-9dbd-f30d2d464427'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO:   1% (376.0 MiB of 36.0 GiB) in 3s, read: 125.3 MiB/s, write: 120.0 MiB/s
INFO:   2% (796.0 MiB of 36.0 GiB) in 9s, read: 70.0 MiB/s, write: 70.0 MiB/s
INFO:   3% (1.2 GiB of 36.0 GiB) in 15s, read: 66.0 MiB/s, write: 66.0 MiB/s
INFO:   4% (1.5 GiB of 36.0 GiB) in 21s, read: 55.3 MiB/s, write: 55.3 MiB/s
INFO:   5% (1.8 GiB of 36.0 GiB) in 28s, read: 49.7 MiB/s, write: 49.7 MiB/s
INFO:   6% (2.2 GiB of 36.0 GiB) in 35s, read: 54.9 MiB/s, write: 54.9 MiB/s
INFO:   7% (2.5 GiB of 36.0 GiB) in 41s, read: 54.7 MiB/s, write: 54.7 MiB/s
INFO:   8% (2.9 GiB of 36.0 GiB) in 47s, read: 62.0 MiB/s, write: 61.3 MiB/s
INFO:   9% (3.3 GiB of 36.0 GiB) in 52s, read: 81.6 MiB/s, write: 81.6 MiB/s
INFO:  10% (3.6 GiB of 36.0 GiB) in 58s, read: 61.3 MiB/s, write: 61.3 MiB/s
INFO:  11% (4.0 GiB of 36.0 GiB) in 1m 4s, read: 62.7 MiB/s, write: 62.7 MiB/s
INFO:  12% (4.3 GiB of 36.0 GiB) in 1m 11s, read: 49.1 MiB/s, write: 49.1 MiB/s
INFO:  13% (4.7 GiB of 36.0 GiB) in 1m 18s, read: 49.7 MiB/s, write: 49.7 MiB/s
INFO:  14% (5.1 GiB of 36.0 GiB) in 1m 27s, read: 46.2 MiB/s, write: 46.2 MiB/s
INFO:  15% (5.5 GiB of 36.0 GiB) in 1m 33s, read: 61.3 MiB/s, write: 61.3 MiB/s
INFO:  16% (5.8 GiB of 36.0 GiB) in 1m 40s, read: 48.0 MiB/s, write: 48.0 MiB/s
INFO:  17% (6.1 GiB of 36.0 GiB) in 1m 46s, read: 59.3 MiB/s, write: 59.3 MiB/s
INFO:  18% (6.5 GiB of 36.0 GiB) in 1m 52s, read: 63.3 MiB/s, write: 63.3 MiB/s
INFO:  19% (6.9 GiB of 36.0 GiB) in 1m 58s, read: 61.3 MiB/s, write: 61.3 MiB/s
INFO:  20% (7.2 GiB of 36.0 GiB) in 2m 5s, read: 52.6 MiB/s, write: 52.6 MiB/s
INFO:  21% (7.6 GiB of 36.0 GiB) in 2m 12s, read: 51.4 MiB/s, write: 50.9 MiB/s
INFO:  22% (7.9 GiB of 36.0 GiB) in 2m 18s, read: 61.3 MiB/s, write: 61.3 MiB/s
INFO:  23% (8.3 GiB of 36.0 GiB) in 2m 26s, read: 50.5 MiB/s, write: 50.5 MiB/s
INFO:  24% (8.7 GiB of 36.0 GiB) in 2m 31s, read: 75.2 MiB/s, write: 75.2 MiB/s
INFO:  25% (9.0 GiB of 36.0 GiB) in 2m 36s, read: 72.8 MiB/s, write: 72.8 MiB/s
INFO:  26% (9.4 GiB of 36.0 GiB) in 2m 44s, read: 45.5 MiB/s, write: 45.5 MiB/s
INFO:  27% (9.7 GiB of 36.0 GiB) in 2m 50s, read: 54.7 MiB/s, write: 52.7 MiB/s
INFO:  28% (10.1 GiB of 36.0 GiB) in 2m 57s, read: 60.0 MiB/s, write: 50.3 MiB/s
INFO:  29% (10.4 GiB of 36.0 GiB) in 3m 2s, read: 64.0 MiB/s, write: 37.6 MiB/s
INFO:  30% (10.9 GiB of 36.0 GiB) in 3m 11s, read: 48.4 MiB/s, write: 46.2 MiB/s
INFO:  31% (11.2 GiB of 36.0 GiB) in 3m 19s, read: 39.5 MiB/s, write: 39.5 MiB/s
INFO:  32% (11.5 GiB of 36.0 GiB) in 3m 25s, read: 61.3 MiB/s, write: 61.3 MiB/s
INFO:  34% (12.3 GiB of 36.0 GiB) in 3m 34s, read: 82.2 MiB/s, write: 41.8 MiB/s
INFO:  37% (13.6 GiB of 36.0 GiB) in 3m 37s, read: 470.7 MiB/s, write: 46.7 MiB/s
INFO:  38% (13.8 GiB of 36.0 GiB) in 3m 40s, read: 44.0 MiB/s, write: 44.0 MiB/s
INFO:  39% (14.0 GiB of 36.0 GiB) in 3m 46s, read: 47.3 MiB/s, write: 46.7 MiB/s
INFO:  40% (14.4 GiB of 36.0 GiB) in 3m 54s, read: 49.0 MiB/s, write: 49.0 MiB/s
INFO:  41% (14.8 GiB of 36.0 GiB) in 4m 2s, read: 43.5 MiB/s, write: 43.5 MiB/s
INFO:  42% (15.2 GiB of 36.0 GiB) in 4m 11s, read: 44.9 MiB/s, write: 44.9 MiB/s
INFO:  43% (15.5 GiB of 36.0 GiB) in 4m 19s, read: 42.0 MiB/s, write: 42.0 MiB/s
INFO:  44% (15.9 GiB of 36.0 GiB) in 4m 27s, read: 46.0 MiB/s, write: 46.0 MiB/s
INFO:  45% (16.2 GiB of 36.0 GiB) in 4m 36s, read: 41.3 MiB/s, write: 41.3 MiB/s
INFO:  46% (16.6 GiB of 36.0 GiB) in 4m 40s, read: 89.0 MiB/s, write: 89.0 MiB/s
INFO:  47% (17.0 GiB of 36.0 GiB) in 4m 49s, read: 47.1 MiB/s, write: 47.1 MiB/s
INFO:  48% (17.3 GiB of 36.0 GiB) in 4m 57s, read: 43.0 MiB/s, write: 43.0 MiB/s
INFO:  49% (17.7 GiB of 36.0 GiB) in 5m 4s, read: 53.7 MiB/s, write: 52.6 MiB/s
INFO:  50% (18.0 GiB of 36.0 GiB) in 5m 13s, read: 40.9 MiB/s, write: 40.9 MiB/s
INFO:  51% (18.4 GiB of 36.0 GiB) in 5m 21s, read: 42.0 MiB/s, write: 42.0 MiB/s
INFO:  52% (18.8 GiB of 36.0 GiB) in 5m 25s, read: 102.0 MiB/s, write: 102.0 MiB/s
INFO:  53% (19.1 GiB of 36.0 GiB) in 5m 33s, read: 43.5 MiB/s, write: 41.0 MiB/s
INFO:  54% (19.5 GiB of 36.0 GiB) in 5m 42s, read: 40.4 MiB/s, write: 37.8 MiB/s
INFO:  55% (19.9 GiB of 36.0 GiB) in 5m 48s, read: 67.3 MiB/s, write: 64.7 MiB/s
INFO:  56% (20.2 GiB of 36.0 GiB) in 5m 51s, read: 114.7 MiB/s, write: 77.3 MiB/s
INFO:  57% (20.6 GiB of 36.0 GiB) in 5m 59s, read: 48.0 MiB/s, write: 40.5 MiB/s
INFO:  58% (21.0 GiB of 36.0 GiB) in 6m 4s, read: 91.2 MiB/s, write: 69.6 MiB/s
INFO:  59% (21.4 GiB of 36.0 GiB) in 6m 10s, read: 63.3 MiB/s, write: 56.0 MiB/s
INFO:  60% (21.7 GiB of 36.0 GiB) in 6m 13s, read: 93.3 MiB/s, write: 56.0 MiB/s
INFO:  61% (22.1 GiB of 36.0 GiB) in 6m 17s, read: 102.0 MiB/s, write: 73.0 MiB/s
INFO:  62% (22.5 GiB of 36.0 GiB) in 6m 20s, read: 158.7 MiB/s, write: 53.3 MiB/s
INFO:  63% (22.9 GiB of 36.0 GiB) in 6m 23s, read: 114.7 MiB/s, write: 90.7 MiB/s
INFO:  64% (23.4 GiB of 36.0 GiB) in 6m 26s, read: 182.7 MiB/s, write: 57.3 MiB/s
INFO:  70% (25.2 GiB of 36.0 GiB) in 6m 29s, read: 622.7 MiB/s, write: 61.3 MiB/s
INFO:  71% (25.7 GiB of 36.0 GiB) in 6m 41s, read: 39.3 MiB/s, write: 35.0 MiB/s
INFO:  72% (26.2 GiB of 36.0 GiB) in 6m 45s, read: 127.0 MiB/s, write: 71.0 MiB/s
INFO:  73% (26.6 GiB of 36.0 GiB) in 6m 48s, read: 158.7 MiB/s, write: 88.0 MiB/s
INFO:  74% (26.7 GiB of 36.0 GiB) in 6m 51s, read: 32.0 MiB/s, write: 32.0 MiB/s
INFO:  75% (27.0 GiB of 36.0 GiB) in 6m 58s, read: 44.0 MiB/s, write: 34.3 MiB/s
INFO:  76% (27.5 GiB of 36.0 GiB) in 7m 2s, read: 129.0 MiB/s, write: 35.0 MiB/s
INFO:  77% (27.9 GiB of 36.0 GiB) in 7m 6s, read: 94.0 MiB/s, write: 66.0 MiB/s
INFO:  79% (28.5 GiB of 36.0 GiB) in 7m 9s, read: 202.7 MiB/s, write: 46.7 MiB/s
INFO:  80% (28.8 GiB of 36.0 GiB) in 7m 17s, read: 39.0 MiB/s, write: 32.5 MiB/s
INFO:  81% (29.2 GiB of 36.0 GiB) in 7m 29s, read: 32.0 MiB/s, write: 31.0 MiB/s
INFO:  82% (29.6 GiB of 36.0 GiB) in 7m 40s, read: 37.8 MiB/s, write: 37.8 MiB/s
INFO:  85% (30.6 GiB of 36.0 GiB) in 7m 44s, read: 269.0 MiB/s, write: 1.0 MiB/s
INFO:  88% (31.7 GiB of 36.0 GiB) in 7m 47s, read: 372.0 MiB/s, write: 49.3 MiB/s
INFO:  89% (32.3 GiB of 36.0 GiB) in 7m 57s, read: 59.2 MiB/s, write: 28.4 MiB/s
INFO:  98% (35.3 GiB of 36.0 GiB) in 8m, read: 1.0 GiB/s, write: 0 B/s
INFO: 100% (36.0 GiB of 36.0 GiB) in 8m 1s, read: 692.0 MiB/s, write: 0 B/s
INFO: backup is sparse: 12.38 GiB (34%) total zero data
INFO: backup was done incrementally, reused 12.38 GiB (34%)
INFO: transferred 36.00 GiB in 481 seconds (76.6 MiB/s)
INFO: adding notes to backup
INFO: Finished Backup of VM 500 (00:08:03)
INFO: Backup finished at 2026-05-20 12:01:40
INFO: Backup job finished successfully
INFO: notified via target `mail-to-root`
TASK OK

So it looks like it's S3 datastore specific.
 
Here's the output:

Code:
root@pbs1:~# gdb --batch --ex 't a a bt' -p $(pidof proxmox-backup-proxy) > proxy.backtrace
warning: Missing auto-load script at offset 0 in section .debug_gdb_scripts
of file /usr/lib/x86_64-linux-gnu/proxmox-backup/proxmox-backup-proxy.
Use `info auto-load python-scripts [REGEXP]' to list them.
Recursive internal problem.
root@pbs1:~# cat proxy.backtrace
[New LWP 9580]
[New LWP 9578]
[New LWP 9577]
[New LWP 9574]
[New LWP 9572]
[New LWP 9570]
[New LWP 9569]
[New LWP 9568]
[New LWP 9567]
[New LWP 9566]
[New LWP 9565]
[New LWP 9225]
[New LWP 9224]
[New LWP 9182]
[New LWP 9181]
[New LWP 9084]
[New LWP 9066]
[New LWP 9064]
[New LWP 9063]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x000076a1ff31a7b9 in syscall () from /lib/x86_64-linux-gnu/libc.so.6

Doesn't seem very useful - am I missing something?
 
Yes, that was in the hanging state. I've attached a backtrace taken while it's running normally (without any backup jobs) - that one shows useful backtraces.
 

Please retry in the hanging state. If that still results in the gdb recursion issue, try generating a short strace output instead (note that this might contain sensitive information, so it is best shared via a DM): strace -ftt -p $(pidof proxmox-backup-proxy)
 
Retrying results in the same error ("Recursive internal problem."). Here's the strace output (doesn't appear to contain any sensitive information):

Code:
strace: Process 9781 attached with 20 threads
[pid 10112] 13:42:37.463571 futex(0x7547b800e598, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10111] 13:42:37.463649 flock(54, LOCK_EX <unfinished ...>
[pid 10110] 13:42:37.463672 futex(0x7547c4049c08, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10105] 13:42:37.463686 futex(0x7547c80060a8, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10102] 13:42:37.463699 futex(0x7547d8136818, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10099] 13:42:37.463727 futex(0x7547b80061b8, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10098] 13:42:37.463741 futex(0x7547f438e6c8, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10097] 13:42:37.463753 futex(0x7547cc002bc8, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10096] 13:42:37.463771 futex(0x7547d0004b28, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10095] 13:42:37.463785 futex(0x7547ec0045a8, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10094] 13:42:37.463797 futex(0x7547e4008148, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10091] 13:42:37.463810 futex(0x7547f01778a8, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10088] 13:42:37.463822 flock(50, LOCK_EX <unfinished ...>
[pid 10087] 13:42:37.463837 futex(0x7547f809a9e8, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 10017] 13:42:37.463857 futex(0x7547f002f2b8, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  9791] 13:42:37.463879 futex(0x7547f80824b8, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  9786] 13:42:37.463901 restart_syscall(<... resuming interrupted clock_nanosleep ...> <unfinished ...>
[pid  9784] 13:42:37.463931 futex(0x5a9de7068238, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  9783] 13:42:37.463954 futex(0x5a9de7059568, FUTEX_WAIT_BITSET_PRIVATE, 4294967295, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  9781] 13:42:37.463982 futex(0x7547fe6670f8, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  9786] 13:42:39.353680 <... restart_syscall resumed>) = 0
[pid  9786] 13:42:39.353787 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3, tv_nsec=0}, 0x7547fceed7f0) = 0
[pid  9786] 13:42:42.354096 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3, tv_nsec=0}, 0x7547fceed7f0) = 0
[pid  9786] 13:42:45.354354 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3, tv_nsec=0}, 0x7547fceed7f0) = 0
[pid  9786] 13:42:48.354589 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3, tv_nsec=0}, 0x7547fceed7f0) = 0
[pid  9786] 13:42:51.354883 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3, tv_nsec=0}, 0x7547fceed7f0) = 0
[pid  9786] 13:42:54.355193 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3, tv_nsec=0}, 0x7547fceed7f0) = 0
[pid  9786] 13:42:57.355516 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3, tv_nsec=0}, 0x7547fceed7f0) = 0
[pid  9786] 13:43:00.355800 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3, tv_nsec=0}, 0x7547fceed7f0) = 0
[pid  9786] 13:43:03.356083 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3, tv_nsec=0}, 0x7547fceed7f0) = 0
[pid  9786] 13:43:06.356358 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3, tv_nsec=0}, strace: Process 9781 detached
strace: Process 10112 detached
strace: Process 10111 detached
strace: Process 10110 detached
strace: Process 10105 detached
strace: Process 10102 detached
strace: Process 10099 detached
strace: Process 10098 detached
strace: Process 10097 detached
strace: Process 10096 detached
strace: Process 10095 detached
strace: Process 10094 detached
strace: Process 10091 detached
strace: Process 10088 detached
strace: Process 10087 detached
strace: Process 10017 detached
strace: Process 9791 detached
strace: Process 9786 detached
 <detached ...>
strace: Process 9784 detached
strace: Process 9783 detached
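
When reading a dump like the one above, it can help to tally which syscall each thread is still blocked in. This is only a rough sketch in Python, assuming the strace -ftt line format shown above; the helper names blocked_syscalls and summarize are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
# [pid 10111] 13:42:37.463649 flock(54, LOCK_EX <unfinished ...>
UNFINISHED_RE = re.compile(
    r"\[pid\s+(?P<pid>\d+)\]\s+\S+\s+(?P<syscall>\w+)\(.*<unfinished \.\.\.>"
)
# Matches lines such as:
# [pid  9786] 13:42:39.353680 <... restart_syscall resumed>) = 0
RESUMED_RE = re.compile(r"\[pid\s+(?P<pid>\d+)\]\s+\S+\s+<\.\.\. \w+ resumed>")


def blocked_syscalls(lines):
    """Return {pid: syscall} for threads whose last call never resumed."""
    blocked = {}
    for raw in lines:
        line = raw.strip()
        resumed = RESUMED_RE.match(line)
        if resumed:
            # The pending call returned, so this thread is no longer blocked.
            blocked.pop(int(resumed["pid"]), None)
            continue
        unfinished = UNFINISHED_RE.match(line)
        if unfinished:
            blocked[int(unfinished["pid"])] = unfinished["syscall"]
    return blocked


def summarize(blocked):
    """Count blocked threads per syscall name."""
    return Counter(blocked.values())
```

On the output above, this would single out the two threads waiting in flock(..., LOCK_EX) (pids 10111 and 10088) from the threads parked in futex. Whether those locks are actually related to the hang is not something the trace alone confirms.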
 
Can you try to get the strace output from the start of the hang?