S3 GC Phase 2: nsswitch/passwd overhead

mehmetali

May 20, 2026
Hi everyone,

I wanted to share a significant performance bottleneck I identified and resolved during Garbage Collection Phase 2 on a PBS setup using XFS on an NVMe drive (r_await = 0.09, w_await = 0.02 while I was tracing the processes).

nsswitch.conf and /etc/passwd spamming: GC Phase 2 aggressively queries /etc/nsswitch.conf and /etc/passwd for user/group verification, thousands of times per second.
Installing nscd moved these repeated /etc/passwd lookups into an in-memory cache, which stopped the loop of re-reading the file. Read IOPS dropped from 4.1k to 2.9k.

Here is a strace log:
Code:
0.000033 openat(AT_FDCWD, "/run/proxmox-backup/locks/CephS3/.chunks/539b/539b028acc7...", O_RDWR|O_APPEND|O_CLOEXEC) = 21 <0.000015>
0.000042 flock(21, LOCK_EX|LOCK_NB) = 0 <0.000011>
0.000034 fstat(21, {st_mode=S_IFREG|0660, st_size=0, ...}) = 0 <0.000011>
0.000035 newfstatat(AT_FDCWD, "/run/proxmox-backup/locks/CephS3/.chunks/539b/539b028acc7...", {st_mode=S_IFREG|0660, st_size=0, ...}, 0) = 0 <0.000013>
0.000044 statx(AT_FDCWD, "/mnt/s3cache/datastore/.chunks/539b/539b028acc7...", AT_STATX_SYNC_AS_STAT, STATX_ALL, ...) = 0 <0.000169>
0.000214 close(21)                 = 0 <0.000012>
0.000037 mkdir("/run/proxmox-backup/locks/CephS3/.chunks/539b", 0777) = -1 EEXIST (File exists) <0.000013>
0.000040 statx(AT_FDCWD, "/run/proxmox-backup/locks/CephS3/.chunks/539b", AT_STATX_SYNC_AS_STAT, STATX_ALL, ...) = 0 <0.000012>
0.000042 newfstatat(AT_FDCWD, "/etc/nsswitch.conf", {st_mode=S_IFREG|0644, st_size=526, ...}, 0) = 0 <0.000012>
0.000039 openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 21 <0.000013>
0.000038 fstat(21, {st_mode=S_IFREG|0644, st_size=1457, ...}) = 0 <0.000012>
0.000040 lseek(21, 0, SEEK_SET)    = 0 <0.000011>
0.000032 read(21, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1457 <0.000012>
0.000040 close(21)                 = 0 <0.000011>
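For anyone who wants to quantify this in their own trace, here is a minimal sketch that tallies syscalls per path. It assumes strace output in the same format as above (relative timestamp first, duration in angle brackets last). The names `tally_paths`, `LINE_RE` and `PATH_RE` are made up for this example. It only attributes calls that carry a path argument, so the fstat/lseek/read/close calls on the /etc/passwd file descriptor are not counted towards /etc/passwd and the real overhead is higher than what it reports.

```python
import re
from collections import Counter

# Matches lines such as:
#   0.000039 openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 21 <0.000013>
LINE_RE = re.compile(
    r'^\s*\d+\.\d+\s+(?P<call>\w+)\((?P<args>.*)\)\s+=\s+.*<(?P<dur>\d+\.\d+)>\s*$'
)
# First quoted argument that looks like an absolute path.
PATH_RE = re.compile(r'"(/[^"]*)"')


def tally_paths(lines):
    """Return (call count per path, seconds spent per path)."""
    counts = Counter()
    seconds = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a completed syscall line in the expected format
        path = PATH_RE.search(match.group("args"))
        key = path.group(1) if path else "(no path argument)"
        counts[key] += 1
        seconds[key] += float(match.group("dur"))
    return counts, seconds


# Two lines copied from the log above, used as sample input.
SAMPLE = [
    '0.000042 newfstatat(AT_FDCWD, "/etc/nsswitch.conf", {st_mode=S_IFREG|0644, st_size=526, ...}, 0) = 0 <0.000012>',
    '0.000039 openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 21 <0.000013>',
]

sample_counts, sample_seconds = tally_paths(SAMPLE)
for path, count in sample_counts.most_common():
    print(f"{count:6d}  {sample_seconds[path]:.6f}s  {path}")
```

Feeding it the full trace instead of `SAMPLE` (for example, the lines of a saved strace output file) should show how often /etc/nsswitch.conf and /etc/passwd come up relative to the chunk paths.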
 
Hi,
thanks for the report. At first glance, this is caused by the user and group lookups associated with the file locks used for exclusive chunk access. It most likely makes sense to add caching for these. Please open an enhancement request at https://bugzilla.proxmox.com referencing this thread, so that this is tracked properly and not lost in the forum.
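To illustrate the general idea of such caching, here is a rough sketch, not PBS code (PBS is written in Rust, and Python is used here only to keep the example short). The class `CachedIdLookup` and the function `resolve_via_nss` are hypothetical names, and the user name `backup` in the usage lines is an assumption about which account owns the lock files. The point is simply that the name-to-uid/gid resolution happens once per process instead of once per chunk lock.

```python
import pwd
from typing import Callable, Dict, Tuple


def resolve_via_nss(username: str) -> Tuple[int, int]:
    """Uncached lookup through the system's user database (pwd.getpwnam)."""
    entry = pwd.getpwnam(username)
    return entry.pw_uid, entry.pw_gid


class CachedIdLookup:
    """Remembers (uid, gid) per user name after the first lookup (hypothetical helper)."""

    def __init__(self, resolver: Callable[[str], Tuple[int, int]] = resolve_via_nss):
        self._resolver = resolver
        self._cache: Dict[str, Tuple[int, int]] = {}
        self.misses = 0  # number of lookups that actually reached the resolver

    def get(self, username: str) -> Tuple[int, int]:
        if username not in self._cache:
            self.misses += 1
            self._cache[username] = self._resolver(username)
        return self._cache[username]


# Usage sketch: one real lookup, then every further call is a dictionary hit.
# lookup = CachedIdLookup()
# for _ in range(100_000):
#     uid, gid = lookup.get("backup")  # "backup" is an assumed account name
```

The trade-off is staleness: a cached entry will not notice changes to /etc/passwd while the process is running, so a real implementation would have to decide whether that is acceptable or whether the cache needs an expiry.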
 