Hi,
I have a setup that has been working for a couple of months: a secondary PBS server that syncs my main PBS server every day. I'm running it on what I had on hand, so it's a typical HP server, and the data is stored on an NFS datastore on a Synology NAS. The datastore holds around 15 TB of data. Garbage collection is notoriously slow (4 to 5 days), but I'm OK with that, since it's a secondary after all. The server is now running PBS 3.3 (it was previously running PBS 2.4, but the update didn't solve the issue).
My last garbage collection, started in early January, began fine but then got stuck at 7%. I've restarted it multiple times, and it always gets stuck at the same percentage. However, when I look with lsof, it's not always stuck on the same backup.
For example, this is one of the outputs of lsof:
Code:
# lsof /nas1-backups/
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
proxmox-b 1058 backup mem REG 0,47 298576 551092 /nas1-backups/ns/clustan/host/pve1/2022-08-28T03:00:01Z/root.pxar.didx (10.1.9.11:/volume1/pbs-backups)
proxmox-b 1058 backup 18r REG 0,47 298576 551092 /nas1-backups/ns/clustan/host/pve1/2022-08-28T03:00:01Z/root.pxar.didx (10.1.9.11:/volume1/pbs-backups)
proxmox-b 1058 backup 19uW REG 0,47 0 260 /nas1-backups/.lock (10.1.9.11:/volume1/pbs-backups)
This file doesn't seem to have an issue (I can copy it, display it...), and every time the job gets stuck, it's stuck on a different file. It's always in the clustan namespace (but I think this is due to how my namespaces are named and how PBS traverses its backups), but not always on the same machine, and when it is the same machine, it's a different backup (taken at a different time).
Here's how my NFS is mounted:
Code:
# mount -l|grep nfs
10.1.9.11:/volume1/pbs-backups on /nas1-backups type nfs4 (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.1.9.202,local_lock=none,addr=10.1.9.11)
And the exports from my Synology:
Code:
# cat /etc/exports
/volume1/pbs-backups 10.1.9.202/32(rw,async,no_wdelay,all_squash,insecure_locks,sec=sys,anonuid=34,anongid=34)
(ID 34 is the user ID of PBS's backup user on my PBS.)
Any idea on how to solve this?
Cheers,