We chased a wave of false-positive spam quarantines on PMG 8.2.11 down to this pattern.
Sharing in case it saves someone a week, but, even better, maybe you spot a rabbit hole.
(Summary by Claude-Code: I believe using claude in this forum is not an issue.)
Symptom: legitimate mail held in spam quarantine. Headers show:
Code:
Received-SPF: temperror (...: Time-out on DNS 'TXT' lookup of '...')
and KAM_DMARC_REJECT (+7) firing. With a quarantine threshold of 5 this alone tips legit mail over.
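If you want to count how many quarantined messages actually match this pattern, a rough sketch like the one below may help. It only inspects headers of a raw message. The header name X-SPAM-LEVEL is an assumption about where the rule list ends up on your install, and is_temperror_dmarc_case is a made-up helper name, so adjust both as needed.

```python
import re
from email import message_from_string


def is_temperror_dmarc_case(raw_message: str,
                            status_header: str = "X-SPAM-LEVEL",
                            rule: str = "KAM_DMARC_REJECT") -> bool:
    """Return True if SPF ended in temperror and `rule` is listed in `status_header`.

    `status_header` is an assumption: pass whichever header carries the
    list of fired SpamAssassin rules on your system.
    """
    msg = message_from_string(raw_message)

    # Any Received-SPF header whose result token is "temperror".
    spf_temperror = any(
        value.strip().lower().startswith("temperror")
        for value in msg.get_all("Received-SPF", [])
    )

    # Whole-word match, so a longer rule name with the same prefix does not count.
    rule_pattern = re.compile(r"\b" + re.escape(rule) + r"\b")
    rule_fired = any(
        rule_pattern.search(value) is not None
        for value in msg.get_all(status_header, [])
    )
    return spf_temperror and rule_fired
```

Feed it the raw text of each quarantined mail and count the True results to get a baseline before changing anything.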
Root cause is two layers, not one:
1. Local unbound in default recursive mode — cold-cache TXT lookups 300-1500 ms, full SPF chains up to 16 s for popular senders → hits SA's spf_timeout
5s → temperror. Fixable.
2. Remote authoritative DNS that simply doesn't answer — sendgrid.net, ab.sendgrid.net, eu.mailgun.org, enotice.ieee.org, accenture.com, plus a long
tail of mass-mailer and questionable senders. Unfixable from our side.
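To see why layer 1 alone can blow through a 5 s timeout, here is a back-of-the-envelope model. It assumes the lookups in an SPF chain run one after another with the same latency, which is a simplification, and it uses the RFC 7208 cap of 10 DNS-querying terms on top of the initial TXT query. The function names are made up for illustration.

```python
SPF_LOOKUP_LIMIT = 10  # RFC 7208 section 4.6.4: max DNS-querying terms per check


def chain_seconds(per_lookup_ms: float, extra_lookups: int) -> float:
    """Time for the initial TXT query plus `extra_lookups` sequential lookups.

    Assumes every lookup costs `per_lookup_ms` and nothing runs in parallel.
    """
    if not 0 <= extra_lookups <= SPF_LOOKUP_LIMIT:
        raise ValueError("extra_lookups must be between 0 and 10")
    return (1 + extra_lookups) * per_lookup_ms / 1000.0


def exceeds_timeout(per_lookup_ms: float, extra_lookups: int,
                    spf_timeout_s: float = 5.0) -> bool:
    """True if the modelled chain takes longer than spf_timeout_s."""
    return chain_seconds(per_lookup_ms, extra_lookups) > spf_timeout_s
```

With 1500 ms per lookup and a full chain, the model lands at 16.5 s, in the same range as the worst case measured above. At 50 ms per lookup the same chain stays well under a second.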
Three layered fixes:
(1) Tune local unbound — /etc/unbound/unbound.conf.d/pmg-tuning.conf:
Code:
server:
    msg-cache-size: 64m
    rrset-cache-size: 128m
    neg-cache-size: 16m
    prefetch: yes
    prefetch-key: yes
    harden-glue: yes
    harden-dnssec-stripped: yes
    harden-referral-path: yes
    num-threads: 2
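Then systemctl reload unbound. Lookups that were slow on a cold cache drop to <50 ms after the first hit; prefetch keeps common sender records warm.
Do NOT forward unbound to 1.1.1.1 / 8.8.8.8 / 9.9.9.9: the major DNSBLs (Spamhaus zen + DBL, URIBL, SURBL, DNSWL) still return "blocked public
resolver" sentinels for shared-upstream queries, silently breaking your spam filter in both directions at once. Pure recursive is the right stance
per the PMG wiki since at least 2022.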
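For anyone who wants to check whether their resolver path is being refused, the sketch below classifies an A-record answer from a Spamhaus zone. It covers Spamhaus only; the three error codes are the ones Spamhaus documents as far as I know, so verify them against the current Spamhaus docs. The other lists use their own sentinel values, which are not modelled here. classify_spamhaus_answer is a hypothetical helper name, and it does no DNS query itself.

```python
import ipaddress

# Error codes as documented by Spamhaus (assumption: still current, please verify).
SPAMHAUS_ERROR_CODES = {
    "127.255.255.252": "typing error in the DNSBL zone name",
    "127.255.255.254": "query came through a public/open resolver",
    "127.255.255.255": "excessive number of queries",
}

LOOPBACK = ipaddress.IPv4Network("127.0.0.0/8")


def classify_spamhaus_answer(answer: str) -> str:
    """Classify one A-record answer returned by a Spamhaus DNSBL zone."""
    ip = ipaddress.IPv4Address(answer)
    reason = SPAMHAUS_ERROR_CODES.get(str(ip))
    if reason is not None:
        # The list refused to answer; this is not a real listing.
        return "error: " + reason
    if ip in LOOPBACK:
        # Any other 127.x.x.x value is treated as a genuine listing code.
        return "listing"
    return "unexpected"
```

An "error" result for every query is the silent failure mode described above: the filter sees an answer, but it carries no reputation data.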
(2) Raise spf_timeout from 5 s to 15 s — /etc/mail/spamassassin/local.cf:
Code:
spf_timeout 15
Then systemctl restart pmg-smtp-filter. The default of 5 s is too aggressive for chains with slow includes. Anything still failing at 15 s is a dead
remote authoritative server, and no larger value will help.
(3) Surgical SA meta-rule — same file, append:
Code:
meta KAM_TEMPERROR_RESCUE (KAM_DMARC_REJECT && T_SPF_TEMPERROR)
score KAM_TEMPERROR_RESCUE -4.0
describe KAM_TEMPERROR_RESCUE Rescue when KAM_DMARC_REJECT was due to SPF temperror, not a real fail
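Real spoof (SPF fail + DMARC reject) still scores +7 → quarantined. Temperror + DMARC reject scores +7 − 4 = +3 → delivered. Preserves anti-spoof
protection where it matters; compensates only where SPF couldn't actually be evaluated.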
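The arithmetic can be written out as a tiny model. This is only a sketch of the two rules discussed here, using the scores and the threshold of 5 from this post. It assumes a message is quarantined when its total is at or above the threshold, and it ignores every other rule that might fire on a real message.

```python
QUARANTINE_THRESHOLD = 5.0

# Scores taken from this post; all other SpamAssassin rules are ignored.
SCORES = {
    "KAM_DMARC_REJECT": 7.0,
    "KAM_TEMPERROR_RESCUE": -4.0,
}


def fired_rules(dmarc_reject: bool, spf_temperror: bool) -> list[str]:
    """Return the rules from SCORES that fire for this combination."""
    rules = []
    if dmarc_reject:
        rules.append("KAM_DMARC_REJECT")
    # The rescue meta-rule needs both conditions, mirroring the meta above.
    if dmarc_reject and spf_temperror:
        rules.append("KAM_TEMPERROR_RESCUE")
    return rules


def total_score(dmarc_reject: bool, spf_temperror: bool) -> float:
    return sum(SCORES[name] for name in fired_rules(dmarc_reject, spf_temperror))


def is_quarantined(dmarc_reject: bool, spf_temperror: bool,
                   threshold: float = QUARANTINE_THRESHOLD) -> bool:
    # Assumption: quarantine applies at score >= threshold.
    return total_score(dmarc_reject, spf_temperror) >= threshold
```

A real spoof gives is_quarantined(True, False), which is True at a score of 7. The temperror case gives is_quarantined(True, True), which is False at a score of 3.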
Honest observation from 90 hours of measurement: (1) is necessary but doesn't visibly reduce the temperror quarantine count on its own (the DNS layer
heals — synthetic probe p95 from ~1300 ms to ~0 ms — but the remote-DNS residual stays). (2) helps a thin slice. (3) is the actual move-the-needle
change.
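For reference, a p95 figure like the one above can be produced with something as small as the sketch below. This is not the probe used for the measurements in this post, just a minimal illustration. It uses the nearest-rank percentile method, and probe is a hypothetical callable that you would replace with a real TXT lookup against your local resolver.

```python
import math
import time
from typing import Callable, Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of values <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def time_probe_ms(probe: Callable[[], object], runs: int = 20) -> list[float]:
    """Call `probe` `runs` times and return each wall-clock duration in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        probe()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings
```

Running percentile(time_probe_ms(my_lookup), 95) before and after the unbound change gives comparable numbers, where my_lookup is whatever lookup function you plug in.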
PMG 8.2.11 on Debian 12, KAM ruleset installed. Should apply unchanged to PMG 9.