Auto-retry vs duplicate charge risk on payment timeouts

When our gateway throws intermittent 502/504s, we have two options. We can auto-retry 2-3 times, which recovers about 15% of failed collects, though we've seen the occasional duplicate charge when a merchant ignores idempotency keys. Or we can stop after one failure and kick the payment to manual review, which means slower recovery and more client support touchpoints. For those running high-volume cycles, where do you draw the line, and are you tagging attempts with a request fingerprint or relying on gateway idempotency alone?

We ended up capping auto-retry at 2 with jittered backoff (<60s) on 502/504, and only if we didn't get a gateway txn_id back; otherwise it goes straight to manual review. That kept recovery at about 15%, like yours. We squash dupes by hashing amount+merchant+invoice and auto-voiding any second settle that lands within 5 min. If a merchant ignores idempotency keys twice, we hard-fail and notify. Harsh, but our support pings fell off.
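
For anyone who wants that policy spelled out, here is a rough Python sketch. It is not our production code, and every name in it (GatewayResponse, collect, DuplicateSettleDetector, the 5s base delay, the full-jitter formula) is made up for illustration. It also assumes "cap at 2" means one initial attempt plus up to two retries, and that "<60s" is a ceiling on each individual delay.

```python
import hashlib
import random
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional

RETRYABLE_STATUSES = {502, 504}
MAX_RETRIES = 2          # assumption: retries on top of the first attempt
MAX_BACKOFF_S = 60.0     # assumption: ceiling per delay
DUPE_WINDOW_S = 300.0    # 5 minutes


@dataclass
class GatewayResponse:
    """Hypothetical response shape; real gateways will differ."""
    status: int
    txn_id: Optional[str] = None


def fingerprint(amount_cents: int, merchant_id: str, invoice_id: str) -> str:
    """Hash amount+merchant+invoice into a stable request fingerprint."""
    raw = f"{amount_cents}|{merchant_id}|{invoice_id}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def backoff_delay(attempt: int, base_s: float = 5.0) -> float:
    """Jittered exponential backoff, never longer than MAX_BACKOFF_S."""
    return random.uniform(0.0, min(MAX_BACKOFF_S, base_s * 2 ** attempt))


def collect(send: Callable[[], GatewayResponse],
            sleep: Callable[[float], None] = time.sleep) -> str:
    """Return "collected" or "manual_review"."""
    for attempt in range(MAX_RETRIES + 1):
        resp = send()
        if 200 <= resp.status < 300:
            return "collected"
        # Retry only on 502/504 and only when the gateway gave us no txn_id;
        # a txn_id means the charge may exist, so a human should look at it.
        if resp.status not in RETRYABLE_STATUSES or resp.txn_id is not None:
            return "manual_review"
        if attempt < MAX_RETRIES:
            sleep(backoff_delay(attempt))
    return "manual_review"


class DuplicateSettleDetector:
    """Flags a second settle with the same fingerprint inside the window."""

    def __init__(self, window_s: float = DUPE_WINDOW_S) -> None:
        self.window_s = window_s
        self._first_seen: Dict[str, float] = {}

    def should_void(self, fp: str, settled_at: float) -> bool:
        first = self._first_seen.get(fp)
        if first is not None and settled_at - first <= self.window_s:
            return True  # duplicate inside the window: void this settle
        self._first_seen[fp] = settled_at
        return False
```

In practice send would wrap the gateway call, and should_void would run from whatever receives settlement events, with the actual void going through the gateway's own API. The in-memory dict is only there to keep the sketch short; a shared store with a TTL would be the obvious swap for anything running on more than one worker.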
