Exponential backoff
In networked and distributed systems, remote calls are known to fail from time to time for any number of reasons. Clients often deal with such Transient Failures by implementing a retry-strategy, where failed calls are simply retried until they either succeed, or error too many times and the system eventually gives up.
In a naive implementation where retries are simply performed in a fixed interval (say, each 1 second apart) this is likely to result in a Thundering Herd situation.
Exponential backoff is a well-known retry strategy in control theory and computer science which aims to make such retries more efficient for the system as a whole by doubling the time interval between consecutive retries.
The retry delay is usually capped at some maximum. For example, once the retry delay hits 1 minute, the next rety might again be 1 minute instead of doubling to 2 minutes. Formally, this is called truncated exponential backoff.
For example, if a network call fails, it is retried 1 second later. If it then fails again, the system waits 2 seconds before retrying, and so on.
sequenceDiagram
HR assistant->>HR system: Export employee records
HR system->>Database: Query records
Database-->>HR system: Error: overloaded
Note over HR system, Database: Sleep 1 second (+jitter)
HR system-->>Database: Query records
Database-->>HR system: Error: connection refused
Note over HR system, Database: Sleep 2 seconds (+jitter)
HR system-->>Database: Query records
Database-->>HR system: Error: system starting
Note over HR system, Database: Sleep 4 seconds (+jitter)
HR system-->>Database: Query records
Database->>HR system: Return results
HR system->>HR assistant: Download export
Jitter
Robust, mature implementations also introduce a slight, randomized delay (called jitter) alongside each retry. This aims to spread out different clients so that they are not all retrying requests at the exact same moment, but gradually become equally distributed over time.
An good explanation of why this is important is available on the AWS Architecture blog in Exponential Backoff And Jitter.