how to implement truncated exponential backoff

May 08, 2021


Exponential backoff is an algorithm that uses feedback to decrease the rate of some process. In practice it's mostly used in network applications to space out repeated retransmissions of the same block of data, in order to avoid blocking and congestion. It is often called truncated exponential backoff, because after a certain number of tries, the exponential growth stops.

Why should you care

Learning about exponential backoff is going to come in handy if you're planning on working with distributed systems. It's also quite a popular senior engineer interview question. Most of all though, it's one of those elegant ideas that is really fun to implement.

Basics of the algorithm

The exponential backoff algorithm retries requests in a… you guessed it, exponential manner. In practice this means that the time between subsequent requests grows exponentially until it hits a predefined maximum backoff time.


  1. Make a network request to some service. Let's refer to it as 'the service'.
  2. If the request fails - the service responds with a message that signals we can't get a response right now - wait 1 second before the next request.
  3. If the request fails again, wait 2 seconds before attempting the next request.
  4. If it fails again, wait 4 seconds, and so on, doubling the wait until you hit some maximum_backoff_time.
  5. Continue retrying, without increasing the wait time further, until you hit some maximum number of retries.
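The steps above can be sketched as a small retry loop. Note this is an illustrative sketch, not the article's full implementation: the names makeRequest, maxRetries, and maxBackoffTime are placeholders I've assumed for the example.

```javascript
// Illustrative sketch of the retry loop described above.
// makeRequest, maxRetries, and maxBackoffTime are hypothetical placeholders.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const retryWithBackoff = async (makeRequest, maxRetries = 5, maxBackoffTime = 32000) => {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await makeRequest();
    } catch (err) {
      // out of retries - give up and surface the last error
      if (attempt === maxRetries - 1) throw err;
      // 1s, 2s, 4s, ... capped at maxBackoffTime (the "truncated" part)
      const waitTime = Math.min(2 ** attempt * 1000, maxBackoffTime);
      await sleep(waitTime);
    }
  }
};
```

A caller would pass in any async function that throws on failure, e.g. `retryWithBackoff(() => fetch(url))`.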

Thundering herds

Since exponential backoff is supposed to limit the load on the server receiving the requests, our implementation so far would do quite well assuming there's only one caller making those requests. At cloud scale, though, there could be tens of thousands of processes making those requests, and assuming they all started at approximately the same time, we would be bombarding the server with tens of thousands of requests at once. The fact that they're spread out exponentially doesn't change anything about the crazy load we're putting the server through. For this we need to introduce the concept of jitter - a random amount of delay added to the wait between requests.

The formula

If you look closely you can probably devise the formula for how long to wait when backing off: every time we're waiting 2^n seconds and adding a random number of milliseconds. Therefore the method to derive the wait time would look somewhat like this: (2^n) + rand_ms, where n is the current retry number, and rand_ms is the random amount of jitter we discussed in the previous paragraph.
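As a quick sketch, that formula could be written like this. The 1000 ms base unit, the 0-999 ms jitter range, and the maxBackoffMs cap are assumed values for illustration:

```javascript
// Sketch of the (2^n) + rand_ms formula. The Math.min cap is what makes
// the backoff "truncated": past a point, the exponential term stops growing.
const getWaitTimeMs = (n, maxBackoffMs = 32000) => {
  const randMs = Math.floor(Math.random() * 1000); // jitter: 0-999ms
  return Math.min(2 ** n * 1000, maxBackoffMs) + randMs;
};
```

For n = 0, 1, 2, … this yields roughly 1s, 2s, 4s plus jitter, flattening out once 2^n seconds exceeds the cap.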


Here's an example of a contrived function in JavaScript that accomplishes such a task:

/**
 * This function pauses script execution by `backoffTime`, with an amount of
 * jitter added or subtracted from `backoffTime`.
 * @param {Number} backoffTime - number of milliseconds to wait
 * @param {Number} jitterAmount - suggested jitterAmount
 * @returns {undefined}
 */
const backoffBeforeNextAttempt = async (backoffTime, jitterAmount) => {
  logEvent(`backing off`)
  // if jitterAmount specified by user is too small, use a default starting point of
  // half of the backoff time
  if (backoffTime / jitterAmount < 2) jitterAmount = backoffTime / 2

  jitterAmount = Math.floor(calculateJitter(jitterAmount))

  backoffTime += jitterAmount
  backoffTime = Math.floor(backoffTime)

  logEvent(
    `${jitterAmount}ms jitter amount applied, ${backoffTime}ms total backoff time`
  )
  await sleep(backoffTime)
}


My full implementation of this can be found here. It’s a fun thing to watch happen! I’ve included a mock server implementation to introduce a fair degree of randomness while observing the behavior of this algorithm. Simply run node index.js to see exponential backoff in action.

If you want to play around with various situations, you can configure the chance of failure in mockServer.js by changing the value of failureChance and the logical test:

// mockServer.js
// truncated code ...
const callMockServer = () => {
  const responseTime = randInt(300, 3000);
  const failureChance = randInt(1, 100);       //<-- right here!

  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (failureChance > 15) {               // <-- and here!
        const howManyFailureMessages = CONFIG.REJECTION_MESSAGES.length - 1;
// truncated code ...

Further reading

In researching various methods and examples I used a few guides; here are the ones that I found the most useful:

Written by Daniel Kaczmarczyk, a software engineer and educator. You can find me on Twitter or email me at
