The “thundering herd” problem

In computer science, a “thundering herd” occurs when a single event triggers a large number of clients to make requests on a contended resource at the same time.

Traditionally the term described operating system threads or processes waking up in response to I/O events or released locks, but it applies just as well to distributed, client-server, and actor systems. For example:

  • Multiple processes waking up when a lock/mutex is released.
  • Clients being disconnected from a server and all trying to reconnect at the exact same time.
  • Multiple scheduled jobs all waking up at the same time (such as cron jobs triggering on the hour, or at midnight).
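
The first case above can be reproduced in a few lines. The following is a minimal sketch, not a benchmark: it assumes Python's standard `threading` module, and the names `simulate_herd`, `herd`, and `ready` are hypothetical, chosen only for illustration. Several threads park on a condition variable, one unit of a resource becomes available, and `notify_all()` wakes every waiter even though only one of them can succeed.

```python
import threading


def simulate_herd(num_waiters: int = 8) -> tuple[int, int]:
    """Release ONE unit of a resource to many waiters; return (wakeups, winners)."""
    lock = threading.Lock()
    herd = threading.Condition(lock)   # waiters park here
    ready = threading.Condition(lock)  # tells the main thread everyone is parked
    state = {"available": 0, "waiting": 0, "wakeups": 0, "winners": 0}

    def waiter() -> None:
        with herd:
            state["waiting"] += 1
            ready.notify()            # report that this thread is about to park
            herd.wait()               # sleep until the "event" fires
            state["wakeups"] += 1     # every woken thread pays for the wakeup
            if state["available"] > 0:
                state["available"] -= 1
                state["winners"] += 1  # only one thread finds the resource

    threads = [threading.Thread(target=waiter) for _ in range(num_waiters)]
    for t in threads:
        t.start()

    with lock:
        # Wait until every waiter is parked, so the wakeup count is deterministic.
        ready.wait_for(lambda: state["waiting"] == num_waiters)
        state["available"] = 1        # a single unit is released...
        herd.notify_all()             # ...but the whole herd is woken

    for t in threads:
        t.join()
    return state["wakeups"], state["winners"]


if __name__ == "__main__":
    print(simulate_herd(8))  # → (8, 1)
```

With eight waiters, all eight wake up and contend for the lock, but only one obtains the resource; the other seven wakeups are wasted work. The same pattern shows up at larger scale in the reconnect and scheduled-job cases.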