"Optimal Probabilistic Cache Stampede Prevention" - 2015 [PDF]

ucsd.edu

We do a lot with caches at work, and just recently learned about this paper. The upshot is that it gives a very simple algorithm for adding jitter to your cache expiration to avoid "thundering herd" problems when a cache entry expires during a time of frequent requests for that entry.

Our previous algorithm was to pick a random number between 1 and 60 seconds (rand()) for each cache hit and refresh the cache entry that much earlier, but it turns out that the refreshes were almost always happening ni the first few seconds of that window.

The paper uses something like 1 - log(rand()) to expire very late in the window, but not so late as to risk another request invoking the cache refresh while one is already in progress.

Go check it out!