A place for programmers of all languages to come and chat, show off what they've done, and general shop talk.
"Optimal Probabilistic Cache Stampede Prevention" - 2015 [PDF]
We do a lot with caches at work, and just recently learned about this paper. The upshot is that it gives a very simple algorithm for adding jitter to your cache expiration to avoid "thundering herd" problems when a cache entry expires during a time of frequent requests for that entry.
Our previous algorithm was to pick a random number between 1 and 60 seconds (rand()) for each cache hit and refresh the cache entry that much earlier, but it turns out that the refreshes were almost always happening ni the first few seconds of that window.
The paper uses something like 1 - log(rand()) to expire very late in the window, but not so late as to risk another request invoking the cache refresh while one is already in progress.
Go check it out!




Note that the algorithm presented in the paper (see top of the 6th page, p.890) attempts to be adaptive to the time required to refresh a cache entry. Our system has highly variable times for refresh, so we just picked a constant near-worst-case value. That means that most of our refreshes happen a couple seconds earlier than they need to, but with a much higher safety margin.