My FIFO/LRU caching solution.

First, read Isaac Dealey’s A New Model for Caching in ColdFusion.

Then, read Ben Nadel’s Experimenting With Flat-File ColdFusion CFML Caching.

What I’ve done in the past is a combination of FIFO and LRU rolled into a CFC that other CFCs extend. I won’t go into the code because it is exceedingly dull, but here’s the explanation.

You need three main data points:

  • DataFromKey: A structure pointing from the key to the data. This is a traditional cache.
  • FetchTimes: A 2-column array, where the first column is the key and the second column is the time of that fetch. Really, it’s just an in-memory, append-only log of fetch times. Both cache hits and cache misses are logged.
  • LastFetchFromKey: A structure containing the last fetch time for each key. Think of it as a flattened version of the previous array: one entry per key, holding only the most recent fetch time.
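
For example, after fetching keys a, b, and then a again, the three structures would hold something like this (an informal sketch; t1 through t3 are the three fetch times):

    dataFromKey      = { a: <data for a>, b: <data for b> }
    fetchTimes       = [ [ "a", t1 ], [ "b", t2 ], [ "a", t3 ] ]
    lastFetchFromKey = { a: t3, b: t2 }

Note that the log still holds the stale [ "a", t1 ] row; nothing cleans it up until a purge cycle needs the space.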

Seems like overkill, right? Here’s how it works, in general:

newCache(maxSize, purgeSize)

  • maxSize: The maximum number of elements you want to cache before you start a purge cycle.
  • purgeSize: The number of elements to purge when you start a purge cycle.

Controlling the purge size is important. Purging is typically an intensive operation, so you don’t want to get into an add-then-purge-then-add-then-purge situation. Instead, set your purge size to be up to 5% of your cache size. If you have a 500-item cache, then each purge cycle would drop 25 elements at a time.
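
As a sketch, here is what newCache() might set up, written as a CFC init() in script-style CFML. The three structure names come from the list above; everything else is illustrative:

    public any function init(required numeric maxSize, required numeric purgeSize) {
        variables.maxSize   = arguments.maxSize;   // e.g. 500
        variables.purgeSize = arguments.purgeSize; // e.g. 25, which is 5% of 500
        variables.dataFromKey      = {}; // key -> cached data
        variables.fetchTimes       = []; // append-only log of [ key, fetchTime ] rows
        variables.lastFetchFromKey = {}; // key -> most recent fetch time
        return this;
    }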

get(key)

  1. Set LastFetchFromKey[key] = now().
  2. Append [key, now()] onto the FetchTimes log.
  3. Check DataFromKey for a cache hit. On a hit, return the data. On a miss, get the data to be cached and keep going.
  4. If you need to free up space, call purge().
  5. Fill DataFromKey with the data.
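
Here is a sketch of get() following those five steps, again in script-style CFML. fetchData() is a hypothetical loader for whatever it is you’re caching:

    public any function get(required string key) {
        var fetchedAt = now();
        // Steps 1 and 2: log the fetch, whether it turns out to be a hit or a miss
        variables.lastFetchFromKey[arguments.key] = fetchedAt;
        arrayAppend(variables.fetchTimes, [arguments.key, fetchedAt]);
        // Step 3: on a cache hit, return immediately
        if (structKeyExists(variables.dataFromKey, arguments.key)) {
            return variables.dataFromKey[arguments.key];
        }
        var data = fetchData(arguments.key); // hypothetical: load the real data
        // Step 4: free up space if the cache is at capacity
        if (structCount(variables.dataFromKey) >= variables.maxSize) {
            purge();
        }
        // Step 5: fill the cache and return
        variables.dataFromKey[arguments.key] = data;
        return data;
    }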

purge()

  1. Loop, deleting FetchTimes[1], until LastFetchFromKey[FetchTimes[1][1]] equals FetchTimes[1][2]. This removes stale entries from the head of your log: rows for keys that have been fetched again more recently than that row records.
  2. The first key in your FetchTimes log can now be purged. Remove it from there and from the other two structures.
  3. Repeat from step 1 until you have freed the desired amount of space.
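
And a sketch of purge(). dateCompare() stands in for “equals” (it compares to the second, close enough for a sketch), and a missing LastFetchFromKey entry is treated as stale just to be defensive:

    public void function purge() {
        var purged = 0;
        while (purged < variables.purgeSize && arrayLen(variables.fetchTimes) > 0) {
            var head = variables.fetchTimes[1]; // head[1] = key, head[2] = logged fetch time
            // Step 1: drop stale head rows -- the key has been fetched again since
            if (!structKeyExists(variables.lastFetchFromKey, head[1])
                    || dateCompare(variables.lastFetchFromKey[head[1]], head[2]) != 0) {
                arrayDeleteAt(variables.fetchTimes, 1);
                continue;
            }
            // Step 2: the head is now the least-recently-used key; evict it everywhere
            structDelete(variables.dataFromKey, head[1]);
            structDelete(variables.lastFetchFromKey, head[1]);
            arrayDeleteAt(variables.fetchTimes, 1);
            purged++; // Step 3: repeat until purgeSize slots have been freed
        }
    }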

Basically, you are being really lazy with your log array, only cleaning it up when you also need to free up space. If you have low cache churn, you may also want to clean up the log array whenever it hits a certain size.

Essentially, it’s a low-rent version of Isaac’s query-based cache.

This model uses more memory than a traditional cache, probably roughly as much as the query-based cache, but I’ve found it’s extremely fast. Better yet, if you allow the cache to be purged asynchronously (such as on a scheduled job), then you remove the slowest part of the solution from user-facing page loads. Everything else is just structure lookups.

This is all fine and good, but what about an optimization where you want certain keys to be pinned in the cache? That is, you know that they’ll probably never drop out of the cache, so you want to optimize for that case.

Easy: add a PinnedKeys structure. In your get() function, if the key is in the PinnedKeys structure, then don’t log it. The log controls what drops out of the cache, so if it’s not logged then it will never get dropped.
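
In the sketch above, that works out to one more structure in init() and a guard around the logging in get(); pin() is a hypothetical helper for marking keys:

    // In init(), add: variables.pinnedKeys = {};
    public void function pin(required string key) {
        variables.pinnedKeys[arguments.key] = true;
    }

    // In get(), wrap steps 1 and 2 so pinned keys never enter the log:
    if (!structKeyExists(variables.pinnedKeys, arguments.key)) {
        variables.lastFetchFromKey[arguments.key] = fetchedAt;
        arrayAppend(variables.fetchTimes, [arguments.key, fetchedAt]);
    }

One caveat in this sketch: pin a key before its first get(), or an older log row for it can still evict it once before it settles in for good.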
