Varnish and squid, *again*

Just listened to Artur railing against squid and preaching the virtues of varnish. He quoted what most people quote: how well varnish performs when it’s serving out of *memory*.

It must be nice to have a working set that small. Until someone can show me disk-intensive varnish numbers (meaning full caches, with LRU eviction churning all the time), squid does us quite fine.

Squid patch for making “time” stats more meaningful.

Thanks to Mark, squid’s got a patch I’ve been wanting for a gazillion years: time-to-serve statistics that don’t include the client’s location.

http://www.squid-cache.org/bugs/show_bug.cgi?id=2345

Normally, squid keeps statistics that include the “time” to serve an object, whether it be a HIT, MISS, NEAR HIT, etc. The clock starts when squid has received the first headers from the client and validated them as a legit request, but it doesn’t stop until the client has received every last bit of the response.

What this means is that if your servers are in the US and your traffic follows the NY/SF pattern (peaking from around 9am to 4pm), while your overseas traffic (i.e. clients really far from your boxes) follows the inverse of that, then squid’s “time-to-serve” can look worse during your lowest traffic. Which is confusing, to say the least. 🙂

This patch keeps the stopwatch starting at the same point (when squid has received the client’s headers) but stops it when squid is preparing the response headers. This measures ONLY the time that squid had the object in its hands, hit or miss, which IMHO is a much better measure of how squid is actually performing with the hardware’s resources.
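
To make the difference concrete, here’s a toy sketch in Python of where the two stopwatches stop. This is my own illustration, not squid’s actual code; the object fetch and the client transfer are simulated with sleeps:

```python
import time

# Toy model of the two stopwatch definitions -- not squid's actual code.
# The object fetch and the client transfer are simulated with sleeps.

def fetch_object():
    time.sleep(0.01)            # squid doing its work: HIT or MISS handling
    return b"object bytes"

def serve(client_delay):
    t_start = time.monotonic()  # request headers received and validated

    obj = fetch_object()
    t_headers_ready = time.monotonic()   # patched stopwatch stops HERE

    time.sleep(client_delay)    # shipping the body; dominated by how far
                                # away (or how slow) the client is
    t_done = time.monotonic()   # original stopwatch stopped HERE

    return t_headers_ready - t_start, t_done - t_start

# A nearby client vs. a far-away one: the patched metric barely moves,
# while the old metric swings with the client's latency.
for delay in (0.005, 0.200):
    patched, original = serve(delay)
    print(f"client_delay={delay:.3f}s  patched={patched:.3f}s  original={original:.3f}s")
```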

Yay! Thanks Mark.

Caches and Eviction Policies

Caching systems are finite in size. So what happens when your cache is filled with objects?

No more objects? Game over?

Hopefully, no. Most modern caches have some form of replacement or eviction policy, which means that, based on some criteria, the cache figures out which objects to throw out the window to make room for incoming ones. I would say the replacement policy most commonly found is some derivative of the LRU method, or “Least Recently Used” policy. Squid can use it, memcached uses it, and most modern CPU caches use it.
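
For illustration, here’s a minimal LRU sketch in Python. It’s a toy built on OrderedDict, not squid’s or memcached’s actual implementation:

```python
from collections import OrderedDict

# Minimal sketch of the LRU idea. OrderedDict keeps keys in access order;
# the least recently used key sits at the front and gets evicted first.

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                      # MISS
        self.items.move_to_end(key)          # freshen: now most recently used
        return self.items[key]               # HIT

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used

cache = LRUCache(2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a")                # "a" is now the most recently used
cache.put("c", 3)             # cache full: "b" (least recently used) evicted
print(cache.get("b"))         # None
```

The whole trick is that “freshen on hit” and “evict the oldest” are both O(1) operations on the ordered structure.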

There are a lot of alternative replacement algorithms out there, but I would say that for most forward- and reverse-proxying caches they aren’t all that different, performance-wise. They can be very different efficiency-wise, though. One place where they do matter a lot is when the working set is always growing and the window of hot requests could be anywhere inside that increasing set: the cache is always full, so it spends more and more of its time evicting objects just to make room.
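
As a rough illustration of that churn, here’s a toy simulation. The numbers and the access pattern are made up: a fixed-capacity LRU, a key space that grows on every request, and a hot window that drifts along with it. Once the cache fills, nearly every miss also costs an eviction:

```python
import random
from collections import OrderedDict

# Toy model of a constantly full cache under a growing working set.
CAPACITY = 1_000
REQUESTS = 200_000
cache = OrderedDict()
hits = evictions = 0
keyspace = 2_000

for _ in range(REQUESTS):
    keyspace += 1                                     # working set keeps growing
    key = random.randint(keyspace - 2_000, keyspace)  # hot window drifts with it
    if key in cache:
        cache.move_to_end(key)                        # HIT: freshen
        hits += 1
    else:
        cache[key] = True                             # MISS: insert...
        if len(cache) > CAPACITY:
            cache.popitem(last=False)                 # ...and evict
            evictions += 1

print(f"hit ratio {hits / REQUESTS:.1%}, evictions {evictions}")
```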

Damn those users and their photos! 🙂

I’ve tried the traditional replacement policies (at least those supported by squid) and found some interesting things that, as usual, probably only apply to situations where you’re dealing with a constantly full cache (there’s a sketch of the priority math after the list):

Greedy Dual-Size Frequency (GDSF): this is designed to hold hotter and smaller objects, and leans toward getting rid of big objects that might be very cacheable but take up too much room. The idea is to hold more objects, because they’re smaller. Which is a great idea in theory, but for Flickr it didn’t work. Just because you’re keeping more objects doesn’t mean that they’ll be requested. 🙂 The number of objects went up, the hit ratio stayed the same, and load went up.

LFUDA (Least Frequently Used with Dynamic Aging): this one favors popularity, but it means that huge objects can just fill up the cache and take the space that could be used by smaller (faster-to-serve) objects. It didn’t work for us either.

LRU (Least Recently Used): this was about the best we had seen in production, and we’ve stuck with it. Although we do employ a sort of “poor man’s” GDSF: we limit the size of objects kept in memory and on disk, which means that for objects over a certain size we purposefully MISS, in order to keep the cache clear of large objects that (currently) aren’t being requested all that often.
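
For the curious, here’s a sketch of the priority keys behind GDSF and LFUDA as they’re usually written up in the literature (with the cost terms taken as 1), plus the size-cap check behind our “poor man’s” GDSF. This is a paraphrase, not squid’s code, and the cap value is made up:

```python
# Both GDSF and LFUDA keep a global "clock" that is bumped to the evicted
# object's key on every eviction, so objects that were popular long ago
# eventually age out of the cache.

def gdsf_key(clock, frequency, size):
    # Small, popular objects get high priority; big objects go first.
    return clock + frequency * (1.0 / size)

def lfuda_key(clock, frequency):
    # Popularity wins outright, regardless of object size.
    return clock + frequency

# The "poor man's" GDSF: keep plain LRU, but refuse to cache anything over
# a size cap, so huge, rarely-requested objects never crowd out small,
# fast-to-serve ones. The cap value here is made up for illustration.
MAX_CACHEABLE_BYTES = 512 * 1024

def should_cache(object_size_bytes):
    return object_size_bytes <= MAX_CACHEABLE_BYTES  # larger: deliberate MISS
```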