Varnish and squid, *again*

Just listened to Artur railing against squid and preaching the virtues of varnish. He quoted what most people quote: how varnish performs when serving out of *memory*.

It must be nice to have a working set that small. Until someone can show me numbers for disk-intensive varnish workloads (meaning full caches, with LRU eviction churning all the time), squid does us quite fine.

I like making things go! At the moment, I'm SVP of Infrastructure and Operations at Etsy, and I'm currently pursuing a Master's degree in Human Factors and Systems Safety at Lund University.

10 comments

  1. Kevin Murphy   •  

    There will always be people who will love things because they are beautiful, not because they work. Varnish is conceptually cleaner than Squid, but currently less reliable and useful for fewer workloads.

    If people love it enough for its beauty, perhaps someday it will work.

  2. Mark Nottingham   •  

    Exactly. Varnish is fast, but considering that Squid can saturate a gigE from memory with responses of around 10k each, it’s only a big deal if your response size is considerably less than that.

    I’d take Squid’s stability and features over VCL and Varnish’s lack of documentation any day. IMO they should position Varnish as a bespoke intermediary construction toolkit, not as a “HTTP cache” or even reverse proxy.

    BTW, according to the list recently, Varnish buffers responses from the origin until they’re complete; it doesn’t stream them to the client. Ouch.
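
A back-of-the-envelope check of the gigE point above, as a minimal sketch. It assumes "10k" means roughly 10,000 bytes per response, treats the link as a flat 1 Gbit/s, and ignores HTTP header and TCP/IP overhead, so the results are rough upper bounds rather than measurements. The function name is made up for illustration.

```python
# Rough estimate: how many responses per second does it take to fill a link?
# Assumptions (not from the thread): 1 Gbit/s of usable bandwidth, and no
# protocol overhead. Real numbers will be somewhat lower.

GIGE_BITS_PER_SEC = 1_000_000_000


def responses_per_sec_to_saturate(response_bytes, link_bits_per_sec=GIGE_BITS_PER_SEC):
    """Return the response rate at which the link is full."""
    link_bytes_per_sec = link_bits_per_sec / 8
    return link_bytes_per_sec / response_bytes


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        rate = responses_per_sec_to_saturate(size)
        print(f"{size:>7} byte responses: about {rate:,.0f} responses/sec to fill gigE")
```

The smaller the response, the higher the request rate needed before the network is the bottleneck, which is why raw proxy speed matters mostly for small objects.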

  3. Bryan Migliorisi   •  

    I am considering Varnish to front our servers for a site that serves several million pages a day. I’ve been doing some tests and just started blogging about my findings.

    All in all, Varnish is simply faster. There still isn’t a ton of documentation, but VCL is so simple to set up that you don’t NEED a ton of documentation. You are given a few functions and variables and you do what you want with them.

    Varnish 2.x is a large improvement over the older versions and in my tests, it blows Squid out of the water.

    @Mark – Varnish both streams as well as buffers, depending on what you want it to do.

    Part one of my performance comparison can be found here: http://deserialized.com/reverse-proxy-performance-varnish-vs-squid-part-1/

    This first series of tests simply tests the throughput of the two applications.

  4. allspaw   •     Author

    Great test results. For me, varnish is still a non-starter for a couple of reasons, mostly fault tolerance issues.

    If varnish is using multiple spindles, it will crash upon losing a drive. Squid will simply detect the bad cache dir and continue on without it, using the remaining drives. With the number of caching machines we have, our fleet-wide MTBF shrinks, so that’s not good.

    The other reason is that because we have such a large working set, it’s non-ideal to lose an entire machine’s cache when it crashes or reboots, which is what happens with varnish.

    Please correct any of the above information about varnish if I’m mistaken. I’m actually more looking forward to using Yahoo Traffic Server once there are some minor changes made to it to suit our needs. :)

    Bryan: any chance you can test those fault conditions (disk loss, reboot) in your next round of tests? :)
    I’d love to see if those are fixed.
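
The MTBF point can be made concrete with a rough calculation. This is a sketch only: it assumes drives fail independently at a constant rate, and the drive MTBF, drives per machine, and machine count below are hypothetical values chosen for illustration, not numbers from the post. The cache-loss function simply encodes the two behaviours as described in this thread, which the author asks readers to correct if wrong.

```python
# Fleet-level view of drive failures.
# Assumption: independent drives with a constant failure rate, so the mean
# time between drive failures anywhere in the fleet is the per-drive MTBF
# divided by the total number of drives.

def fleet_mtbf_hours(drive_mtbf_hours, drives_per_machine, machines):
    """Mean time between drive failures across the whole fleet."""
    total_drives = drives_per_machine * machines
    return drive_mtbf_hours / total_drives


def cache_fraction_lost(drives_per_machine, survives_drive_loss):
    """Fraction of one machine's cache lost when a single drive dies.

    survives_drive_loss=True models the behaviour attributed to Squid above
    (drop the bad cache dir, keep the rest); False models the behaviour
    attributed to Varnish (the process crashes and the whole cache is gone).
    """
    if survives_drive_loss:
        return 1 / drives_per_machine
    return 1.0


if __name__ == "__main__":
    # Hypothetical fleet: 50 machines, 4 cache drives each, 1,000,000 h drive MTBF.
    hours = fleet_mtbf_hours(1_000_000, drives_per_machine=4, machines=50)
    print(f"A drive fails somewhere in the fleet about every {hours / 24:.0f} days")
    print("Cache lost per failure, survives drive loss:", cache_fraction_lost(4, True))
    print("Cache lost per failure, crashes on drive loss:", cache_fraction_lost(4, False))
```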

  5. Andy   •  

    allspaw,

    Can you explain why Varnish is less fault-tolerant than Squid? Do you find Varnish to be less stable than Squid in general? Thanks.

  6. allspaw   •     Author

    Andy: what I mean is…(please correct if I’m wrong)

    - if Varnish crashes, or the machine is rebooted, it will lose all of what was in its cache and will have to start with an empty cache again
    - if using multiple drives (non-RAID) for cache, if one disk dies, Varnish will crash and will lose all its cache

    Squid can withstand both of those failure scenarios.

  7. Pingback: Durability, Scalability, Availability ·

  8. syadnom   •  

    @allspaw, true enough that varnish will lose its cache on a reboot because everything is in virtual memory while squid keeps a lot on disk.

    As far as Varnish losing its cache if a disk fails, this is the operating system’s job to handle. If you are running a high performance cache without RAID then the performance difference between varnish and squid is really not any concern of yours. With both squid and varnish, disk performance and RAM size are key to performance. The OS needs to handle RAID, and if there is a catastrophic failure then the system is very likely to be non-functional anyway, so the point is moot.

    The idea behind varnish is that you should leave kernel and OS processes to the kernel and OS. Don’t manage memory because the kernel should do that; don’t manage RAID because the kernel and OS should do that. Is varnish perfect? No. It would be nice for the cache to survive a reboot, or rather a hard reboot. You could write that cache from RAM to disk and then reload it if you do a graceful shutdown.

    I like both myself. Varnish has some very nice modern features, like VCL for instance, as well as cleaner logs etc. I like squid because it is a very good forward AND reverse proxy. I can also cache to disk for longer term caching on squid. Varnish can’t do this.

  9. Pingback: ehcache.net

  10. unkn0wn   •  

    The non-RAID scenario is not realistic. If you have a site with 100,000+ visitors and you have no money for a RAID controller à la Adaptec (~$100), it’s just fantastic. Also, squid (2.7 in my case) has poor cache integrity. For example, I had a 30 GB cache; after two power failures squid started to rebuild it. That took ~3 hours, and during that time all work in my company was paralyzed. The fix was flushing the cache, because squid was taking 100% CPU with a load average of ~4.
    So I think varnish is more reliable: the cache is stored in RAM (low cost for data access), and RAM is not that expensive, so you can buy 4 GB of RAM for $100 and have your whole site in memory in ~15 min. Squid is heavier: a disk cache (50 ms for disk vs 50 µs for RAM) and a non-procedural config file (yes, you can tune the cache algorithm, but you can’t define different reactions for different situations the way you can in varnish). Squid is good value for large sites like Microsoft because of the great amount of data; if your site’s cache is less than 4 GB, varnish is more suitable.
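
The 50 ms vs 50 µs figures quoted above tie back to the working-set argument in the post, and a quick weighted average shows why. This is a minimal sketch: the two latencies are the commenter's round numbers, the hit ratios are hypothetical, and the model ignores queueing, network time, and everything else a real cache does.

```python
# Mean object access time when some fraction of requests is served from RAM
# and the rest has to go to disk. Latencies are the round figures from the
# comment above (50 us for RAM, 50 ms for disk), not measurements.

RAM_ACCESS_US = 50
DISK_ACCESS_US = 50_000


def mean_access_us(ram_hit_ratio, ram_us=RAM_ACCESS_US, disk_us=DISK_ACCESS_US):
    """Weighted average access time in microseconds."""
    if not 0.0 <= ram_hit_ratio <= 1.0:
        raise ValueError("ram_hit_ratio must be between 0 and 1")
    return ram_hit_ratio * ram_us + (1.0 - ram_hit_ratio) * disk_us


if __name__ == "__main__":
    # Hypothetical hit ratios: everything in RAM, then progressively less.
    for ratio in (1.0, 0.99, 0.9, 0.5):
        print(f"RAM hit ratio {ratio:.2f}: mean access about {mean_access_us(ratio):,.0f} us")
```

Even a small share of disk-bound requests dominates the average, so memory-only benchmarks say little about a cache whose working set does not fit in RAM.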
