Advanced Squid Caching in Scribd: Hardware + Software Used
After the previous post in this caching series I received many questions about the hardware and software configuration of our servers, so in this post I’ll describe our server configs and the motivation behind them.
Hardware Configuration
Since Squid in our setup uses a one-process model (with asynchronous request processing), there was no point in ordering CPUs with many cores for our boxes. And since we have lots of pages on the site and the cache is pretty huge, all the servers ended up being heavily I/O bound. Considering these facts, we decided to use the following hardware specs for the servers:
CPU: one pretty cheap dual-core Intel Xeon 5148 (no need for more cores or really high frequencies: even these CPUs have ~1% average load)
RAM: 8 GB (basically to reduce I/O pressure by caching hot content in RAM)
Disks: 4 x small 15k RPM SAS drives in JBOD mode (no RAID: we tried all kinds of RAID configs and none of them helped with I/O performance)
So, once again: nothing is as important in a squid box as I/O throughput.
[Figure: sample CPU load graph from one of the boxes]
Software Configuration
This could be a long story, but in a few words, our experience with different squid versions was as follows.
First, when I started working on this caching project, I just installed squid using Debian’s apt-get install squid command. As a result we got an ancient squid 2.6 release that for some reason (still unclear to me) was painfully slow at I/O operations and leaked file descriptors, so after a few hours under production load the box would simply stop processing requests.
When the first approach failed, I decided to go to the squid web site, download the latest production release and install it from source (yes, we do this all the time when the OS vendor ships releases that are too old or buggy). The result: a freaking fast and stable squid 3.0 which worked flawlessly for about 5 months.
A few months ago we found out about the stale-* extensions available in squid 2.7, and I started wondering if we should switch our perfectly stable 3.0 setup to 2.7. Some time later I decided to use the Vary HTTP header in our caching architecture, and then I found out that vary-caching is correctly implemented only in 2.7: since, as I understood it, 3.0 is a complete rewrite of the 2.X branch (see the update below), vary-caching is not yet implemented there (or not in the way we’d want it to be).
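For readers who have not met these features: stale-while-revalidate and stale-if-error are Cache-Control extensions (later described in RFC 5861), and Vary tells a cache which request headers are part of the cache key. Below is a minimal Python sketch of the response headers a backend could send to a cache that supports them. The function name and all the numbers are assumptions made up for illustration, not our production values.

```python
def cache_headers(max_age, stale_while_revalidate, stale_if_error, vary=()):
    """Build response headers that use the stale-* Cache-Control extensions.

    All arguments are lifetimes in seconds; `vary` lists the request headers
    that the cache should include in its cache key.
    """
    directives = [
        f"max-age={max_age}",
        # Serve the stale copy for this long while refetching in the background.
        f"stale-while-revalidate={stale_while_revalidate}",
        # Serve the stale copy for this long if the backend returns an error.
        f"stale-if-error={stale_if_error}",
    ]
    headers = {"Cache-Control": ", ".join(directives)}
    if vary:
        headers["Vary"] = ", ".join(vary)
    return headers


# Illustrative values only (hypothetical, not taken from a real setup).
headers = cache_headers(600, 30, 86400, vary=("Accept-Encoding",))
for name, value in headers.items():
    print(f"{name}: {value}")
```

With headers like these, a cache that honors the extensions can keep answering from a slightly stale object while it refreshes it, or while the backend is failing, instead of making the user wait.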
So, the final result: at the moment we’re using a custom-built Squid 2.7.STABLE6 and are really happy with it. It is a stable, fast and feature-rich caching proxy server.
Caching Cluster Configuration
Obviously, we have more than one squid server at Scribd, and this makes them a bit harder to use (compared to a single box, where you’d send all requests to one IP:port pair). We tried round-robin balancing across the squid boxes plus ICP-based neighbor checks, but this added latency to our responses, so we decided to put an haproxy load balancer between nginx and the squid farm and set up URL-hash-based balancing to distribute requests evenly among the squid backends.
This scheme worked pretty well, but we had one serious problem with it: if a squid box went down, haproxy would quickly detect the problem and remove the box from the pool. And here comes the problem: removing a server from the pool completely changes the hash key space, so most URLs get mapped to a different backend and the content already cached for them is effectively lost. To solve this, we developed an nginx balancer module that performs consistent hashing of URLs, and we’re now testing this module in production. What is really good about this module is that it removes one hop from the chain of HTTP proxies between the site and the user.
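To illustrate the difference, here is a small Python sketch that compares plain "hash modulo number of servers" balancing with a consistent-hash ring when one backend drops out. This is not the code of our nginx module, just a generic textbook-style ring; the server names, URLs, the md5-based hash and the number of virtual nodes are all assumptions chosen for the example.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a stable 64-bit integer (md5 is used for spreading, not security)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def modulo_pick(url, servers):
    """Naive scheme: hash the URL and take it modulo the number of live servers."""
    return servers[_hash(url) % len(servers)]


class HashRing:
    """Consistent-hash ring with several virtual nodes per server."""

    def __init__(self, servers, replicas=200):
        # Each server is placed on the ring `replicas` times to even out the load.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def pick(self, url):
        # First ring point clockwise from the URL's hash, wrapping around at the end.
        idx = bisect.bisect(self._points, _hash(url)) % len(self._points)
        return self._ring[idx][1]


def moved_fraction(pick_before, pick_after, urls):
    """Fraction of URLs whose backend changes between two pool states."""
    moved = sum(1 for u in urls if pick_before(u) != pick_after(u))
    return moved / len(urls)


servers = ["squid1", "squid2", "squid3", "squid4"]  # hypothetical backend names
survivors = servers[:-1]                             # squid4 goes down
urls = [f"/doc/{i}" for i in range(10000)]          # hypothetical URLs

modulo_moved = moved_fraction(
    lambda u: modulo_pick(u, servers),
    lambda u: modulo_pick(u, survivors),
    urls,
)
ring_before, ring_after = HashRing(servers), HashRing(survivors)
ring_moved = moved_fraction(ring_before.pick, ring_after.pick, urls)

print(f"modulo hashing: {modulo_moved:.0%} of URLs moved")
print(f"consistent hashing: {ring_moved:.0%} of URLs moved")
```

With four servers, the modulo scheme should remap roughly three quarters of the URLs when one box disappears, while the ring only remaps the URLs that lived on the lost box (about a quarter), so the caches on the surviving squids stay warm.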
So, this was a short description of what hardware and software we use for our caching cluster and why we use it. In the next posts of this series we’ll talk about cache control and object invalidation.
UPDATE: squid-3.0 was not a rewrite of Squid-2. It’s a continuation of Squid-2.5, with a conversion to compile under C++.
14 Responses to this entry
G’day,
squid-3.0 was not a rewrite of Squid-2. It’s a continuation of Squid-2.5, with a conversion to compile under C++.
HTH,
Adrian
Also, Squid-2.7 has a bunch of nifty features and a lot of performance related changes. You may want to investigate the COSS filestore, for example, which handles small objects a lot better than AUFS.
Are you able to publish traffic/request load graphs at all?
Oh, thanks, I’ll check the COSS filestore out. AFAIR the Wikipedia guys use it, right?
As for the graphs, what graphs would you like to see?
Have you looked into Varnish rather than Squid?
I’ve discussed this question with the Percona guys, and according to them they had some really weird problems with Varnish when they tried to use it on one of their (pretty large) projects. So I decided to go with Squid (especially after reading the Wikipedia and Flickr guys’ presentations on their architecture).
Nice. Fast IO makes sense, though I’m puzzled that there was no reasonable RAID config faster than JBOD. Would you mind sharing a bit more around that?
Sharding the squids in a deterministic fashion via a consistent hash is also a nice touch. I’m curious how you handle failover and recovery for a given squid — I’m guessing you hash into a large number of shards, assign multiple shards to a particular squid, then dynamically balance shards across however many live squids you have at a given moment … but that’s just guessing. Could you say a bit more about that, too?
Thanks for the excellent series!
I concur on the Varnish statement. No matter how hard I try to get Varnish to work out for our production needs, Squid still produces better results, even if it is limited to a single core. Here are some of the issues the Varnish developers fail to mention.
1. Varnish needs to buffer the response; there are no streaming responses as in Squid. This is a huge performance loss on cache misses.
2. Varnish has contention issues with small object allocations in its mmapped file. This creates a lot of issues when you’re throwing lots of small thumbnails into the cache.
3. Varnish, at least in my experience, has issues with keeping sockets in CLOSE_WAIT for extended periods of time. For example, Squid would only keep 430 sockets in CLOSE_WAIT, while Varnish would keep around 3,000 or so.
4. Squid has two features that make it a killer cache: stale-if-error and stale-while-revalidate. You can keep your services extremely responsive if you play your cards right.
Look at the “Shopping List” http://varnish.projects.linpro.no/wiki/PostTwoShoppingList
Search for “Streaming pass/fetch” and “Small object handling”
However, all that said I would rather ideally use Varnish than Squid. Varnish has a lot more flexibility when it comes down to HTTP acceleration. Too bad in practice Squid wins out.
Just my two cents.
[...] Alexey Kovyrin’s Blog described the Squid-based caching approach used at Scribd. [...]
Nice piece of information. Thanks for sharing.
Nice. Fast IO makes sense, though I’m puzzled that there was no reasonable RAID config faster than JBOD. Would you mind sharing a bit more around that?
Hmmm. Seems to be an echo in here. ;]
Nice info. I have marked in StumbleUpon.
Would you mind sharing the info on http://highscalability.com/?
I don’t mind, what do I need to do to share it there?
Great post. Would you please give us an idea of the ideal configuration for the squid.conf file, memory, cache size, etc.?
PS: I’m running over 150 clients.
Thanks