Archive for the ‘Squid’ Category
A few people have asked me what the deal is with Squid-2 and Squid-3.
“Why are you developing on Squid-2 when Squid-3 is now out?”
“Should I upgrade to Squid-3 now that it’s released?”
I’m focusing on Squid-2 for a few reasons, namely:
It’s what people running high-traffic sites are currently running, and Squid-3 doesn’t work at all for them;
I was fed up waiting for Squid-3 to be released and for it to become mature enough for users to migrate to before I started my performance work. I gave up about 12 months ago and began planning out the work that’s currently going on.
I’m personally much more familiar with the Squid-2 codebase than the Squid-3 codebase.
So what exactly am I doing to Squid-2? Well, I’m doing all the things to Squid-2 which I personally believe we should’ve done in the C++ Squid-3 branch before all the “new stuff” was added. You can find it all at http://devel.squid-cache.org/changesets/squid/s27_adri.html . A summary of what I’m doing in this first round:
I’m taking a very sharp scalpel to the codebase and removing all of the extra data copies and buffering which is going on;
I’m reworking the buffer management so arbitrary sized data buffers can be used, rather than fixed 4k buffers for network/disk traffic;
I’m reworking the Strings interface to use reference counting and reference underlying buffers, saving on memcpy() and malloc() calls, cutting down on the amount of transient memory used to handle requests and dropping the CPU and memory bus utilisation quite dramatically;
I’m reworking the dataflow between server->store and store->client to use the above reference counted buffers, so data isn’t memcpy()’ed between layers, again dropping CPU and memory bus utilisation;
And I’m going to break out as much of the code into external libraries with well-understood dependencies, as preparation for documentation, unit testing and further profiling.
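To make the Strings idea a little more concrete, here is a minimal sketch in Python (not Squid’s C, and not the actual patch). It assumes a hypothetical helper, header_views(), which parses header lines out of one receive buffer and hands back memoryview slices that point into that buffer rather than copies of it. Python’s own object reference counting stands in for the explicit counts the C code has to manage by hand.

```python
def header_views(buf: bytes) -> dict:
    """Map lower-cased header names to memoryview slices of buf.

    Hypothetical helper for illustration: the header values are views
    onto the original buffer, so no value bytes are copied.
    """
    view = memoryview(buf)
    headers = {}
    pos = 0
    while pos < len(buf):
        end = buf.find(b"\r\n", pos)
        if end == -1:
            end = len(buf)
        colon = buf.find(b":", pos, end)
        if colon != -1:
            # The name is copied (it is tiny and used as a dict key);
            # the value stays a zero-copy slice of the buffer.
            name = bytes(view[pos:colon]).lower()
            start = colon + 1
            while start < end and buf[start] == 0x20:
                start += 1
            headers[name] = view[start:end]
        pos = end + 2
    return headers


raw = b"Host: example.com\r\nAccept: */*\r\n"
parsed = header_views(raw)
```

Each value in parsed keeps raw alive for as long as the slice exists, which is the property the reference counted buffers are meant to give the C code: the underlying memory is released only once the last string referencing it goes away.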
My aim is to fix whatever bugs show up in Squid-2.7 and then in Squid-2.HEAD (which has some of the above included already.) I’ll then start bringing across my changes as they’ve been tested and been found stable. My aim is to have the bulk of the above done within the next month or so and get it into Squid-2.HEAD and concentrate on making it stable before I continue tidying up the dataflow and restructuring the ugly bits of code.
What’s this mean for Squid-3? The Squid-3 guys are doing some great work with things such as ICAP and IPv6 and I hope that they’ll gain more experience with their codebase over the next 12 months or so. I’m certainly not bringing ICAP support into Squid-2 until I’ve reworked the dataflow and tidied up the code enough for ICAP to sit comfortably in the data pipeline, rather than have it bolted onto the side and hooking into strange places where it shouldn’t. (I may bring IPv6 into Squid-2 soon though!)
Hopefully my work and their work will culminate with the development of the next Squid major version over the next 12 to 24 months. There’s a long way to go though and my main aim here is to get faster, better and shinier code out to the majority of Squid users now so they can benefit from the development, rather than repeating the 4-odd year gap between Squid-2.5 and Squid-2.6. Users hated that.
So what’s it mean for you?
If you want to try out Squid-3, or you want supported ICAP services, then try it out.
Squid-2.X will continue being developed over the next 12 months as time permits, so don’t feel like you have to move to Squid-3.
If you feel adventurous, try out Squid-2.7. Initial reports are that it’s stable and slightly less CPU intensive.
Squid-2.7 is the first version to include changes to allow Youtube and Microsoft Updates caching. It doesn’t do it out of the box, but the support is there, and I’ll be publishing test rules soon to let people start caching this stuff.
If you feel really adventurous then try out Squid-2.HEAD and report back if you have any issues. It should be even less CPU intensive, but only under certain workloads.
Squid-2.6.STABLE18 fixes a silly bug (thanks to yours truly fixing another bug!) which may cause your Squid to crash under certain circumstances.
Squid-2.6.STABLE18-RC1 (release candidate 1) tarballs are available from the Squid website – http://www.squid-cache.org/Versions/v2/2.6/ – the release should be in a day or two.
Henrik has branched Squid-2.7 – it hasn’t been formally announced yet but it should be any day now.
I’ve begun rolling in infrastructure changes with an eye towards improved performance in Squid. Squid-2 is my testbed at the moment – I’m leaving Squid-3 alone for now to let the codebase mature and to let the C++ guys, well, do their C++ “thing”.
The first round of patches to Squid-2.HEAD removes one of the major CPU and memory bottlenecks – the memcpy()’ing of data as it passes from the store (so from anywhere, really) back to the client. This may or may not improve performance with your workload but it’s the beginning of sensible dataflow inside Squid. (I estimate this brings Squid up to the late ’90s in terms of network application coding.)
My next trick will be reference counted buffers and strings, to avoid more memcpy() calls, memory allocations/frees, and general L2 cache busting. More on that later.
It’s been a long wait, but Duane has released Squid-3.0.STABLE1. Features include integrated ICAP support. You can find more information at the release website.
Well folks, things are getting underway again just in time for the new year.
Starting with the Dec 16th daily snapshot, squid3-HEAD includes the long-awaited squid3-ipv6 branch of Squid.
To build the feature just add --enable-ipv6 to your configure options. There are other IPv6 settings for some setups, but most will not need them. Expect it to accept your existing 3.0 squid.conf while allowing you to tweak it slightly for IPv6 purposes if you have a v6/NG connection or desire to do so.
The new releases coupled with an IPv6 link as simple as a single-host tunnel add the ability to:
* source traffic from either IPv4 or IPv6 as needed or provided
* proxy web traffic between IPv4 and IPv6 seamlessly
* gateway an IPv4-native or IPv6-native network to the full transitioning web
* accelerate a website on both IPv4 and IPv6 Internets even if the web server itself is stuck without access to one protocol.
* measure network availability over both IPv4 and IPv6 for peers and source selection
Some expected configuration problems and their solutions can be found in the Squid wiki FAQ.
Youtube is one of the banes of small-upstream network administrators. The flash files are megabytes in size, and a popular video can be downloaded by half the people in the office or student residential college in one afternoon.
It is, at the present time, very difficult to cache. Let’s see why.
There are actually two different methods employed to serve the actual flash media files that I’ve seen. The first method involves fetching from youtube.com servers; the second involves fetching from IP addresses in Google IP space.
The first method is very simple: the URL form is:
XXX is the pop name; YYY is I’m guessing either a server or a cluster name.
This is pretty standard stuff – and If-Modified-Since requests seem to be handled badly too! The query string “?” in the URL makes it uncachable to Squid by default, even though it’s a flash video. It’s probably not going to change very often.
The second method involves a bit more work. First the video is requested from a Google server. This server then issues an HTTP 302 reply pointing the content at a changing IP address. This request looks somewhat like this:
Again, the “?” query string. Again, the origin, but it’s encoded in the URL. Finally, not only are If-Modified-Since requests not handled correctly, the replies include ETags and requests with an If-None-Match revalidation still return the whole object! Aiee!
So how to cache it?
Firstly, you have to try and cache replies for URLs with a “?” in them. It would be nice if they handled If-Modified-Since and If-None-Match requests correctly when the object hasn’t been modified – revalidation is cheap and it’s basically free bandwidth. They could set the revalidation to be, say, after even 30 minutes – they’re already handling all the full requests for all the content, so the request rate would stay the same but the bandwidth requirements should drop.
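For reference, here’s roughly what “handling it correctly” would look like on the origin side. This is a simplified Python sketch of the HTTP/1.1 conditional request rules, not anything Youtube or Squid actually runs; the function name revalidate() and its arguments are made up for illustration, and it ignores corner cases such as ranges and non-GET methods.

```python
from email.utils import parsedate_to_datetime


def _strip_weak(tag):
    """Drop a weak-validator prefix so W/"x" compares equal to "x"."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def revalidate(request_headers, etag, last_modified):
    """Return 304 if the client's copy is still good, otherwise 200.

    request_headers: dict of request header name -> value
    etag:            the object's current ETag, quotes included
    last_modified:   the object's Last-Modified value as an HTTP date string
    """
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        tags = [_strip_weak(t) for t in inm.split(",")]
        if "*" in tags or _strip_weak(etag) in tags:
            return 304
        return 200
    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        try:
            if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(ims):
                return 304
        except (TypeError, ValueError):
            pass  # unparseable or incomparable dates: send the full object
    return 200


LM = "Fri, 17 Dec 2004 04:58:08 GMT"
status_same = revalidate({"If-Modified-Since": LM}, '"abc"', LM)
status_etag = revalidate({"If-None-Match": '"abc"'}, '"abc"', LM)
```

A 304 carries no body, so every revalidation that ends this way costs a round trip but almost no bandwidth. What I’m seeing instead is the equivalent of this function always returning 200.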
The URLs also have to be rewritten, much like they do to cache Google Maps content. The “canonical” form URL will then reference a “video” regardless of which server the client is asking.
Now, how do you do this in Squid? I’ve got some beta code to do this and it’s in the Squid-2 development tree. Take a look here for some background information. It works around the multiple-URL-referencing-same-file problem but unfortunately it won’t work around their broken HTTP/1.1 validation code. If they fixed that then Youtube may become something which network administrators stop asking to filter.
(ObNote: the second method uses lighttpd as the serving software; and it replies with an HTTP/1.1 reply regardless of whether the request was HTTP/1.0 or HTTP/1.1. Grr!)
I’m looking at how cachable Google content is with an eye to making Squid cache some of it better. Contrary to popular belief, a lot of the Google content (that I’ve seen!) is dynamically generated “static” content – images, videos – which could be cached but unfortunately isn’t.
Google Maps works by breaking up the “map” into multiple square tiled images. The various compositing that occurs (eg maps on top of a satellite image) is done by the browser and not dynamically generated by Google.
We’ll take one image URL as an example:
A few things to notice:
- The query string is a 1:1 mapping between query and tile, regardless of which keyhole server they’re coming from.
- The use of a query string negates all possible caching, even though…
- .. the CGI returns Expires and Last-Modified headers!
Now, the reply headers (via a local Squid):
HTTP/1.0 200 OK
Expires: Sat, 15 Nov 2008 02:44:29 GMT
Last-Modified: Fri, 17 Dec 2004 04:58:08 GMT
Server: Keyhole Server 2.4
Date: Fri, 16 Nov 2007 02:44:29 GMT
X-Cache: HIT from violet.local
Via: 1.0 violet.local:3128 (squid/2.HEAD-CVS)
The server returns a Last-Modified header and an Expires header; but as it has a query identifier in the URL (ie, the “?”) plenty of caches, and I’m guessing some browsers, will not cache the response, regardless of the actual cachability of the content. See RFC2068 13.9 and RFC2616 13.9. It’s unfortunate, but it’s what we have to deal with.
Finally, assuming the content is cached, it will need to be periodically revalidated via an If-Modified-Since request. Unfortunately the keyhole server doesn’t handle IMSes correctly, always returning a 200 OK with the entire object body. This means that revalidation will always fail and the entire object will be fetched in the reply.
So how to fix it?
Well, by default (and for historical reasons!) Squid will not cache anything with “cgi-bin” or “?” in the path. That’s for a couple of reasons – firstly, replies from HTTP/1.0 servers with no expiry information shouldn’t be cached if it may be a CGI (and URLs with a “?” generally are); and secondly intermediate proxies in the path may “hide” the version of the origin server and you never quite know whether it was HTTP/1.0 or not.
Secondly, since the same content can come from one of four servers:
- You’ve got a 1 in 4 chance that you’ll get the same google host for the given tile; and
- You’ll end up caching the same tile data four times.
I’m working on Squid to work around these shortcomings. Ideally Google could fix the second one by not using query-strings but instead using URL paths with correct cachability information and handling IMS, eg:
That response would be cachable (assuming that they didn’t vary the order of the query parameters!) and browsers/caches would be able to handle that without modification.
I’ve got a refresh pattern to cache that content but it’s still a work in progress. Here’s an example:
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern cgi-bin 0 0% 0
refresh_pattern \? 0 0% 4320
refresh_pattern . 0 20% 4320
I then remove the “cache deny QUERY” line and simply use “cache allow all”; then I use refresh_pattern lines to say which URL patterns shouldn’t be cached if no expiry information is given (ie, if a URL with cgi-bin or ? in the path returns expiry information then Squid will cache it.)
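To show why that combination does what I want, here is a deliberately simplified Python model of the freshness decision. It is a sketch, not Squid’s actual refresh code: the function is_fresh() and its parameters are made up for illustration, times are in minutes, and options such as override-expire are ignored. The table mirrors the five refresh_pattern lines of my work-in-progress example.

```python
import re

# (regex, min minutes, percent, max minutes), in configuration order.
PATTERNS = [
    (re.compile(r"^ftp:"), 1440, 20, 10080),
    (re.compile(r"^gopher:"), 1440, 0, 1440),
    (re.compile(r"cgi-bin"), 0, 0, 0),
    (re.compile(r"\?"), 0, 0, 4320),
    (re.compile(r"."), 0, 20, 4320),
]


def is_fresh(url, age_min, expires_in_min=None, lm_age_min=None):
    """Simplified model: may a cached reply be served without revalidating?

    age_min:        minutes since the reply was fetched
    expires_in_min: minutes until the Expires time, or None if none was sent
    lm_age_min:     minutes between Last-Modified and the fetch, or None
    """
    # Explicit expiry information from the origin wins over the heuristics.
    if expires_in_min is not None:
        return expires_in_min > 0
    # Otherwise the first matching pattern supplies the heuristic limits.
    _, min_m, pct, max_m = next(
        (p for p in PATTERNS if p[0].search(url)), (None, 0, 0, 0)
    )
    if age_min > max_m:
        return False
    if age_min < min_m:
        return True
    if lm_age_min:
        # Fresh while the age is below pct% of the object's age when fetched.
        return age_min < lm_age_min * pct / 100.0
    return False


tile = "http://example.com/kh?t=abc"  # hypothetical query-string URL
no_expiry = is_fresh(tile, age_min=10)
with_expiry = is_fresh(tile, age_min=10, expires_in_min=60)
```

In this model a “?” URL with no expiry information is never considered fresh (min and percent are both 0), while the same URL with an Expires header in the future is served from cache until it expires.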
[UPDATE: We have now merged the results of Adrian’s work here into Squid-2.7 and 3.1+. The new required refresh_pattern lines are:
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
hierarchy_stoplist cgi-bin ?
It’d then be nice if Google handled IMS requests by the keyhole server correctly!
Secondly, Squid needs to be taught that certain URLs are “equivalent” for the purposes of cache storage and retrieval. I’m working on a patch which will take a URL like this:
match on the URL via a regular expression, eg:
and map that to a fixed URL regardless of the keyhole server number, eg:
The idea, of course, is that there won’t ever be a valid URL normally fetched whose host part ends in .SQUIDINTERNAL and thus we can use it as an “internal identifier” for local storage lookups.
This way we can then request the tile from any kh server ending in any country, so the following URLs would be equivalent from the point of view of caching:
It’s important to note here that the content is still fetched from the requested host; it’s just stored in the cache under a different URL.
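As a rough illustration of that mapping, here is a small Python sketch. It is not the patch itself, and everything specific in it is an assumption: the tile URL shape (a host like kh0.google.com with a /kh? query), the regular expression, the internal host name and the function store_key() are all invented for the example.

```python
import re

# Assumed tile URL shape: http://kh<N>.google.<tld>/kh?<query>
KH_TILE = re.compile(r"^http://kh\d+\.google\.[a-z.]+/kh\?(?P<query>.+)$")

# Made-up internal host; it can never collide with a real fetched URL.
INTERNAL_HOST = "keyhole-srv.google.com.SQUIDINTERNAL"


def store_key(url):
    """Return the URL to use for cache storage and lookup.

    Matching tile URLs collapse onto one internal URL; everything else
    is stored under its own URL. The fetch still goes to the host the
    client asked for; only the store lookup key changes.
    """
    m = KH_TILE.match(url)
    if m is None:
        return url
    return "http://%s/kh?%s" % (INTERNAL_HOST, m.group("query"))


key_a = store_key("http://kh0.google.com/kh?n=404&v=20&t=trt")
key_b = store_key("http://kh3.google.com.au/kh?n=404&v=20&t=trt")
```

Here key_a and key_b come out identical, so a tile fetched via one keyhole server is a hit for the next client that asks a different one. Note that the query string is used as-is, so this only works while the parameter order stays fixed.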
I’ll next talk about caching Google Images and finally how to cache Youtube.
In case you didn’t know, there’s a work in progress for IPv6 support in Squid-2.6. You’ll find a patch here which, reportedly, is being used in production at a few sites.
If you’d like to see IPv6 in a future Squid-2 release – it’s a very large change to introduce in the squid-2.6 release so it would appear in a 2.7 or 2.8 release – then please join the squid-users mailing list and let us know.
(I hear a lot of people complaining about how Squid doesn’t “support IPv6” and yet won’t try Squid-3+IPv6 or even try googling for alternatives. The truth is that there have been unofficial patches to Squid-2 to support IPv6 in some fashion for a number of years now – heck, there was an IPv6 patch to Squid-1! – but no one volunteered to stand up, tidy it up and get it in shape for inclusion into the main tree. If IPv6 is important to you then please say so; please test the stuff that’s out there and don’t hesitate to donate to the Squid project with a note saying “for IPv6!”.)
My main focus at the moment is to tidy up areas of Squid-2 with an eye towards both nicer code internals and better performance. Over the last year I’ve committed work to Squid-2 which has eliminated a large part of the multiple data copying and multiple request/reply parsing which went on in Squid-2.6 and earlier.
Unfortunately I don’t run any busy Squids, and so I’m not always confident my changes are correct. Squid has a lot of baggage and even with 10 years’ experience I still hit the occasional “wha? I didn’t know it did that” case.
My recent work is focusing on eliminating all the extra memory copying that goes on. Part of this will involve changing the internal dataflow and some inclusion of new buffer and string management code. Yes, C++ would help here (I do like the idea of the compiler enforcing my reference counting semantics to be correct!) but Squid-3 is still in beta and Squid-2 is still C. I’d rather not break Squid-3 when it’s not yet released, and this work is a “small” jump for people running Squid-2.HEAD or Squid-2.6.
The “store_copy” branch in Sourceforge will focus on converting one of the heaviest users of memcpy() – the storeClientCopy() API which allows the client-side to fetch data from the memory store – to use reference counted read-only buffers rather than copying the data into a supplied buffer. Reference counting in C is “tricky” even at the best of times. It’s eliminated almost 5% of CPU use due to memcpy() but this only applies on my workbench (memory-only workload, small objects, local client/server.) It may work for you, it may not. It’s part of a bigger goal – to avoid copying data where possible inside Squid – which will result in a leaner, faster proxy cache for everyone.
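The ownership rule this relies on can be sketched in a few lines of Python. This is only a model of the idea, not the branch’s code: the RefBuf class and client_copy_by_reference() are hypothetical names, and the explicit ref()/unref() calls stand in for what the C code has to do by hand.

```python
class RefBuf:
    """A read-only buffer with an explicit reference count (hypothetical)."""

    def __init__(self, data):
        self.data = bytes(data)
        self.refs = 1  # the creator holds the first reference

    def ref(self):
        """Take another reference to the same underlying data."""
        self.refs += 1
        return self

    def unref(self):
        """Drop one reference; release the data when the last one goes."""
        if self.refs <= 0:
            raise RuntimeError("unref() on a buffer that was already freed")
        self.refs -= 1
        if self.refs == 0:
            self.data = None  # stand-in for returning memory to the pool


def client_copy_by_reference(store_buf):
    """Hand the client the store's buffer itself instead of a copy of it."""
    return store_buf.ref()


store_buf = RefBuf(b"HTTP/1.0 200 OK\r\n\r\nhello")
client_buf = client_copy_by_reference(store_buf)
store_buf.unref()  # the store lets go of the object mid-transfer
still_there = client_buf.data is not None
client_buf.unref()  # the client finishes writing; the data is released
```

The tricky part in C is exactly what this hides: every code path, including the error paths, has to pair each ref() with one unref(), or the buffer is either leaked or freed while the client is still writing from it.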
The branch’s main noticeable savings should be in RAM use – temporary 4k buffers aren’t used anymore to store data as it’s being written to the client. This may be more noticeable than the CPU savings. Regardless, I’d like to know how it runs for you!
- Grab a Squid-2.HEAD tree, something reasonably recent;
- Grab the patch from the store_copy branch at Sourceforge and patch Squid-2.HEAD;
- Compile and run it!
- Let me know how it runs – is it running smoothly? Is it leaking memory? Crashing? Serving incorrect data?
It “works for me” in my test bench at home. I’d love to know this is stable enough to commit to Squid-2.HEAD and move onto the next work.