Archive for 2007

Squid-2.7 Branched; performance work has begun!

December 22, 2007

Henrik has branched Squid-2.7 – it hasn’t been formally announced yet but it should be any day now.

I’ve begun rolling in infrastructure changes with an eye towards improved performance in Squid. Squid-2 is my testbed at the moment – I’m leaving Squid-3 alone for now to let the codebase mature and the C++ guys to, well, do their C++ “thing”. The first round of patches to Squid-2.HEAD removes one of the major CPU and memory bottlenecks – memcpy()ing of data as it passes from the store (so from anywhere, really) back to the client. This may or may not improve performance with your workload, but it’s the beginning of sensible dataflow inside Squid. (I estimate this brings Squid up to the late 90s in terms of network application coding.)

My next trick will be reference counted buffers and strings, to avoid more memcpy() calls, memory allocations/frees, and general L2 cache busting. More on that later.

Squid-3.0.STABLE1 released

December 22, 2007

It’s been a long wait, but Duane has released Squid-3.0.STABLE1. Features include integrated ICAP support. You can find more information at the release website.

IPv6 going mainstream in Squid

December 17, 2007

Well folks, things are getting underway again just in time for the new year.

Starting with the Dec 16th daily snapshot, squid3-HEAD includes the long-awaited squid3-ipv6 branch of Squid.

http://www.squid-cache.org/Versions/v3/HEAD/

To build the feature just add --enable-ipv6 to your configure options. There are other IPv6 settings for some setups, but most will not need them. Expect it to accept your existing 3.0 squid.conf while allowing you to tweak it slightly for IPv6 purposes if you have a v6/NG connection or the desire to do so.
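For example, a source build with IPv6 enabled looks like this (any other configure options are site-specific and omitted here):

./configure --enable-ipv6
make && make install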

The new releases, coupled with an IPv6 link as simple as a single-host tunnel, add the ability to:

* source traffic from either IPv4 or IPv6 as needed or provided

* proxy web traffic between IPv4 and IPv6 seamlessly

* gateway an IPv4- or IPv6-native network to the full transitioning web

* accelerate a website on both IPv4 and IPv6 Internets even if the web server itself is stuck without access to one protocol.

* measure network availability over both IPv4 and IPv6 for peer and source selection

Some expected configuration problems and their solutions can be found in the Squid wiki FAQ:

http://wiki.squid-cache.org/SquidFaq/ConfiguringSquid

How cachable is google (part 2) – Youtube content

November 17, 2007

Youtube is one of the banes of small-upstream network administrators. The flash files are megabytes in size, and a popular video can be downloaded by half the people in the office or student residential college in one afternoon.

It is, at the present time, very difficult to cache. Let’s see why.

There are actually two different methods that I’ve seen employed to serve the flash media files. The first method involves fetching from youtube.com servers; the second involves fetching from IP addresses in Google IP space.

The first method is very simple; the URL takes the form:

http://XXX-YYY.XXX.youtube.com/get_video?video_id=VIDEO_ID

XXX is the POP name; YYY is, I’m guessing, either a server or a cluster name.

This is pretty standard stuff – and If-Modified-Since requests seem to be handled badly too! The query string “?” in the URL makes it uncachable to Squid by default, even though it’s a flash video that’s probably not going to change very often.

The second method involves a bit more work. First the video is requested from a Google server. This server then issues an HTTP 302 reply pointing the client at a changing IP address. The redirected request looks somewhat like this:

http://74.125.15.83/get_video?video_id=HrLFb47QHi0&origin=dal-v37.dal.youtube.com

Again, the “?” query string. Again, the origin, but this time it’s encoded in the URL. Finally, not only are If-Modified-Since requests not handled correctly, the replies include ETags and yet requests with an If-None-Match revalidation still return the whole object! Aiee!

So how to cache it?

Firstly, you have to try to cache replies to requests with a “?” in the URL. It would be nice if they handled If-Modified-Since and If-None-Match requests correctly when the object hasn’t been modified – revalidation is cheap and it’s basically free bandwidth. They could set the revalidation interval to, say, even 30 minutes – they’re already handling the full requests for all of the content, so the request rate would stay the same but the bandwidth requirements should drop.

The URLs also have to be rewritten, much as is done to cache Google Maps content. The “canonical” form of the URL will then reference a “video” regardless of which server the client is asking.

Now, how do you do this in Squid? I’ve got some beta code to do this and it’s in the Squid-2 development tree. Take a look here for some background information. It works around the multiple-URLs-referencing-the-same-file problem, but unfortunately it won’t work around their broken HTTP/1.1 validation code. If they fixed that then Youtube might become something which network administrators stop asking to filter.
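For the curious, here is roughly what the configuration side of that could look like. This is only a sketch: the directive names shown (storeurl_access, storeurl_rewrite_program, storeurl_rewrite_children) are the ones that later appeared in Squid-2.7 and may differ in the development tree, the ACL only covers the first URL form described above, and the helper path is made up.

# Illustrative only; adjust directive names and paths to your tree.
acl youtube_videos url_regex -i ^http://[^/]+\.youtube\.com/get_video\?
storeurl_access allow youtube_videos
storeurl_access deny all
storeurl_rewrite_program /usr/local/squid/libexec/store_rewrite.py
storeurl_rewrite_children 5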

(ObNote: the second method uses lighttpd as the serving software, and it replies with an HTTP/1.1 reply regardless of whether the request was HTTP/1.0 or HTTP/1.1. Grr!)

How cachable is google (part 1): Google Maps

November 16, 2007

I’m looking at how cachable Google content is, with an eye to making Squid cache some of it better. Contrary to popular belief, a lot of the Google content (that I’ve seen!) is dynamically generated “static” content – images, videos – which could be cached but unfortunately isn’t.

Google Maps works by breaking up the “map” into multiple square tiled images. The various compositing that occurs (eg maps on top of a satellite image) is done by the browser and not dynamically generated by Google.

We’ll take one image URL as an example:

http://kh3.google.com.au/kh?n=404&v=23&t=trtqtt

A few things to notice:

  1. The first part of the hostname – kh3 – can and does change (I’ve seen kh0 -> kh3.) All the tiles, as far as I can tell, can be fetched from each of these servers. This is done to increase concurrency in the browser: the Javascript will select one of four servers for each tile, so that the per-server concurrency limit applies to multiple servers (ie, N times the concurrency limit) rather than just to one server.
  2. The query string is a 1:1 mapping between query and tile, regardless of which keyhole server they’re coming from.
  3. The use of a query string negates all possible caching, even though…
  4. .. the CGI returns Expires and Last-Modified headers!

Now, the reply headers (via a local Squid):

HTTP/1.0 200 OK
Content-Type: image/jpeg
Expires: Sat, 15 Nov 2008 02:44:29 GMT
Last-Modified: Fri, 17 Dec 2004 04:58:08 GMT
Server: Keyhole Server 2.4
Content-Length: 15040
Date: Fri, 16 Nov 2007 02:44:29 GMT
Age: 531
X-Cache: HIT from violet.local
Via: 1.0 violet.local:3128 (squid/2.HEAD-CVS)
Proxy-Connection: close

The server returns a Last-Modified header and an Expires header; but as the URL has a query identifier (ie, the “?”), plenty of caches and, I’m guessing, some browsers will not cache the response, regardless of the actual cachability of the content. See RFC 2068 section 13.9 and RFC 2616 section 13.9. It’s unfortunate, but it’s what we have to deal with.

Finally, assuming the content is cached, it will need to be periodically revalidated via an If-Modified-Since request. Unfortunately the keyhole server doesn’t handle IMSes correctly, always returning a 200 OK with the entire object body. This means that revalidation will always fail and the entire object will be fetched in the reply.

So how to fix it?

Well, by default (and for historical reasons!) Squid will not cache anything with “cgi-bin” or “?” in the path. That’s for a couple of reasons – firstly, replies from HTTP/1.0 servers with no expiry information shouldn’t be cached if they may have come from a CGI (and URLs with “?” in them generally have); and secondly, intermediate proxies in the path may “hide” the version of the origin server, so you never quite know whether it was HTTP/1.0 or not.

Secondly, since the same content can come from one of four servers:

  • You’ve got a 1 in 4 chance that you’ll get the same google host for the given tile; and
  • You’ll end up caching the same tile data four times.

I’m working on Squid to work around these shortcomings. Ideally Google could fix the query-string problem by not using query strings, but instead using URL paths with correct cachability information and handling IMS, eg:

http://kh3.google.com.au/kh?n=404&v=23&t=trtqtt

might become:

http://kh3.google.com.au/kh/n=404/v=23/t=trtqtt

That response would be cachable (assuming that they didn’t vary the order of the query parameters!) and browsers/caches would be able to handle that without modification.

I’ve got a refresh pattern to cache that content but it’s still a work in progress. Here’s an example:

refresh_pattern    ^ftp:        1440  20%  10080
refresh_pattern    ^gopher:     1440   0%   1440
refresh_pattern    cgi-bin         0   0%      0
refresh_pattern    \?              0   0%   4320
refresh_pattern    .               0  20%   4320

I then remove the “cache deny QUERY” line and simply use “cache allow all”; then I use refresh_patterns to match which URLs shouldn’t be cachable if no expiry information is given (ie, if a URL with cgi-bin or ? in the path returns expiry information then Squid will cache it).
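In squid.conf terms that change looks roughly like this (the QUERY acl shown is the stock Squid-2.6 default):

# Remove (or comment out) the stock query-string blacklist:
#   acl QUERY urlpath_regex cgi-bin \?
#   cache deny QUERY
# ...and let the refresh_patterns above decide instead:
cache allow all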

[UPDATE: We have now merged the results of Adrian’s work here into Squid-2.7 and 3.1+. The new requirements for refresh_pattern are:

refresh_pattern    ^ftp:                 1440  20%  10080
refresh_pattern    ^gopher:              1440   0%   1440
refresh_pattern    -i (/cgi-bin/|\?)        0   0%      0
refresh_pattern    .                        0  20%   4320

hierarchy_stoplist cgi-bin ?

]

It’d then be nice if Google made the keyhole server handle IMS requests correctly!

Secondly, Squid needs to be taught that certain URLs are “equivalent” for the purposes of cache storage and retrieval. I’m working on a patch which will take a URL like this:

http://kh3.google.com.au/kh?n=404&v=23&t=trtqtt

match it against a regular expression, eg:

m/^http:\/\/kh(.*?)\.google\.com(.*?)\/(.*?)$/

and map it to a fixed URL regardless of the keyhole server number, eg:

http://keyhole-server.google.com.au.SQUIDINTERNAL/kh?n=404&v=23&t=trtqtt

The idea, of course, is that there won’t ever be a valid URL normally fetched whose host part ends in .SQUIDINTERNAL, and thus we can use it as an “internal identifier” for local storage lookups.

This way we can request the tile from any kh server under any country domain, so the following URLs would be equivalent from the point of view of caching:

http://kh3.google.com.au/kh?n=404&v=23&t=trtqtt
http://kh2.google.com.au/kh?n=404&v=23&t=trtqtt
http://kh0.google.com/kh?n=404&v=23&t=trtqtt

It’s important to note here that the content is still fetched from the requested host; it’s just stored in the cache under a different URL.
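To make that concrete, here is a minimal sketch of what such a store-URL rewrite helper could look like. It’s purely illustrative and not the actual patch: it assumes the URL is the first whitespace-separated field on each line Squid hands the helper (as with url_rewrite helpers) and that an empty reply means “store under the original URL”; check the helper protocol of your Squid version before using anything like it.

#!/usr/bin/env python
# Hypothetical store-URL rewrite helper sketch for the keyhole tile case.
# Assumption: the URL is the first whitespace-separated field on each input
# line, and an empty reply keeps the original store URL.
import re
import sys

# Collapse any khN.google.<country> tile URL onto one canonical internal key.
TILE_RE = re.compile(r'^http://kh\d*\.google\.[^/]+/(kh\?.*)$')

def store_key(url):
    m = TILE_RE.match(url)
    if m:
        # The host part only has to be something that can never be fetched
        # normally; anything ending in .SQUIDINTERNAL will do.
        return 'http://keyhole-server.google.SQUIDINTERNAL/' + m.group(1)
    return ''

for line in sys.stdin:
    fields = line.split()
    url = fields[0] if fields else ''
    sys.stdout.write(store_key(url) + '\n')
    sys.stdout.flush()  # reply one line per request, unbuffered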

I’ll next talk about caching Google Images and finally how to cache Youtube.

Squid-2.6 IPv6

September 30, 2007

In case you didn’t know, there’s work in progress on IPv6 support for Squid-2.6. You’ll find a patch here which, reportedly, is being used in production at a few sites.

If you’d like to see IPv6 in a future Squid-2 release – it’s a very large change to introduce into the Squid-2.6 release, so it would appear in a 2.7 or 2.8 release – then please join the squid-users mailing list and let us know.

(I hear a lot of people complaining about how Squid doesn’t “support IPv6” and yet won’t try Squid-3+IPv6 or even try googling for alternatives. The truth is that there have been unofficial patches to Squid-2 to support IPv6 in some fashion for a number of years now – heck, there was an IPv6 patch to Squid-1! – but no one volunteered to stand up, tidy it up and get it in shape for inclusion into the main tree. If IPv6 is important to you then please say so; please test the stuff that’s out there and don’t hesitate to donate to the Squid project with a note saying “for IPv6!”.)

Squid-2 performance work

September 30, 2007

My main focus at the moment is to tidy up areas of Squid-2 with an eye towards both nicer code internals and better performance. Over the last year I’ve committed work to Squid-2 which has eliminated a large part of the multiple data copying and multiple request/reply parsing which went on in Squid-2.6 and earlier.

Unfortunately I don’t run any busy Squids, and so I’m not always confident my changes are correct. Squid has a lot of baggage and even with 10 years’ experience I still hit the occasional “wha? I didn’t know it did that” case.

My recent work is focusing on eliminating all the extra memory copying that goes on. Part of this will involve changing the internal dataflow and introducing new buffer and string management code. Yes, C++ would help here (I do like the idea of the compiler enforcing my reference counting semantics to be correct!) but Squid-3 is still in beta and Squid-2 is still C. I’d rather not break Squid-3 when it’s not yet released, and this work is a “small” jump for people running Squid-2.HEAD or Squid-2.6.

The “store_copy” branch on Sourceforge will focus on converting one of the heaviest users of memcpy() – the storeClientCopy() API, which allows the client-side to fetch data from the memory store – to use reference counted read-only buffers rather than copying the data into a supplied buffer. Reference counting in C is “tricky” even at the best of times. It’s eliminated almost 5% of CPU use due to memcpy(), but this only applies on my workbench (memory-only workload, small objects, local client/server.) It may work for you, it may not. It’s part of a bigger goal – to avoid copying data where possible inside Squid – which will result in a leaner, faster proxy cache for everyone.

Its main noticeable saving should be RAM use – temporary 4k buffers aren’t used any more to store data as it’s being written to the client. This may be more noticeable than the CPU savings. Regardless, I’d like to know how it runs for you!

To help:

  • Grab a Squid-2.HEAD tree, something reasonably recent;
  • Grab the patch from the store_copy branch at Sourceforge and patch Squid-2.HEAD (a rough command-line sketch follows this list);
  • Compile and run it!
  • Let me know how it runs – is it running smoothly? Is it leaking memory? Crashing? Serving incorrect data?
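Roughly, and with purely illustrative file names (the real snapshot and patch file names will differ):

cd squid-2.HEAD
patch -p0 < store_copy.patch    # or -p1, depending on how the diff was generated
./configure                     # plus your usual configure options
make && make install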

It “works for me” on my test bench at home. I’d love to know that it’s stable enough to commit to Squid-2.HEAD so I can move on to the next piece of work.

Squid-2 updates: Logfiles and buffers

September 23, 2007

I’ve made three changes to the Squid-2.HEAD codebase this weekend.

First up – I’ve modified the memory allocator to not zero every sort of buffer. This can be quite expensive for large buffers, especially on older machines or very busy Squid servers. Squid-2.HEAD now has the “zero_buffers” option, which currently defaults to “on”. To disable zeroing of buffers, add “zero_buffers off” to your squid.conf file. I’ve seen up to 10% CPU savings on my testbed at home but this may vary wildly depending upon workload.

Secondly – the ‘logtype’ configuration option has been removed and replaced with the ability to define the logging type per logfile. You can now prefix the logfile path on your access_log line with “daemon:”, “stdio:”, “udp:” or “syslog:”. “syslog:” works the same as before; “stdio:” and “daemon:” just take a path, and “udp:” takes an IP:port URL.
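For example, to use the new daemon logger (the path shown is only illustrative):

access_log daemon:/usr/local/squid/var/logs/access.log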

To log to a UDP socket, try:

access_log udp://192.168.1.101:1234

Please note though that the default UDP payload size (defined in src/logging_mod_udp.c) is 1400 bytes and any application you decide to use to dump the logfile entries must be able to receive UDP packets that big. There’s a system-wide UDP packet limit in some operating systems (for example, sysctl net.inet.udp.maxdgram under FreeBSD) to also consider. If in doubt, do a tcpdump on both sides and make sure you’re seeing the packets of the right size getting there.
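If you just want something quick on the receiving side to catch those datagrams, a minimal sketch along these lines will do; the address and port match the example above, and the output filename is arbitrary:

#!/usr/bin/env python
# Tiny UDP log receiver sketch: appends every datagram it receives to a file.
import socket

LISTEN = ('192.168.1.101', 1234)   # must match the access_log udp:// URL
BUFSIZE = 2048                     # comfortably above the 1400-byte payload

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(LISTEN)

out = open('access.log', 'ab')
while True:
    data, peer = sock.recvfrom(BUFSIZE)
    out.write(data)
    out.flush()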

Note too you can’t use these options for the cache_log – it must always be a normal file path.

Logfile improvements in Squid-2-HEAD

September 19, 2007

I’ve committed my logfile handling improvements to Squid-2-HEAD. Essentially, this lets people write self-contained code modules to implement different logging methods. The three supported methods now are:

  • STDIO, which is how Squid currently does its logging;
  • Syslog, which is compiled in if you enable it; and
  • Daemon, which uses a simple external helper to write logfiles to disk.

Those of you who have run Squid may have noticed that it couldn’t support writing more than a hundred or so requests a second to disk before performance suffered. There’s no reason it shouldn’t handle this – a hundred requests a second is only 16 kilobytes a second to write – but the use of STDIO routines to do this had a negative impact on performance.

The logfile daemon allows the blocking disk IO to occur outside of the main Squid process, which basically means Squid can continue doing what it does well (all the other stuff) while any blocking disk activity occurs in a separate process.

To use it, compile and install Squid-2-HEAD, then include the following line in your configuration:

logtype daemon

In reality, Squid with the logging daemon can now handle writing thousands of requests a second to disk without any performance impact. Furthermore, if the logging daemon can’t write to disk fast enough, Squid will log an error message stating it’s falling behind and drop log entries.

I’ve tested this up to three thousand requests a second over the course of a few hours (to a dedicated logging disk however) and it handles it without a problem.

If enterprising souls wished, they could write a UDP logging helper, or a MySQL external logging helper, without needing to modify the Squid codebase.

This code will eventually also appear in Squid-3 after 3.0 is released.

Squid Sighting: Advproxy!

September 15, 2007

Another Squid sighting: the IPCop AdvProxy add-on is really just Squid-2.6 in disguise!

Chalk up another one for Squid.

It has a rather interesting add-on – the “updates cache” – which caches Windows and Symantec updates through a clever use of redirectors. Cute!