Archive for September, 2007

Squid-2.6 IPv6

September 30, 2007

In case you didn’t know, there’s a work in progress for IPv6 support in Squid-2.6. You’ll find a patch here which, reportedly, is being used in production at a few sites.

If you’d like to see IPv6 in a future Squid-2 release – it’s a very large change to introduce in the Squid-2.6 release, so it would appear in a 2.7 or 2.8 release – then please join the squid-users mailing list and let us know.

(I hear a lot of people complaining about how Squid doesn’t “support IPv6” and yet won’t try Squid-3+IPv6 or even try googling for alternatives. The truth is that there have been unofficial patches to Squid-2 to support IPv6 in some fashion for a number of years now – heck, there was an IPv6 patch to Squid-1! – but no one volunteered to stand up, tidy it up and get it in shape for inclusion into the main tree. If IPv6 is important to you then please say so; please test the stuff that’s out there and don’t hesitate to donate to the Squid project with a note saying “for IPv6!”.)

Squid-2 performance work

September 30, 2007

My main focus at the moment is to tidy up areas of Squid-2 with an eye towards both nicer code internals and better performance. Over the last year I’ve committed work to Squid-2 which has eliminated a large part of the multiple data copying and multiple request/reply parsing which went on in Squid-2.6 and earlier.

Unfortunately I don’t run any busy Squids, and so I’m not always confident my changes are correct. Squid has a lot of baggage and even with 10 years’ experience I still hit the occasional “wha? I didn’t know it did that” case.

My recent work is focusing on eliminating all the extra memory copying that goes on. Part of this will involve changing the internal dataflow and some inclusion of new buffer and string management code. Yes, C++ would help here (I do like the idea of the compiler enforcing my reference counting semantics to be correct!) but Squid-3 is still in beta and Squid-2 is still C. I’d rather not break Squid-3 when it’s not yet released, and this work is a “small” jump for people running Squid-2.HEAD or Squid-2.6.

The “store_copy” branch in Sourceforge will focus on converting one of the heaviest users of memcpy() – the storeClientCopy() API which allows the client-side to fetch data from the memory store – to use reference counted read-only buffers rather than copying the data into a supplied buffer. Reference counting in C is “tricky” even at the best of times. It’s eliminated almost 5% of CPU use due to memcpy(), but this only applies on my workbench (memory-only workload, small objects, local client/server). It may work for you, it may not. It’s part of a bigger goal – to avoid copying data where possible inside Squid – which will result in a leaner, faster proxy cache for everyone.

Its main noticeable savings should be RAM use – temporary 4k buffers aren’t used anymore to store data as it’s being written to the client. This may be more noticeable than the CPU savings. Regardless, I’d like to know how it runs for you!
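To make the copy-versus-reference distinction concrete, here is a minimal sketch in Python rather than Squid's C. It is an analogy only: `MemStore`, `copy_into()` and `ref()` are hypothetical names, not Squid functions, and Python's garbage collector stands in for the manual reference counting the real code needs.

```python
# Illustrative only: a Python stand-in for copy-vs-reference semantics.
# None of these names exist in Squid.

class MemStore:
    """Holds an object body as one immutable bytes object."""

    def __init__(self, body: bytes):
        self._body = body

    def copy_into(self, dest: bytearray, offset: int) -> int:
        """Copying style: fill a caller-supplied buffer on every call."""
        chunk = self._body[offset:offset + len(dest)]
        dest[:len(chunk)] = chunk
        return len(chunk)

    def ref(self, offset: int, size: int) -> memoryview:
        """Reference style: hand out a read-only view; no bytes are copied."""
        return memoryview(self._body)[offset:offset + size]
```

With the copying style every client needs its own temporary buffer (the 4k buffers mentioned above); with the reference style all clients share the stored bytes and only hold a view onto them.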

To help:

  • Grab a Squid-2.HEAD tree, something reasonably recent;
  • Grab the patch from the store_copy branch at Sourceforge and patch Squid-2.HEAD;
  • Compile and run it!
  • Let me know how it runs – is it running smoothly? Is it leaking memory? Crashing? Serving incorrect data?

It “works for me” in my test bench at home. I’d love to know this is stable enough to commit to Squid-2.HEAD and move onto the next work.

Squid-2 updates: Logfiles and buffers

September 23, 2007

I’ve made three changes to the Squid-2.HEAD codebase this weekend.

First up – I’ve modified the memory allocator to not zero every sort of buffer. Zeroing can be quite expensive for large buffers, especially on older machines or very busy Squid servers. Squid-2.HEAD now has the “zero_buffers” option, which currently defaults to “on”. To disable zeroing buffers, please add “zero_buffers off” to your squid.conf file. I’ve seen up to 10% CPU savings on my testbed at home, but this may vary wildly depending upon workload.

Secondly – the ‘logtype’ configuration option has been removed and replaced with the ability to define logging types per logfile. You can now prefix your log line with “daemon:”, “stdio:”, “udp:” or “syslog:”. “syslog:” works the same as before; “stdio:” and “daemon:” just take a path, and “udp:” takes an IP:port URL.

To log to a UDP socket, try:

access_log udp://

Please note though that the default UDP payload size (defined in src/logging_mod_udp.c) is 1400 bytes and any application you decide to use to dump the logfile entries must be able to receive UDP packets that big. There’s a system-wide UDP packet limit in some operating systems (for example, sysctl net.inet.udp.maxdgram under FreeBSD) to also consider. If in doubt, do a tcpdump on both sides and make sure you’re seeing the packets of the right size getting there.
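If you just want something to catch the entries, a receiver along these lines should do. This is a minimal sketch, not a tool shipped with Squid: the function names and the 2048-byte receive size are my own choices, and it assumes each datagram carries one or more newline-terminated log lines.

```python
import socket

# Receive buffer comfortably larger than the 1400-byte payload noted above.
RECV_BUFSIZE = 2048


def open_log_socket(host: str, port: int) -> socket.socket:
    """Bind a UDP socket for log datagrams (host and port are up to you)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def read_entries(sock: socket.socket, max_datagrams: int):
    """Yield log lines from up to max_datagrams received datagrams."""
    for _ in range(max_datagrams):
        payload, _sender = sock.recvfrom(RECV_BUFSIZE)
        # Assumption: entries are text lines; undecodable bytes are replaced.
        for line in payload.decode("utf-8", "replace").splitlines():
            yield line
```

If the receive size were smaller than the datagram, the tail of each packet would be silently discarded, which is exactly the failure the tcpdump check is meant to catch.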

Note too you can’t use these options for the cache_log – it must always be a normal file path.

Logfile improvements in Squid-2-HEAD

September 19, 2007

I’ve committed my logfile handling improvements to Squid-2-HEAD. Essentially, it lets people write self-contained code modules to implement different logging methods. The three supported methods now are:

  • STDIO, which is how Squid currently does its logging;
  • Syslog, which is compiled in if you enable it; and
  • Daemon, which uses a simple external helper to write logfiles to disk.

Those of you who have run Squid may have noticed that it couldn’t support writing more than a hundred or so requests a second to disk before performance suffered. There’s no reason it shouldn’t handle this – a hundred requests a second is only 16 kilobytes a second to write – but the use of STDIO routines to do this had a negative impact on performance.
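As a rough check on that figure: 16 kilobytes a second spread over a hundred requests works out to about 160 bytes per log line. The 160-byte entry size below is inferred from those two numbers, not measured.

```python
def log_bytes_per_second(requests_per_second: int, bytes_per_entry: int) -> int:
    """Sustained write rate needed to log every request."""
    return requests_per_second * bytes_per_entry


# Assumed average log entry size, back-derived from 16 KB/s at 100 req/s.
ASSUMED_ENTRY_BYTES = 160

print(log_bytes_per_second(100, ASSUMED_ENTRY_BYTES))   # 16000
print(log_bytes_per_second(3000, ASSUMED_ENTRY_BYTES))  # 480000
```

Even at three thousand requests a second that is under half a megabyte a second, so raw disk bandwidth was never the bottleneck.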

The logfile daemon allows the blocking disk IO to occur outside of the main Squid process, which basically means Squid can continue doing what it’s doing well (all the other stuff) and any blocking disk activity occurs in a separate process.

To use? Compile and install Squid-2-HEAD, then include the following line into your configuration:

logtype daemon

In reality, Squid with the logging daemon can now handle writing -thousands of requests a second- to disk without any performance impact. Furthermore, if the logging daemon can’t write to disk fast enough, Squid will log an error message stating it’s falling behind and drop logging entries.
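The “never block, drop when behind” behaviour can be sketched as a bounded queue. This is a simplified single-process model in Python, not the actual Squid code: `BoundedLogQueue`, `max_pending` and the `write` callback are all hypothetical names.

```python
from collections import deque


class BoundedLogQueue:
    """Queue log entries for a slow writer without ever blocking the caller."""

    def __init__(self, max_pending: int):
        self.max_pending = max_pending
        self.pending = deque()
        self.dropped = 0

    def log(self, entry: str) -> bool:
        """Queue an entry; drop it (and count the drop) if the writer is behind."""
        if len(self.pending) >= self.max_pending:
            self.dropped += 1
            return False
        self.pending.append(entry)
        return True

    def drain(self, write, limit: int) -> int:
        """Hand up to `limit` queued entries to the writer; return how many."""
        written = 0
        while self.pending and written < limit:
            write(self.pending.popleft())
            written += 1
        return written
```

The point of the design is that `log()` always returns immediately; a slow disk costs you log lines rather than request throughput.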

I’ve tested this up to three thousand requests a second over the course of a few hours (to a dedicated logging disk however) and it handles it without a problem.

If enterprising souls wished, they could write a UDP logging helper, or a MySQL external logging helper, without needing to modify the Squid codebase.

This code will eventually also appear in Squid-3 after 3.0 is released.

Squid Sighting: Advproxy!

September 15, 2007

Another Squid sighting: the IPCop AdvProxy add-on is really just Squid-2.6 in disguise!

Chalk up another one for Squid.

It has a rather interesting addon – the “updates cache” which caches Windows and Symantec updates through a clever use of redirectors. Cute!

Why even bother making cachable content?

September 8, 2007

I see so many sites pop up in some Squid logs which seem to try and avoid any attempt at caching. I’m not sure why, but I’m going to try and cover a few points here.

  1. I want to know exactly how many bits I’m shipping! This is especially prevalent in the American internet scene. Everyone’s about shipping bits. The more bits you ship the “better” you are. (There’s some talk about the “number of prefixes you advertise” also being linked to how “big” your network is; or maybe people are just lazy at trying to aggregate their BGP announcements. I digress..) Sure, if you graph your outbound links this is true. But you can do HTTP tricks to know exactly how many requests you’re handling without shifting the whole object out. Just set the objects to “must revalidate” rather than being immediately expired; let the web cache always revalidate the request via an If-Modified-Since request. You’ll get the IMS and can send back a “not-modified” reply; you can then synthesise a graph based on what you -would- be serving. Voila, free bits. This can be quite substantial if you have lots and lots of images on your site.
  2. I want to know how many people are accessing my site! This is definitely a left-over from the 90s and even then the problem was solved. If you absolutely positively need to know about page impressions then just embed a non-cachable 1×1 transparent gif somewhere where it won’t slow the page rendering down. Leave the rest of the site cachable. Really though, these days people should just use javascript and cookies (a la the Google “urchin”) if they want accurate “people” and “impression” counts. Trying to do it based on page accesses and unique IPs just isn’t going to cut it.
  3. I don’t want people to cache the data; they have to log in first! You can tell proxy caches that they must first revalidate the authentication information from the origin server before serving out content. You can have your cake and eat it too.
  4. Making my content cachable is too damned hard! How do I know which headers to send, when and where? It’s not all that difficult. Mark Nottingham’s Caching Tutorial covers a lot of useful information about building cachable websites. You can keep control of your authenticated content and push out more content than you’re actually buying transit for.
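The revalidation trick from point 1 looks roughly like this on the origin side. It is a minimal sketch, not taken from any particular server: `respond()` is a hypothetical helper, and “max-age=0, must-revalidate” is one way of expressing “always revalidate” that I am assuming here.

```python
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime


def respond(last_modified: datetime, request_headers: dict):
    """Return (status, headers) for a GET on a static object.

    last_modified must be a timezone-aware datetime.
    """
    headers = {
        "Last-Modified": formatdate(last_modified.timestamp(), usegmt=True),
        # Assumed policy: caches may store the object but must check back.
        "Cache-Control": "max-age=0, must-revalidate",
    }
    ims = request_headers.get("If-Modified-Since")
    if ims:
        try:
            since = parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            since = None  # unparseable date: fall through to a full reply
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution.
            if last_modified.replace(microsecond=0) <= since:
                return 304, headers  # count the hit, ship no body
    return 200, headers
```

Every request still reaches your logs, so you can count it, but a 304 carries headers only and the cache serves the bytes.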

Just remember a few simple rules:

  • Don’t hide static content behind query URLs (i.e., stuff with a ‘?’ in them). Caches won’t cache them (unless, of course, they’re built by me. But then, I am pretty evil.) I see plenty of websites which hide all of their images and flash videos behind a CGI script with a ? in the path – caches just won’t bother trying to cache it. Amusingly, most of those sites hide static content behind CGI scripts! Just imagine what it’d be like to be able to push five or ten times the amount of content to clients behind proxy caches.
  • Don’t be afraid to ask for help in how to optimise your site for forward caching. Heck, even asking on the squid-users mailing list will probably get you sorted out without too much trouble.
  • There are people behind proxy caches – the developing world for one, but there are plenty of caches to be found in schools, offices, wired buildings, wireless mesh networks and the like. Bandwidth isn’t free and never will be. You might be able to buy a 40 Gbit pipe to your favourite transit provider in North America but that won’t help people in South Africa or Australia where international bandwidth is still expensive and will remain so for the foreseeable future. And yes, we like watching YouTube as much as the next person.

LDAP improvements in Squid-3!

September 6, 2007

One of the OpenLDAP Core Developers – Pierangelo Masarati – has offered to help out by submitting Squid-3 LDAP authentication and session improvements.

The first patch improves the code organisation and (if I read his email right) begins to support SASL bind (external ldapi://).


Squid-2.6.STABLE16 is out!

September 6, 2007

Henrik has released Squid-2.6.STABLE16. This resolves a number of bugs, including a crash bug introduced in Squid-2.6.STABLE15.

The changeset list explains what’s changed; the release page includes downloads and other useful stuff. Don’t forget to read the release notes if you’re updating from 2.5 to 2.6!

And don’t forget the Squid-2.6 Configuration Manager!

Squid-3.0.PRE7 is out (silently..)

September 3, 2007

Duane/Alex and the gang have released Squid-3.0.PRE7. It’s shaping up to be a pretty good release. If you’re running Squid-3.0 or you’re interested in helping out testing the release then please visit here to download.

I’m not sure what exactly has been improved since PRE6 but they’ve been busy trying to track down and fix all the bugs they can. Alex has been committing chunks to improve the ICAP support quite significantly. Amos has also been doing more IPv6 work in his branch, which will hopefully be merged post Squid-3.0.STABLE1.

Don’t upgrade to Squid-2.6.STABLE15, skip straight to Squid-2.6.STABLE16

September 3, 2007

The title pretty much says it all. Squid-2.6.STABLE15 has a number of important fixes to the previous version but has a couple of teething problems and may be unstable in the real world.

Leave Squid-2.6.STABLE15 right well alone and wait until Henrik rolls the Squid-2.6.STABLE16 release.

(You can blame me for one of the bugs introduced into Squid-2.6.STABLE14 if you run select(); oops. Henrik’s corrected that in STABLE15 and STABLE16.)

If you run NTLM authentication then please upgrade to STABLE16 and report any bugs or bad behaviour you see. Some people have reported broken behaviour between earlier squid-2.6 STABLE versions and the current STABLE release – if you just stay on the old version and don’t tell us why you’re doing it we can’t fix the bugs!