Archive for the ‘Web-Caching’ Category

What’s going on with Squid 4

July 1, 2016

Those who have been paying attention will have noticed that the Squid-4 beta cycle has been going on for a very, very long time. A whole year now, in fact. There are several reasons for this.

* The ecosystem that Squid is deployed into has a somewhat mixed situation in regards to C++11.

Even 5 years after it was standardized, compiler support is still not readily available in some popular OS distributions. In particular, there are a large number of people clinging to the outdated but still supported RHEL 6 and its derived family of OS, which do not easily provide recent versions of GCC. So we are procrastinating on the deprecation of Squid-3.5 to give more people a chance to move on. The clock is ticking though.

* We had a lack of early adopters testing the early versions of Squid-4.

So bugs that only show up in real usage have been very slow to be found. We don’t make a new version start its beta cycle until the developers are reasonably sure that it’s stable enough to be used. The beta process is supposed to be just a confirmation of that. Indeed, a year ago 4.x looked like it had no bugs at all. Which is kind of suspicious, but not impossible. As the betas progressed, though, the bug reports started rolling in. It is something of a testament to how well our modern build systems catch minor and trivial bugs that these tester-reported issues have largely all been difficult to resolve.

* We are experimenting with a new management process for Squid releases.

Previously, Squid-5 would have been branched for development and Squid-4 would have started to stagnate, er, become “stable”. This time, though, we have delayed the branching and instead kept Squid-4 as the main development branch for minor changes while the beta cycle goes on. That has made it a little more volatile than most of the Squid-3 series betas.

All up, we are down to the last few major bugs to be resolved in the new code. Progress on those is slow but steady. So the Squid-4 production “stable” release should be not too far ahead on the calendar.

Squid-3.2 mythbusting NAT

December 19, 2014

One of the more frequently mentioned “problems” with Squid-3.2 since its release is a change in how it handles NAT failures.

The Myths

“Squid used to work when I NAT traffic to it from my router.”

“Squid used to work with one port when I configure the browser and NAT traffic.”

The Reality

No. Squid up until 3.1 would silence the NAT errors and treat the router as if it were the client browser. Any differences between the Host header and the requested URL were also completely ignored. All of this would be invisible to you the administrator, hidden at debug levels not normally shown, with several of the problems being completely unrecorded as well.

This last part seems to be behind a fair bit of skepticism about whether the problem we solved really exists. Nothing was showing up in the logs before and even a full packet trace from Squid or a gateway server would not reveal how JavaScript hijacks a client browser to pass internal documents to an external attacker.

What Changed?

Squid-3.2 finally added Host header validation to protect against the nasty behaviour and a few 0-day attacks using the CVE-2009-0801 security vulnerability, which has been a known flaw with NAT interception by HTTP proxies for at least 12 years now. This meant two things had to change:

* ignoring NAT errors had to stop. We can’t rely on validation results if the TCP details are already known to be corrupted.

* traffic directly from a browser to a NAT intercept port on Squid had to stop. There is no way to tell a NAT lookup that hit an error apart from one that found no entry. A pity really, but that is what we are forced to work with.
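
For readers who have not seen a NAT lookup before: on Linux with netfilter, the pre-NAT destination of an accepted connection is commonly read from the kernel with the SO_ORIGINAL_DST socket option. The following is a minimal Python sketch of that idea, not Squid's own code. The function names are hypothetical, the constant 80 comes from linux/netfilter_ipv4.h, and only IPv4 is handled.

```python
import socket
import struct

# Value of SO_ORIGINAL_DST in linux/netfilter_ipv4.h (Linux/netfilter only).
SO_ORIGINAL_DST = 80
# SOL_IP is 0 on Linux; fall back to 0 where the constant is not exported.
SOL_IP = getattr(socket, "SOL_IP", 0)


def parse_sockaddr_in(raw):
    """Decode a 16-byte IPv4 sockaddr_in structure into (ip, port)."""
    # Layout: 2 bytes family (skipped), 2 bytes port in network order,
    # 4 bytes address, 8 bytes padding (skipped).
    port, addr = struct.unpack("!2xH4s8x", raw)
    return socket.inet_ntoa(addr), port


def original_destination(conn):
    """Return the pre-NAT (ip, port) of an accepted socket, or None.

    None only says the kernel gave no answer. The caller cannot tell
    from this alone whether there was an error or simply no NAT entry.
    """
    try:
        raw = conn.getsockopt(SOL_IP, SO_ORIGINAL_DST, 16)
    except OSError:
        return None
    return parse_sockaddr_in(raw)
```

In this sketch a browser talking directly to the intercept port and a connection NATed on some other device both end up with no usable answer from the local kernel, which is the ambiguity described above.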

During testing we uncovered a lot of really quite nasty behaviour by various clients, and indeed some public services set up to take advantage of the NAT bug as if it were a feature. On the whole, a lot of client software sends a Host header with values strangely unrelated to what is being requested of the proxy. All of this had to go, or did it? For most of the year or so of testing it was banned completely and an HTTP error message returned for any such garbage. But in the end it was clear that we had to let it through somehow or we would be pitting Squid against the biggest players on the Internet … yeah.

(Almost-) Final Results

Squid-3.2 and later will reject traffic where NAT lookups fail on an intercept port. This includes NAT done on external devices and browsers directly sending proxy requests to the Squid intercept port. At the very least, accurate reporting about the traffic and what it contains is the critical factor. If we let this traffic into Squid it would seriously compromise the validation results and also allow malicious clients to attack internal network resources with a large measure of anonymity. Neither of which is acceptable for a proxy trusted with controlling user traffic.

The best practice guideline for some years has been that NAT MUST be done on the Squid device. Squid-3.2 and later now enforce it as a basic requirement. If you are one of the network administrators running into this requirement change, please investigate the Policy Routing functionality of the device you originally had doing the NAT.

Squid-3.2 and later will accept invalid Host headers and produce a response to the client. But they will not cache the untrustworthy transaction results. They will contact the same server which the client TCP connection would have originally reached had Squid not been there (visible as ORIGINAL_DST in the logs). Notice that this is now deserving of the name “transparent proxy” a lot more than previous intercepting Squid versions. Even so, NAT interception is still only half-transparent, with the server being able to easily identify the proxy’s existence.
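
The decision described above can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not Squid's implementation: it assumes the Host value has already had any port stripped, that "valid" means the original destination IP appears among the DNS results for the Host name, and that the names plan_forwarding and dns_ipv4 are hypothetical.

```python
import socket


def dns_ipv4(hostname):
    """Hypothetical helper: resolve a host name to a list of IPv4 strings."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


def plan_forwarding(host_header, original_dst_ip, resolve=dns_ipv4):
    """Decide where to send an intercepted request and whether to cache it.

    host_header:     Host header value from the client, without a port.
    original_dst_ip: where the client's TCP connection was really going.
    resolve:         callable returning the IPs known for a host name.
    """
    try:
        known_ips = set(resolve(host_header))
    except OSError:
        # A failed DNS lookup is treated like a mismatch in this sketch.
        known_ips = set()

    if original_dst_ip in known_ips:
        # Host header agrees with the TCP destination: handle normally.
        return {"verified": True, "connect_to": host_header, "cacheable": True}

    # Mismatch: still answer the client, but go exactly where the TCP
    # connection was headed and keep the result out of the cache.
    return {"verified": False, "connect_to": original_dst_ip, "cacheable": False}
```

For example, a request claiming Host: example.com whose connection was actually headed to an address that DNS does not list for example.com would be relayed to that address and the reply left uncached.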

The loss of caching is intended to be a temporary solution until we can properly implement per-client caching of objects. As the saying goes there is nothing so permanent as the temporary – Squid still contains this workaround several series later. Support with this work is very welcome.

The situation when upstream peers are involved is also quite dangerous. For now we have had to permit Squid to pass the traffic to peers, which opens all multi-hop systems of proxies to the same vulnerabilities that were previously possible with a single intercepting proxy. The solution is also going to require a substantial amount more work and be some years away. In the meantime it is a good idea to avoid passing intercepted traffic to cache_peer when possible.

More information on the problems, log entries, when they occur and what can be done about each is detailed in the Squid wiki Host header forgery page.

Squid-3.2: Pragma, Cache-Control, no-cache versus storage

October 16, 2012

The no-cache setting in HTTP has always been a misunderstood beastie. The instinctual reaction for developers everywhere is to believe that it prevents caching or cache handling or some such myth.

This is not true.

By definition it merely forces caches to revalidate existing content before use (ie it tells the proxy to “be ultra, super-duper conservative. Do not send anything from cache without first contacting the server to double check it.”).

When sent on a client (browser) request:

  • Pragma:no-cache instructs HTTP/1.0 caches to revalidate any cached response before using it.
  • Cache-Control:no-cache instructs HTTP/1.1 caches to revalidate any cached response before using it.
  • Pragma:no-cache only works for HTTP/1.1 caches when Cache-Control is missing.
  • all other values of Pragma are undefined and to be ignored.

When sent on a server response:

  • Pragma in all its forms has no meaning whatsoever. It must be ignored.
  • Cache-Control:no-cache instructs HTTP/1.1 caches to revalidate this response every time it is re-used.

If you read those bullet points above very carefully you will notice that at no point is store mentioned. None whatsoever. The closest it gets is mentioning what to do with already-stored content (revalidate it). In fact the HTTP/1.1 specification goes as far as to say explicitly that responses with no-cache MAY be stored – provided the revalidation is done as above.
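
To make the distinction between storing and revalidating concrete, here is a small Python sketch for a response's Cache-Control header. It is an illustration only: the function names are hypothetical, the parser splits naively on commas (so quoted field lists such as no-cache="Set-Cookie" with several names are not handled), and directives other than no-store and no-cache are ignored.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a dict of lower-cased directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def response_policy(cache_control):
    """Return (may_store, must_revalidate_before_use) for a response.

    no-store is what forbids storage. no-cache leaves storage allowed
    but requires a check with the server before every re-use.
    """
    directives = parse_cache_control(cache_control)
    may_store = "no-store" not in directives
    must_revalidate = may_store and "no-cache" in directives
    return may_store, must_revalidate
```

With this sketch, response_policy("no-cache") gives (True, True): the object may be kept, it just cannot be served without asking the server first. Only response_policy("no-store") gives (False, False).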

no-cache in Squid

The well-known Squid versions of the past have all been HTTP/1.0 compliant and advertised themselves as HTTP/1.0 software. These proxies both looked for Pragma:no-cache headers and obeyed them:

  • Squid being HTTP/1.0 meant that Pragma took precedence over Cache-Control.
  • Due to lack of full HTTP/1.1 revalidation in very old versions Squid has traditionally treated no-cache in either header as if it were Cache-Control:no-store.
  • Due to some old server software Pragma:no-cache on responses was treated as a mistaken form of Cache-Control:no-store.

Starting with version 3.2 Squid is advertising and attempting to fully support HTTP/1.1 specifications. This is a game changer.

All of the above is about to be up-ended, assumptions can be thrown away and some funky cool proxy behaviour allowed to take place.

Hiding in the background is the instruction that Pragma only applies when Cache-Control is missing from a request. We can ignore it – almost completely. When we do have to pay attention we only need to notice the no-cache value and can treat it as if we received Cache-Control:no-cache.
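
The request-side rule can be written down in a few lines. This is a hedged Python sketch rather than Squid's code; request_has_no_cache is a hypothetical name and the headers are assumed to arrive as a dict with lower-cased names.

```python
def request_has_no_cache(headers):
    """Return True if a client request asks for no-cache treatment.

    headers: dict mapping lower-cased header names to their values.
    Pragma is only consulted when Cache-Control is absent, and only
    its no-cache value means anything.
    """
    cache_control = headers.get("cache-control")
    if cache_control is not None:
        tokens = [t.strip().lower() for t in cache_control.split(",")]
        return "no-cache" in tokens

    pragma = headers.get("pragma", "")
    tokens = [t.strip().lower() for t in pragma.split(",")]
    return "no-cache" in tokens
```

Note that a request carrying both Cache-Control: max-age=0 and Pragma: no-cache returns False here, because the presence of Cache-Control means Pragma is not looked at.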

The other change is a potential game changer: The object being transferred is stored now, revalidated later.

Some implications from storing no-cache responses:

  • servers can utilize 304 responses instead of generating new content. Saving a lot of bandwidth and CPU cycles.
  • all those configuration hacks for ignoring or stripping no-cache are no longer needed. Also, the harm they do will become more visible as revalidation is skipped.
  • cache HIT ratio can potentially rise above 50% for forward proxies. As a side effect of the HIT counting market a large portion of web traffic is utilizing no-cache instead of no-store or private. This large portion is cacheable but until now Squid has been dropping it.

Before the marketing department panics about the end of the world, let’s be clear on one important point:

revalidation means every client request will still reach the end server doing HIT counting, traffic control, whatever – but in a way which allows 304 bandwidth optimization on the responses.

Do not expect a sudden rise of TCP_HIT in the proxy logs though. It is more likely to show up as TCP_REFRESH_HIT or the nasty TCP_REFRESH_MODIFIED/TCP_REFRESH_MISS which is produced by broken web applications always sending out new unchanged content.
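
As a rough illustration of what such a revalidation looks like, the Python sketch below builds the conditional request headers from a stored object's validators and maps the server's answer to a log tag. The function names and the stored-object dict are hypothetical, the tag names are simply the ones mentioned above, and the mapping is a simplification rather than Squid's logging logic.

```python
def conditional_headers(stored):
    """Build revalidation headers from a stored response's validators.

    stored: dict that may carry "etag" and "last_modified" strings.
    """
    headers = {}
    if stored.get("etag"):
        headers["If-None-Match"] = stored["etag"]
    if stored.get("last_modified"):
        headers["If-Modified-Since"] = stored["last_modified"]
    return headers


def log_tag_for(status):
    """Map the server's reply to a revalidation onto a log tag (simplified)."""
    if status == 304:
        # Server confirmed the stored copy: only headers crossed the wire.
        return "TCP_REFRESH_HIT"
    # Server sent a full body again, whether or not it really changed.
    return "TCP_REFRESH_MODIFIED"
```

A well-behaved server answers the conditional request with 304 and no body. A web application that ignores the validators answers 200 with the full content every time, which is where the bandwidth saving is lost.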

Squid Proxy Server 3.1: Beginner’s Guide

February 25, 2011

For those who have been waiting and asking, there is now a beginner’s guide to Squid-3.1, authored by Kulbir Saini, available for sale from Packt Publishing.

This book seeks to be an introductory guide to Squid and specifically to the features available in the Squid-3 series so far. It covers both the basics and the tricky details admins need to be aware of and understand when working with Squid. From configuring access controls through deployment scenarios to managing and monitoring the proxy operation, this book has it.

It does not seek to be an update to the O’Reilly book, so there are many fine details and advanced technical descriptions missing. Even so, experienced Squid admins may find new topics and features mentioned here that they were unaware of.

Continuous Integration

August 18, 2009

For the last few years there has been a slow but growing improvement to the testing and QA Squid is subject to. This last week has seen the construction and rollout of a full-scale build farm to replace some of our simple internal testing. Robert Collins covers the growth process in his blog.

Here is the initial release notice:

Hi, a few of us dev’s have been working on getting a build-test environment up and running. We’re still doing fine tuning on it but the basic facility is working.

We’d love it if users of squid, both individuals and corporates, would consider contributing a test machine to the buildfarm.

The build farm is at http://build.squid-cache.org/ with docs about it at http://wiki.squid-cache.org/BuildFarm.

What we’d like is to have enough machines that are available to run test builds, that we can avoid having last-minute scrambles to fix things at releases.

If you have some spare bandwidth and CPU cycles you can easily volunteer.

We don’t need test slaves to be on all the time – if they aren’t on they won’t run tests, but they will when they come on. We’d prefer machines that are always on over some-times on.

We only do test builds on volunteer machines after a ‘master’ job has passed on the main server. This avoids using resources up when something is clearly busted in the main source code.

Each version of squid we test takes about 150MB on disk when idle, and when a test is going on up to twice that (because of the build test scripts).

We currently test:

  • 2.HEAD
  • 3.0
  • 3.1
  • 3.HEAD

I suspect we’ll add 2.7 to that list. So I guess we’ll use about 750MB of disk if a given slave is testing all those versions.

Hudson, our build test software, can balance out the machines though – if we have two identical platforms they will each get some of the builds to test.

So, if your favorite operating system is not currently represented in the build farm, please let us know – drop a mail here or to noc @ squid-cache.org – we’ll be delighted to hear from you, and it will help ensure that squid is building well on your OS!

-Rob

That just about covers everything. Hardware and build software requirements are listed in the build farm page.

Life of a Beta

July 11, 2009

From early inception when the developers have nothing but dreams for it.  Through the coding and arguments about what should be included and how. Through the alpha testing with its harrowing hours pondering obscure code from last decade. Even the odd period of panic as security bugs are whispered about behind closed doors. Such is the early life of software.

Two weeks ago word went out that 3.1 was reaching end-game.

This part of the release lifecycle seems to be going well. Packages are appearing very slowly as QA casts a demanding eye over the code and makes us actually fix things. Don’t be fooled by the packages out already; they have been in QA for a few months to get this far. On that note:

NetBSD, Gentoo, Ubuntu, FreeBSD and RedHat already have packages ready and available for at least testing use if you know where to look (ie the links right there might be a good start).

Debian has a bit more QA to go as of this writing, but the maintainer tells me there will be packages out soon.

OpenBSD and Mac turned out at the last minute to be running split-stack IPv6 implementations (for security apparently). All the documentation read in two years left the impression it was a Windows XP anachronism (and who runs XP Pro on a server?), so support was delayed and delayed. The OpenBSD maintainer and someone interested from Mac are working with me on closing that gap in the features.

There may be more OS with 3.1 packages. I’ve only begun working my way down the distrowatch.org popularity list to see which OS do and who to contact. Squid has bundles on over 600 OS apparently.

If you know who does the official packaging for your OS and whether there are 3.1 packages ready, please do me a favor and mention it. I’m seeking a web page where to find the squid (or squid3/squid30/squid31) package information and also the place where distro bug reports about Squid might end up.

Release 3.1

November 4, 2008

Kinkie pointed out Linus Torvalds’ blog today to the rest of us here working on Squid. As the release maintainer for Squid-3 this year I kind of agree, it’s a sad time to be cutting a new version. For me it’s more of a reflection that for all the high hopes we have of this new release, we had the same or similar hopes of the earlier one. Just 12 months ago now.

On that sad note, yes, it’s finally happened. 3.0 has aged into a full-blown stable package. Most of a month and no new bugs. Perfect time for something shiny and new for the neo-tech fanclub. And so with that for an intro we are gone for 3.1!

3.1 is available for beta testing in the form of 3.1.0.1. See the Release Notes for the finer details of the changes.

This release has gained from the experiences of 3.0 and 2.6, starting from a much more stable base of code than the initial 3.0 release did. 3.0 had a long period of years with few active developers, an interminably long period of testing releases, and in hindsight a premature birth.

Alongside the code this release has a wider collaboration with active users. For the first time in many years we held a Developer meeting that included Users. We who were there certainly took in a lot of feedback from all sides. I hope those users who talked to us can see in this release that their comments, even those made in passing, have been listened to and worked on.

A small comment from one user, when asked what their biggest itch with Squid was, has led to the most notable change made to 3.1: “we don’t like these being called STABLE, when it’s obvious they are not.” That comment and similar feelings from others led us into discussions on release naming and numbering. From which we have produced – 3.1.0.1 – the second milestone point of the branch we are calling 3.1. Where the developers have everything done and working for us.

No more DEVEL, PRE, or RC, no more premature labels guessing when things might be STABLE. Just 3.1.0.1. Further testing from the rest of you will show whether anyone can consider it stable, unstable, usable or as buggy as raw earth.

From the developers; We use it. We love it. Try it, and see for yourselves.

Some of the stuff you will find there is:

  • a lot of small changes aimed towards easier use and configuration (three cheers to those who nagged long and hard for this).
  • a lot of network RFC compliance extensions, making 3.1 much more capable of meeting modern network needs. The future still holds improvements, but 3.1 is definitely better in many respects than everything that came before.
  • a lot of things to make Squid a better experience for your own users. More seamless network recovery tricks than ever before. We have even tagged along behind the international localization bandwagon in our own way to make the errors squid does have to show both pretty and readable.

Sadly, careful readers will notice a section of the Release Notes labeled “Regressions against 2.7”.  Yes, those of you who moved to 2.7 because you needed some brand new feature there may still have trouble migrating up to 3.1. What we have done is to port as many of the 2.6 features and fixes as we could. A few did not make it in time, but will be coming in 3.2, alongside the features added as experimental in 2.7.

On the overview:

  • 2.5 has disappeared over the horizon into the long dark night of obsoletion.
  • 2.6 is itself officially aging out now. Supported, but the developer first response is “can you try something newer?”.
  • 2.7 is being maintained for the few extremely high-performance accelerator setups. But in general the Squid-2 sequence is aging out for us developers.
  • 3.0 has reached a point of stability, though not fully-featured.
  • 3.1 is available for testing as the next step up. You should be planning to migrate up to 3.1 or later release.

If there are any features holding you to Squid-2, or even any issues you find when testing Squid-3, speak up. We rely on your input to choose the most needed features for porting.

Thank you all, and enjoy your use of Squid 3.1

Squid-2.6 + TPROXY + Debian

April 7, 2008

Jason Healy posted some useful information to the squid-users list a week or so ago.

Quoting:

I’ve been a happy user of Squid for the past 10 years or so, and I’d like to take a second to thank everyone who has worked so hard to make such a great piece of software!  I’d like to give back to the Squid community, but unfortunately I’m not much of a C hacker.  However, I’m hoping I can still help.

I’ve just spent a few days getting my school’s Squid install up to date (we were running 2.5 on Debian Woody).  I switched to using tproxy this time around (we used to do policy routing on our core, but it was spiking the CPU too much).  Thanks to the mailing list, some articles on the web, and a little messing around I was able to get the whole system up and running.  I’ve documented the steps here:

http://web.suffieldacademy.org/ils/netadmin/docs/software/squid/

The document is written for someone with a decent grasp of Linux, and is specifically geared to Debian Etch.  There are some tweaks that are specific to our install (compile-time flags, mostly), but otherwise it’s pretty generic.  Hopefully, this will help someone else out who’s trying to build a similar system, so I’m posting so it will hit the archives.

Squid Updates – April 2008

April 6, 2008

University studies have begun for me and so my available time has been limited. But to summarise:

  • Squid-3.0 has been released, for people who are interested in playing with it
  • Kinkie has updated the Wiki theme in a big way – http://wiki.squid-cache.org/
  • Squid-3 development has migrated to bzr
  • Alex is looking to merge in the first set of eCAP related changes into Squid-3.HEAD
  • Squid-2.7 is on track to be released – there’s one outstanding bug and it’s unfortunately difficult to fix. http://www.squid-cache.org/bugs/show_bug.cgi?id=2160 is the bug to watch.
  • Funded Squid-2 development will continue for the time being; mostly from projects I’m working on. We’ll see how things progress there. The Squid-2 Roadmap is slowly changing, evolving and being completed.

Squid-2 performance work: graph #1

January 23, 2008