Welcoming a Squid Project sponsor: Rackspace

November 14, 2013

Testing is a very important part of the life of a project such as Squid, especially as we wish to support as many different operating systems and platforms as we can, as efficiently as we can.
This also means that we can’t restrict ourselves to well-known POSIX APIs; we need to account for each OS’s particular quirks and optimizations, especially in those areas which need deep integration with the OS.

One of the tools we added a few years ago is the build farm. When new code is committed to a followed branch, our Jenkins setup notices it and schedules a test run on all the eligible nodes. This means we need to rely on a mix of operating systems, libraries and compilers.

We are always on the lookout for sponsors, both individuals and companies, and back in July Rackspace volunteered to donate space for the Squid Project on their public cloud infrastructure. As of now, we have 11 virtual machines and one database hosted in their cloud, running mostly various flavors of Linux and FreeBSD. The experience has been very satisfying; in particular I’ve found the claim about “fanatical support” to be well-founded in the dedication and kindness of everyone who was involved in making this happen.

It was not all rose petals, of course. Our needs mean that we are better off with something resembling the bare metal, while the model Rackspace offers is more oriented towards having predefined images which configure certain aspects such as network addresses, admin users’ passwords etc.

Still, it is very refreshing that if an OS is among the list of supported images (as an example, Ubuntu Saucy was available about three days after Canonical released it) setting up a new build node in the farm takes literally 15 minutes, most of which is spent waiting for the server image to be cloned from the template.

The first effects of the newly expanded build farm are already apparent: we have reached a stable build status on a wide number of Linux, FreeBSD and OpenBSD versions, with gcc, clang and icc compilers. We are starting to set sight on MacOSX and to bring Squid 3 to Windows. All of this would have been much harder, or even impossible, without the Build Farm.

So on behalf of everyone who works on Squid, I’d like to thank Rackspace for their kind contribution to the Project.

Squid Software Foundation Board of Directors Position Vacancy

September 10, 2013

The Squid Software Foundation is seeking to expand the board of directors. We currently have three directors and are looking for at least one more to join the team. For details about the position and what the directors do, please see http://www.squid-cache.org/Foundation/director.html

Being a Squid Software Foundation Director is a serious responsibility, but also a cool gig! Not only can you have an immediate and significant impact on the Squid Project, but you can earn admiration and respect of your peers while doing more than just your usual software development, system administration, or support activities.

Do you want to brag about being more than a successful geek? Exercise the parts of your brain you did not know you had? Resolve real-world conflicts and balance real-world trade-offs? Then how about solving a few difficult Squid Project problems? Want to spice up your resume or simply learn to manage a popular open source project? Consider nominating yourself!

Applicants should contact board@squid-cache.org with a nomination for the position of Director. Self-nominations are accepted and encouraged. Please indicate why you think the nominee would be a good Foundation director.

Please submit nominations by October 4th, 2013.
The Squid Software Foundation Board of Directors
Henrik Nordström,
Amos Jeffries,
Alex Rousskov.

The Squid Software Foundation

July 17, 2013

As many of you know by now, the Squid Project has formed the Squid Software Foundation, a non-profit organization to support Squid developer and user communities. The Foundation aims to provide community governance and representation; logistical and administrative assistance; infrastructure resources; as well as copyright ownership and licensing authority for the Squid Project.

http://www.squid-cache.org/Foundation/

The Foundation has been recognized by the US government as a 501(c)(3) public charity. This recognition assures the public of our not-for-profit status, exempts the Foundation from paying most taxes, and makes qualified donations to the Foundation tax-deductible. This designation is a big deal, and it took a lot of effort to get this status. A big “Thank You” to those who made it possible!

The Foundation structure has been modeled after the FreeBSD Foundation, with various bits and pieces borrowed from the Apache Foundation and other well-run open source non-profits. You can find our bylaws and other documents at

http://www.squid-cache.org/Foundation/archive/

The Foundation is currently being run by three volunteers: Henrik Nordström, Amos Jeffries, and Alex Rousskov. There is a lot more work than the three of us can handle, and there are plans to expand the Board of Directors to at least four directors this year. Watch this space for further announcements on the subject.

While the Foundation is not going to participate in Squid development, there are plenty of infrastructure and community activities where we can help both Squid developers and administrators. The scope of possible projects ranges from release maintenance, to paid services coordination, to build farm administration, to automated regression testing, to experimental live deployments. With sufficient funding, we may even be able to organize developer meetups and user workshops.

The Foundation is funded by donations. The startup capital was provided by your PayPal payments that the Squid Project informally accumulated since 2007 and a Measurement Factory contribution. We will need a lot more funds to fulfill the Foundation mission though :-). Donations are now gratefully accepted at

http://www.squid-cache.org/Foundation/donate.html

If you have questions about the Foundation, please email them to

board@squid-cache.org.

Thank you,

The Squid Software Foundation Board of Directors
Henrik Nordström,
Amos Jeffries,
Alex Rousskov.

Squid-3.2: managing dynamic helpers

May 2, 2013

One of the new features brought in with Squid-3.2 is dynamic helpers: a brief name for a very useful administrative tool which, like all tools, can be both easy and tricky to use at the same time.

If you have a proxy using helper processes but only a small cache (or none) this is a feature for you.

The good news

Configuration is super easy – just set the initial startup and maximum number of helpers, plus an idle value for how many to start when new ones are needed to handle the request load.
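As a sketch of the Squid-3.2 syntax (the counts below are illustrative, not tuning recommendations):

```
# squid.conf - dynamic helper settings in Squid-3.2.
# Start 5 helpers at startup, allow up to 32 in total, and spawn
# 2 more whenever extra helpers are needed for the request load:
auth_param basic children 32 startup=5 idle=2

# URL-rewrite helpers take the same on-demand options:
url_rewrite_children 32 startup=5 idle=2
```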

Dying helpers have a higher threshold before they kill Squid. It is not perfectly tuned yet in the code, so improvements will continue to happen here, but already we see small bursts of helper failures being suppressed by re-started replacements without that all too familiar Squid halt with “helper dying too quickly” messages. Note that this threshold is just higher, not gone.

The bad news

Determining what those values should be is no more easy or straightforward than before. Squid uses fork() to start new helpers. The main side effect of this is that helper instances started while Squid is running will require a virtual memory size equivalent to the Squid worker process memory at the time they are started. If your proxy is pushing the box to its limit on RAM, dynamically started helpers could easily push it over to swapping memory at the worst possible time (peak load starting to arrive). Also on the bad news side is that the helpers are run per-worker, which has the potential to compound the RAM usage problems.

We do have a proposal put to the development team which would almost completely remove this problem: having the coordinator or a special spawner kid do the forking instead of the heavy workers. But as of this writing nobody is working on it (volunteers welcome; please contact the squid-dev mailing list).

Practice Guidelines

While it may look like the bad news is worse than the good news, it turns out that most installations these days are small instances or non-caching worker proxies. All of these may need lots of helpers, but are not heavy on RAM requirements. For all these installations dynamic helpers are ideal, and in a lot of cases they can even be set with zero helpers on startup, for a very short delay before the first request is accepted.

The caching proxy installations with higher memory requirements in the workers can still make use of the dynamic nature to avoid complete outages in worst-case situations where peak traffic overloads the planned helpers. But they should normally be configured as before, with enough helpers to meet most needs started up-front, before the RAM requirements become too onerous on the worker.

Until at least the bad news problems above are resolved the default behaviour for Squid will continue to be starting all the maximum helpers on startup. So there are no unexpected surprises for upgrading, and the old advice on calculating helper requirements is still useful for determining that maximum.

Squid-3.2: Pragma, Cache-Control, no-cache versus storage

October 16, 2012

The no-cache setting in HTTP has always been a misunderstood beastie. The instinctual reaction for developers everywhere is to believe that it prevents caching or cache handling or some such myth.

This is not true.

By definition it merely forces caches to revalidate existing content before use (ie it tells the proxy to “be ultra, super-duper conservative. Do not send anything from cache without first contacting the server to double check it.”).

When sent on a client (browser) request:

  • Pragma:no-cache instructs HTTP/1.0 caches to revalidate any cached response before using it.
  • Cache-Control:no-cache instructs HTTP/1.1 caches to revalidate any cached response before using it.
  • Pragma:no-cache only works for HTTP/1.1 caches when Cache-Control is missing.
  • All other values of Pragma are undefined and are to be ignored.

When sent on a server response:

  • Pragma in all its forms has no meaning whatsoever. It must be ignored.
  • Cache-Control:no-cache instructs HTTP/1.1 caches to revalidate this response every time it is re-used.

If you read those bullet points above very carefully you will notice that at no point is store mentioned. None whatsoever. The closest it gets is mentioning what to do with already-stored content (revalidate it). In fact the HTTP/1.1 specification goes as far as to say explicitly that responses with no-cache MAY be stored – provided the revalidation is done as above.

no-cache in Squid

The well-known Squid versions of the past have all been HTTP/1.0 compliant and advertised themselves as HTTP/1.0 software. These proxies looked for Pragma:no-cache headers and obeyed them:

  • Squid being HTTP/1.0, Pragma took precedence over Cache-Control.
  • Due to lack of full HTTP/1.1 revalidation in very old versions Squid has traditionally treated no-cache in either header as if it were Cache-Control:no-store.
  • Due to some old server software Pragma:no-cache on responses was treated as a mistaken form of Cache-Control:no-store.

Starting with version 3.2 Squid is advertising and attempting to fully support HTTP/1.1 specifications. This is a game changer.

All of the above is about to be up-ended, assumptions can be thrown away and some funky cool proxy behaviour allowed to take place.

Hiding in the background is the instruction that Pragma only applies when Cache-Control is missing from a request. We can ignore it – almost completely. When we do have to pay attention we only need to notice the no-cache value and can treat it as if we received Cache-Control:no-cache.

The other change is a potential game changer: the object being transferred is stored now, revalidated later.

Some implications from storing no-cache responses:

  • servers can utilize 304 responses instead of generating new content. Saving a lot of bandwidth and CPU cycles.
  • all those configuration hacks for ignoring or stripping no-cache are no longer needed. Also, the harm they do will become more visible as revalidation is skipped.
  • cache HIT ratio can potentially rise above 50% for forward proxies. As a side effect of the hit-counting market, a large portion of web traffic uses no-cache instead of no-store or private. This large portion is cacheable, but until now Squid has been dropping it.
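As a concrete example of the second point, the well-known violation hack from Squid-2.7/3.1 configs looked like this (the pattern here is hypothetical):

```
# Squid-2.7/3.1 style hack - do NOT carry this into 3.2 configs.
# It forced no-cache responses to be cached and served without
# the revalidation the server asked for:
refresh_pattern -i example\.com/.* 10 20% 60 ignore-no-cache
```

In Squid-3.2 the ignore-no-cache option is gone: no-cache responses are stored anyway and revalidated on reuse, which is both compliant and more effective.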

Before the marketing department panics about the end of the world, let’s be clear on one important point:

revalidation means every client request will still reach the end server doing HIT counting, traffic control, whatever – but in a way which allows 304 bandwidth optimization on the responses.

Do not expect a sudden rise of TCP_HIT in the proxy logs though. It is more likely to show up as TCP_REFRESH_HIT or the nasty TCP_REFRESH_MODIFIED/TCP_REFRESH_MISS which is produced by broken web applications always sending out new unchanged content.

Happy Eyeballs

July 14, 2012

Geoff Huston wrote up a very interesting analysis of the RFC 6555 “Happy Eyeballs” features being added to web browsers recently.

As these features reach the mainstream stable browser releases and more people begin using them, Squid in the role of interception proxy is starting to face the same issues mentioned for CGN gateways, for all the same reasons. Whether you are operating an existing interception proxy or installing a new one, this is one major new feature of the modern web which needs to be taken into account when provisioning the network and Squid socket/FD resources.

Squid operating as a forward proxy does not face this issue, as each browser only opens a limited number of connections to the proxy. Although Firefox’s implementation of the “Happy Eyeballs” algorithm appears to have been instrumental in uncovering a certain major bug in Squid’s new connection handling recently.

A Squid Implementation

For those interested, Squid-3.2 does implement by default a variation of the “Happy Eyeballs” algorithm.

DNS lookups are now performed in parallel, as opposed to serially as they were in 3.1. As a result the maximum DNS lookup time is reduced from the sum of the A and AAAA response times to the maximum of the two.

TCP connection attempts are still run in serial, but where older versions of Squid interspersed a DNS lookup with each set of TCP attempts the new 3.2 code identifies all the possible destinations first and tries each individual address until a working connection is found. Retries under the new version are also now limited per-address where in the older versions each retry meant a full DNS result set of addresses was re-tried.

As a result dns_timeout is separated from connect_timeout, which now controls only one individual TCP connection handshake.
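The two timeouts can now be tuned independently; a sketch (values are illustrative, not recommendations):

```
# squid.conf - independent timeouts in Squid-3.2:
dns_timeout 30 seconds      # cap on the parallel A + AAAA lookups
connect_timeout 15 seconds  # cap on each individual TCP handshake,
                            # no longer shared across the whole set
                            # of destination addresses
```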

Bugs Marathon to 3.2 release

March 24, 2012

The new features for Squid-3.2 are now decided and present, the latest builds seem to be running okay. Operating system distributors are starting to work on producing packages for the upcoming release.

So when do we get to see a stable release?

Yes, well. There is just one little problem. Before 3.2 can be released as stable for widespread production use we need to be sure that there are no serious bugs in the new or updated code. Right now we are aware of a few that have not been fixed.

We need assistance fixing bugs in the 3.2 beta.

The serious bug clearing focus actually began two months ago. The worst bugs have now been squashed and we are down to the last few dozen major bugs blocking a stable release. You can find these marked as major, critical, or blocker in our bugzilla. Any assistance finding the causes or working patches for the remaining bugs is very welcome and will help speed up the release process.

IMPORTANT: please ensure that the bugzilla gets your reports.

 

What is the fuss about Squid-2.7?

Squid 3.2 is a little bit unusual, being the release where the Squid-3 series finally supersedes the Squid 2.6 and 2.7 fork in both common features and performance. Squid-2 has not been actively maintained for more than a year now. Almost all features available in that alternate series of Squid are available in Squid-3.2; the remaining features are expected to be ported over shortly after 3.2 is released stable and developer time becomes more available.

What this means in terms of bugs is that a lot of the 2.6 and 2.7 series bugs are being closed with a target milestone of 3.2 when they are fixed or no longer relevant to the 3.2 code. So if you are waiting for a 2.7 series bug to be closed, please do not be alarmed when it is closed against 3.2 without a 2.7 fix being available.

We expect 3.2 to be useful wherever Squid 2.7 is currently running; if you find the upgrade not working, that is a problem we need to resolve as soon as possible. So please give it a try and report problems. Just remember to read the 3.2 release notes carefully, and possibly the 3.1 release notes as well.

By and large these older squid-2 series bugs are not going to block the 3.2 release any more than old 3.0 and 3.1 bugs will. But identifying and closing bugs no longer relevant will benefit everyone by allowing us to focus more on the bugs which are still biting people.

There are also hundreds of minor bugs which can be worked on as well.

Proxying HTTPS for an internal service

June 18, 2011

Since version 2.6 changed the way http_port worked and let Squid service multiple different types of traffic simultaneously, people have been struggling with one setup which should, to all outward appearances, be quite simple.

I’m speaking of the scenario where you have a proxy serving as both a forward-proxy gateway for the internal LAN users and as a reverse-proxy gateway for some SSL secured internal services (an HTTPS internal site).

Both setups are essentially simple. For the reverse-proxy you set up an origin cache_peer with SSL certificate options, perhaps with an https_port to receive external traffic. For the forward-proxy you set up users’ browsers to contact the proxy for their HTTP and HTTPS requests, perhaps with NAT interception to force those who refuse.
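A sketch of the combined configuration (all addresses, names and certificate paths below are hypothetical):

```
# squid.conf - forward-proxy port for LAN browsers:
http_port 3128

# Reverse-proxy port for the SSL-secured internal service:
https_port 443 cert=/etc/squid/internal-site.pem

# Origin cache_peer with SSL options, restricted to the one domain:
cache_peer 10.0.0.10 parent 443 0 no-query originserver ssl name=internalSite
acl internalSiteDomain dstdomain .internal.example.com
cache_peer_access internalSite allow internalSiteDomain
cache_peer_access internalSite deny all
```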

Then you discover that Squid can’t seem to relay requests from internal users to your internal peer. You get warnings about clientNegotiateSSL failing on plain HTTP requests, even though it may appear the user was opening HTTPS properly to contact it.

The problem is that when relaying through a known proxy, browsers wrap their SSL request inside a CONNECT tunnel setup request, which is plain-text HTTP. Squid passes this intact on to any cache_peers you have configured, even the origin one which is expecting SSL. It may do the right thing and wrap it in a second layer of SSL, but that just makes things worse, as the server at the other end gets this weird CONNECT request it can’t do anything with.

Until recently the only fix has been to set up a bypass so that internal LAN users don’t use the proxy when visiting the internal HTTPS site. That works perfectly for user access, but it causes problems for the recording and accounting systems, which now have to track two sets of logs and filter proxy-relayed requests out of one.

Or, alternatively, to set the LAN DNS to point users at the reverse-proxy port and figure out some way to avoid forwarding loops, either by bypassing Squid as above or by disabling the loop detection.

Both alternatives have the same problems at best. In the worst case of the second, you have opened some security vulnerabilities by ignoring loops.

In Squid-3.1 we have trialled two possible ways to fix this whole situation.

The first attempt was to simply not relay CONNECT to peers with the origin type configured. This failed with a few unwanted side effects. One was that Squid would look up the DNS and go directly to that server – fine for most, but not all Squid installations have split-DNS available. Or Squid could relay the request to a non-origin peer instead, possibly halfway round the world, with worse lag effects than a little extra calculation handling the logs.

The second attempt, which we are currently running with in 3.1.12 and later, is to strip the CONNECT header and connect the tunnel straight to the peer – but only when the peer port matches the intended destination of that tunnel, and your access controls permit it for selection.

  • The port restriction is there as a simple check that the service is likely to match protocols, even if we can’t be sure which.
  • Traffic to that internal service does go through the proxy, and traffic accounting only has to handle the proxy logs.
  • Requests from LAN clients use the clients’ SSL certificates instead of the cache_peer configured ones.

This last point is one which can bite or confuse. If you have LAN users in this type of scenario and require all contact with the internal service to use the proxy-configured certificates, you will still need to configure those clients with the old methods.

 

Enjoy. And as always, if you have better ideas or problems please let us know.

Squid Proxy Server 3.1: Beginner’s Guide

February 25, 2011

For those who have been waiting and asking, there is now a beginner’s guide to Squid-3.1 available for sale from Packt Publishing, authored by Kulbir Saini.

This book seeks to be an introductory guide to Squid, and specifically to the features available in the Squid-3 series so far. It covers both the basics and the tricky details admins need to be aware of and understand when working with Squid. From configuring access controls through deployment scenarios to managing and monitoring the proxy operation, this book has it covered.

It does not seek to be an update to the O’Reilly book, so there are many fine details and advanced technical descriptions missing. Although even experienced Squid admins may find new topics and features mentioned here that they were unaware of.

Disabling Authentication

February 10, 2011

As you may know, the Squid proxy builds and runs in some pretty interesting places, amongst them the WRT operating system. Admittedly, getting Squid small and light enough to run there is a major miniaturization job all by itself, and its size still prevents other deserving applications from being used.

One of the problems that keeps popping up in this miniaturization effort is the --disable-auth option. One would naturally expect such an option to disable authentication in Squid. It does not. So far all it does is prevent the additional helpers from being built, and that only in the 3.2 series.

This month my spare-time task has been to make that directive work: to actually omit from the Squid binary as much code as possible which processes and manipulates the authentication information.

So far progress is looking good: around 900KB of the default binary size is removed, along with 100KB or so of run-time footprint. There are some not so obvious side effects of removing authentication which I’m going to cover here.
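For reference, the directive in question is a build-time ./configure option; a sketch of the two granularities discussed in this post:

```
# Remove all authentication support from the build:
./configure --disable-auth

# Or disable only particular schemes:
./configure --disable-auth-basic --disable-auth-negotiate
```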

These changes have already landed in 3.HEAD and will be present in the 3.2 betas shortly.

In the security industry there are three terms: identification (of who or what something is), authentication (that some identification is true and correct) and authorization (that someone or thing is allowed to perform an action). Almost all access through Squid has a number of features acting to test these in various combinations. Luckily most of it takes the form of simple identification and authorization.

Authentication and ACLs

Naturally, this is the goal of the project.

--disable-auth removes the auth_param directives and also all ACL types which process usernames – other than the ident username ACL, which does not involve actual authentication.
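For context, the combination being removed is the familiar auth_param plus proxy_auth pairing (the helper path and password file below are hypothetical):

```
# squid.conf - everything here is stripped out by --disable-auth:
auth_param basic program /usr/lib/squid/basic_ncsa_auth /etc/squid/passwd
auth_param basic realm Example proxy
acl authed proxy_auth REQUIRED
http_access allow authed
```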

IDENT

This protocol is all about identification and authorization; no authentication is involved. As such its use has not been impacted in any visible way in the current Squid. However, I’m noting it here since another side project underway is unifying the username details.

--disable-auth removes the shared username objects and may in future affect storage of IDENT results.

External ACLs

External ACL helpers are permitted to perform what is called side-band authentication and return to Squid the username and password for an authenticated user. Since this is a form of authentication it dumps results into the username structures at various points.

--disable-auth will disable and remove the storage of these usernames as a possibility, effectively disabling this type of authentication despite any support which may remain in the helper.

FTP

The FTP protocol requires authentication credentials to be passed to the FTP server, and Squid has several ways of doing this. Anonymous credentials are set in squid.conf and tried first. The URL may be sent containing an unsecured username and password in clear text.

Alternatively recent versions of Squid will look for HTTP Basic authentication credentials to use. Lacking any working credentials Squid will normally now reply with an HTTP authentication request, resulting in the user being able to enter their FTP login into a popup box or browser password manager.
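The anonymous credentials mentioned above come from squid.conf; a sketch (the address is hypothetical):

```
# squid.conf - anonymous FTP login tried first, before any other credentials:
ftp_user anonymous@example.com

# URLs may also carry clear-text credentials directly:
#   ftp://user:pass@ftp.example.com/pub/file
```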

--disable-auth, or even just --disable-auth-basic, will now prevent the HTTP authentication method from working in Squid’s FTP gateway.

Cache Manager

Squid’s HTTP management interface, used by cachemgr.cgi, squidclient and other tools to fetch reports and perform administrative actions, relies on authentication for certain actions.

Despite the fancy action@password style its URLs use, this is converted into a real HTTP login header by the tools. The management interface relies on this authentication for all actions marked as protected.
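A sketch of the two halves (the password here is hypothetical):

```
# squid.conf - protect the shutdown action with a password:
cachemgr_passwd s3cret shutdown

# On the command line, squidclient turns action@password into an
# HTTP Basic login header:
#   squidclient mgr:shutdown@s3cret
```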

--disable-auth, or even just --disable-auth-basic, will now prevent the Squid cache manager interface from receiving these credentials, effectively blocking all access to the protected actions.

Delay Pools

The fancy new delay pools in Squid-3 rely on user credentials for some of their bucket types. If one thinks about it closely, it becomes clear that without any authentication such pools are useless.
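The credential-dependent case is the class 4 pool, whose fourth bucket pair is keyed on the authenticated username; a sketch (rates and sizes are illustrative):

```
# squid.conf - a class 4 delay pool with a per-user bucket:
delay_pools 1
delay_class 1 4
# pairs are: aggregate, per-network, per-individual, per-user
delay_parameters 1 -1/-1 -1/-1 -1/-1 16000/64000
acl authed proxy_auth REQUIRED
delay_access 1 allow authed
```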

--disable-auth will prevent the class 4 delay pool from being built and available.

Peering

The cache_peer login directive which may normally pass on credentials to a peer via HTTP authentication headers has a complex and mixed relationship with authentication.

--disable-auth prevents login= from generating any HTTP headers, since the auth encoding is disabled. However, PASSTHRU will still work. PASS is reduced to an exact equivalent of PASSTHRU, since it may not generate a header for external-ACL authenticated users.

--disable-auth-basic prevents login=username:password from working, since the basic authentication header creation is disabled. However, NEGOTIATE and PASSTHRU keep working. PASS is reduced to an equivalent of PASSTHRU, since externally ACL-authenticated users require Basic encoding.

--disable-auth-negotiate prevents login=NEGOTIATE from working, since the negotiate authentication header creation is disabled. However, all the other forms still work.
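For reference, the login= forms discussed above as they appear in configuration (the peer name is hypothetical):

```
# squid.conf - relay whatever credentials the client sent:
cache_peer parent.example.com parent 3128 0 login=PASSTHRU

# Fixed Basic credentials for the peer (needs --enable-auth-basic):
#cache_peer parent.example.com parent 3128 0 login=user:secret

# Squid's own Negotiate login to the peer (needs --enable-auth-negotiate):
#cache_peer parent.example.com parent 3128 0 login=NEGOTIATE
```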

