Author Archive

Squid Software Foundation Board of Directors Position Vacancy

September 10, 2013

The Squid Software Foundation is seeking to expand the board of directors. We currently have three directors and are looking for at least one more to join the team. For details about the position and what the directors do, please see http://www.squid-cache.org/Foundation/director.html

Being a Squid Software Foundation Director is a serious responsibility, but also a cool gig! Not only can you have an immediate and significant impact on the Squid Project, but you can earn admiration and respect of your peers while doing more than just your usual software development, system administration, or support activities.

Do you want to brag about being more than a successful geek? Exercise the parts of your brain you did not know you had? Resolve real-world conflicts and balance real-world trade-offs? Then how about solving a few difficult Squid Project problems? Want to spice up your resume or simply learn to manage a popular open source project? Consider nominating yourself!

Applicants should contact board@squid-cache.org with a nomination for the position of Director. Self-nominations are accepted and encouraged. Please indicate why you think the nominee would be a good Foundation director.

Please submit nominations by October 4th, 2013.
The Squid Software Foundation Board of Directors
Henrik Nordström,
Amos Jeffries,
Alex Rousskov.

Squid-3.2: managing dynamic helpers

May 2, 2013

One of the new features brought in with Squid-3.2 is dynamic helpers. It is a brief name for a very useful administrative tool and, like all tools, it can be both easy and tricky to use at the same time.

If you have a proxy using helper processes but only a small cache (or none) this is a feature for you.

The good news

Configuration is super easy – just set the initial startup, maximum number of helpers and an idle value for how many to start when new ones are needed to handle the request load.

Dying helpers have a higher threshold before they kill Squid. It is not perfectly tuned yet in the code so improvements will continue to happen here, but already we see small bursts of helper failures being suppressed by re-started replacements without that all too familiar Squid halt with “helper dying too quickly” messages. Note that this is just higher, not gone.

The bad news

Determining what those values should be is no easier or more straightforward than before. Squid uses fork() to start new helpers. The main side effect of this is that helper instances started while Squid is running will require a virtual memory size equivalent to the Squid worker process memory at the time they are started. If your proxy is pushing the box to its limit on RAM, dynamically started helpers could easily push it over to swapping memory at the worst possible time (peak load starting to arrive). Also on the bad news side is that the helpers are run per-worker, which has the potential to compound the RAM usage problems.
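To get a feel for the worst case described here, a rough back-of-envelope sketch in Python may help. It pessimistically assumes every helper forked at runtime is charged the full virtual size of its worker; copy-on-write usually keeps real memory use well below that. The function name and the figures are illustrative only, not taken from Squid.

```python
def worst_case_helper_vsize_mb(worker_vsize_mb, helpers_per_worker, workers=1):
    """Upper bound on virtual memory claimed by dynamically forked helpers.

    Assumption: each helper forked at runtime is charged the full virtual
    size of the worker that forked it, and every worker forks its own set.
    """
    return worker_vsize_mb * helpers_per_worker * workers


# Hypothetical figures: 2 workers of 1500 MB each, 10 helpers forked per worker.
estimate = worst_case_helper_vsize_mb(1500, 10, workers=2)
print(f"worst-case extra virtual memory: {estimate} MB")
```

Running the same numbers with a small non-caching worker (say 50 MB) shows why dynamic helpers are much less of a worry on those installations.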

We do have a proposal put to the development team which will almost completely remove this problem: having the coordinator or a special spawner kid do the forking instead of the heavy workers. But as of this writing nobody is working on it (volunteers welcome, please contact the squid-dev mailing list).

Practice Guidelines

While it may look like the bad news is worse than the good news, it does turn out that most installations these days are small instances or non-caching worker proxies. All of these may need lots of helpers, but are not heavy on RAM requirements. For all these installations dynamic helpers are ideal, and in a lot of cases can even be set with zero helpers on startup for a very short delay before the first request is accepted.

The caching proxy installations with higher memory requirements in the workers can still make use of the dynamic nature to avoid complete outages in the worst-case situations where peak traffic overloads the planned helpers. But they should normally be configured as before, with enough helpers to meet most needs started before the RAM requirements become too onerous on the worker.

Until at least the bad news problems above are resolved, the default behaviour for Squid will continue to be starting the maximum number of helpers on startup. That way there are no surprises when upgrading, and the old advice on calculating helper requirements is still useful for determining that maximum.

Squid-3.2: Pragma, Cache-Control, no-cache versus storage

October 16, 2012

The no-cache setting in HTTP has always been a misunderstood beastie. The instinctual reaction for developers everywhere is to believe that it prevents caching or cache handling or some such myth.

This is not true.

By definition it merely forces caches to revalidate existing content before use (i.e. it tells the proxy to “be ultra, super-duper conservative. Do not send anything from cache without first contacting the server to double check it.”).

When sent on a client (browser) request:

  • Pragma:no-cache instructs HTTP/1.0 caches to revalidate any cached response before using it.
  • Cache-Control:no-cache instructs HTTP/1.1 caches to revalidate any cached response before using it.
  • Pragma:no-cache only works for HTTP/1.1 caches when Cache-Control is missing.
  • all other values of Pragma are undefined and to be ignored.

When sent on a server response:

  • Pragma in all its forms has no meaning whatsoever. It must be ignored.
  • Cache-Control:no-cache instructs HTTP/1.1 caches to revalidate this response every time it is re-used.

If you read those bullet points above very carefully you will notice that at no point is store mentioned. None whatsoever. The closest it gets is mentioning what to do with already-stored content (revalidate it). In fact the HTTP/1.1 specification goes as far as to say explicitly that responses with no-cache MAY be stored – provided the revalidation is done as above.
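The rules in the bullet points can be sketched as a few small Python functions. This is only an illustration of the semantics described here, not Squid's code: the function names are made up, headers are assumed to arrive as a plain dict with canonical capitalisation, and the directive parsing is deliberately naive. The no-store and private directives appear only for contrast, since those are the ones that actually restrict storage in a shared cache.

```python
def parse_directives(value):
    """Split a value such as 'no-cache, max-age=0' into lowercase directive names."""
    if not value:
        return set()
    return {part.split("=", 1)[0].strip().lower()
            for part in value.split(",") if part.strip()}


def request_has_no_cache(headers):
    """True when a client request carries an effective no-cache."""
    cache_control = headers.get("Cache-Control")
    if cache_control is not None:
        # Pragma is only consulted when Cache-Control is missing.
        return "no-cache" in parse_directives(cache_control)
    return "no-cache" in parse_directives(headers.get("Pragma"))


def response_policy(headers):
    """Return (may_store, must_revalidate) for a server response.

    Pragma on a response is ignored entirely.
    """
    directives = parse_directives(headers.get("Cache-Control"))
    may_store = "no-store" not in directives and "private" not in directives
    must_revalidate = "no-cache" in directives
    return may_store, must_revalidate


print(response_policy({"Cache-Control": "no-cache"}))
```

The point to notice is that a no-cache response comes back as storable, with revalidation required on every re-use.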

no-cache in Squid

The well-known Squid versions of the past have all been HTTP/1.0 compliant and advertised themselves as HTTP/1.0 software. These proxies both looked for Pragma:no-cache headers and obeyed them:

  • Squid being HTTP/1.0, Pragma took precedence over Cache-Control.
  • Due to lack of full HTTP/1.1 revalidation in very old versions Squid has traditionally treated no-cache in either header as if it were Cache-Control:no-store.
  • Due to some old server software Pragma:no-cache on responses was treated as a mistaken form of Cache-Control:no-store.

Starting with version 3.2 Squid is advertising and attempting to fully support HTTP/1.1 specifications. This is a game changer.

All of the above is about to be up-ended, assumptions can be thrown away and some funky cool proxy behaviour allowed to take place.

Hiding in the background is the instruction that Pragma only applies when Cache-Control is missing from a request. We can ignore it – almost completely. When we do have to pay attention we only need to notice the no-cache value and can treat it as if we received Cache-Control:no-cache.

The other change is a potential game changer: The object being transferred is stored now, revalidated later.

Some implications from storing no-cache responses:

  • servers can utilize 304 responses instead of generating new content. Saving a lot of bandwidth and CPU cycles.
  • all those configuration hacks for ignoring or stripping no-cache are no longer needed. Also, the harm they do will become more visible as revalidation is skipped.
  • cache HIT ratio can potentially rise above 50% for forward proxies. As a side effect of the HIT counting market a large portion of web traffic is utilizing no-cache instead of no-store or private. This large portion is cacheable but until now Squid has been dropping it.

Before the marketing department panics about the end of the world, let’s be clear on one important point:

Revalidation means every client request will still reach the end server doing HIT counting, traffic control, whatever – but in a way which allows 304 bandwidth optimization on the responses.

Do not expect a sudden rise of TCP_HIT in the proxy logs though. It is more likely to show up as TCP_REFRESH_HIT or the nasty TCP_REFRESH_MODIFIED/TCP_REFRESH_MISS which is produced by broken web applications always sending out new unchanged content.
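For anyone who wants to see why a revalidated hit still saves bandwidth, here is a toy Python model of the exchange. It is a sketch under simple assumptions: validation uses an ETag only, the "origin" is just a function, and none of the names correspond to anything inside Squid.

```python
def origin_respond(current_etag, body, if_none_match=None):
    """Toy origin server: answer 304 with no body when the validator matches."""
    if if_none_match == current_etag:
        return 304, current_etag, b""
    return 200, current_etag, body


def serve_from_cache(cached, origin):
    """Revalidate a stored no-cache response before using it.

    `cached` is a dict with 'etag' and 'body'; `origin` is a callable taking
    if_none_match and returning (status, etag, body).
    Returns (body_sent_to_client, body_bytes_fetched_from_origin).
    """
    status, etag, body = origin(if_none_match=cached["etag"])
    if status == 304:
        return cached["body"], 0                 # stored copy is still good
    cached["etag"], cached["body"] = etag, body  # replace the stale copy
    return body, len(body)


page = b"x" * 10_000
cached = {"etag": '"v1"', "body": page}
unchanged = lambda if_none_match=None: origin_respond('"v1"', page, if_none_match)
print(serve_from_cache(cached, unchanged)[1])
```

The origin sees every request, so its counters keep working, but when nothing has changed no body bytes cross the wire.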

Happy Eyeballs

July 14, 2012

Geoff Huston wrote up a very interesting analysis of the RFC 6555 “Happy Eyeballs” features being added to web browsers recently.

As these features reach the mainstream stable browser releases and more people begin using them, Squid installations in the role of intercepting proxy are starting to face the same issues mentioned for CGN gateways. For all the same reasons. Whether you are operating an existing interception proxy or installing a new one, this is one major new feature of the modern web which needs to be taken into account when provisioning the network and Squid socket/FD resources.

Squid operating as a forward proxy does not face this issue, as each browser only opens a limited number of connections to the proxy. That said, the Firefox implementation of the “Happy Eyeballs” algorithm appears to have been instrumental in uncovering a certain major bug in Squid’s new connection handling recently.

A Squid Implementation

For those interested, Squid-3.2 does implement by default a variation of the “Happy Eyeballs” algorithm.

DNS lookups are performed in parallel now, as opposed to serial as they were in 3.1. As a result the maximum DNS lookup time is reduced from the sum of A and AAAA response times, to the maximum of both.
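As a small worked example of that arithmetic, with made-up response times (the function names are purely illustrative):

```python
def serial_lookup_ms(a_ms, aaaa_ms):
    """One lookup only starts after the other has finished."""
    return a_ms + aaaa_ms


def parallel_lookup_ms(a_ms, aaaa_ms):
    """Both lookups are in flight at once; we wait for the slower one."""
    return max(a_ms, aaaa_ms)


# Hypothetical timings: a fast A answer and a slow AAAA answer.
a_ms, aaaa_ms = 40, 250
print(serial_lookup_ms(a_ms, aaaa_ms), parallel_lookup_ms(a_ms, aaaa_ms))
```

With these numbers the wait drops from 290 ms to 250 ms; the saving is always the faster of the two response times.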

TCP connection attempts are still run in serial, but where older versions of Squid interspersed a DNS lookup with each set of TCP attempts the new 3.2 code identifies all the possible destinations first and tries each individual address until a working connection is found. Retries under the new version are also now limited per-address where in the older versions each retry meant a full DNS result set of addresses was re-tried.
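The shape of that connection loop can be sketched in Python as follows. This is a simplified model rather than Squid's implementation: the function name, the `try_connect` callback and the retry count are all assumptions, and the real code deals with timeouts and events that are left out here.

```python
def connect_first_working(addresses, try_connect, retries_per_address=1):
    """Try each already-resolved address in order until one connects.

    `try_connect(address)` is a caller-supplied function that returns a
    connection object or raises OSError. Retries are counted per address,
    not per full DNS result set.
    Returns (connection, list_of_addresses_attempted).
    """
    attempts = []
    for address in addresses:
        for _ in range(1 + retries_per_address):
            attempts.append(address)
            try:
                return try_connect(address), attempts
            except OSError:
                continue  # retry this address, then move on to the next
    raise OSError(f"all {len(addresses)} destinations failed "
                  f"after {len(attempts)} attempts")
```

The key difference from the older behaviour is that the address list is fixed before the first attempt, so a failure never triggers another DNS lookup mid-way.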

As a result dns_timeout is separated from connect_timeout, which now controls only one individual TCP connection handshake.

Bugs Marathon to 3.2 release

March 24, 2012

The new features for Squid-3.2 are now decided and present, the latest builds seem to be running okay. Operating system distributors are starting to work on producing packages for the upcoming release.

So when do we get to see a stable release?

Yes, well. There is just one little problem. Before 3.2 can be released as stable for widespread production use we need to be sure that there are no serious bugs in the new or updated code. Right now we are aware of a few that have not been fixed.

We need assistance fixing bugs in the 3.2 beta.

The serious bug clearing focus actually began two months ago. The worst bugs have now been squashed and we are down to the last few dozen major bugs blocking a stable release. You can find these marked as major, critical, or blocker in our bugzilla. Any assistance finding the causes or working patches for the remaining bugs is very welcome and will help speed up the release process.

IMPORTANT: please ensure that the bugzilla gets your reports.

 

What is the fuss about Squid-2.7?

Squid 3.2 is a little bit unusual, being the release where the Squid-3 series finally supersedes the Squid 2.6 and 2.7 fork in both common features and performance. Squid-2 has not been actively maintained for more than a year now. Features available in that alternate series of Squid are almost all available in Squid-3.2; the remaining features are expected to be ported over shortly after 3.2 is released stable and developer time becomes more available.

What this means in terms of bugs is that a lot of the 2.6 and 2.7 series bugs are being closed with target milestone of 3.2 when they are fixed or no longer relevant to 3.2 code. So if you are waiting for a 2.7 series bug to be closed, please do not be alarmed when it’s closed against 3.2 without a 2.7 fix being available.

We expect 3.2 to be useful wherever Squid 2.7 is currently running, if you find the upgrade not working that is a problem we need to resolve as soon as possible. So please give it a try and report problems. Just remember to read the 3.2 release notes carefully, and possibly the 3.1 release notes as well.

By and large these older squid-2 series bugs are not going to block the 3.2 release any more than old 3.0 and 3.1 bugs will. But identifying and closing bugs no longer relevant will benefit everyone by allowing us to focus more on the bugs which are still biting people.

There are also hundreds of minor bugs which can be worked on as well.

Proxying HTTPS for an internal service

June 18, 2011

Since version 2.6 changed the way http_port worked and let Squid service multiple different types of traffic simultaneously people have been struggling with one setup which should to all outward appearances be quite simple.

I’m speaking of the scenario where you have a proxy serving as both a forward-proxy gateway for the internal LAN users and as a reverse-proxy gateway for some SSL secured internal services (an HTTPS internal site).

Both setups are essentially simple. For the reverse-proxy you set up an origin cache_peer with SSL certificate options, and perhaps an https_port to receive external traffic. For the forward-proxy you set up users’ browsers to contact the proxy for their HTTP and HTTPS requests, perhaps with NAT interception to force those who refuse.

Then you discover that Squid can’t seem to relay requests from internal users to your internal peer. You get warnings about clientNegotiateSSL failing on plain HTTP requests, even though it may appear the user was opening HTTPS properly to contact it.

The problem is that when relaying through a known proxy, browsers wrap their SSL request inside a CONNECT tunnel setup request, which is plain-text HTTP. Squid passes this intact on to any cache_peers you have configured, even the origin one which is expecting SSL. It may do the right thing and wrap it in a second layer of SSL, but that just makes things worse, as the server at the other end gets this weird CONNECT request it can’t do anything with.

Until recently the only fix has been to set up a bypass so that internal LAN users don’t use the proxy when visiting the internal HTTPS site. That works perfectly for user access, but causes problems for the recording and accounting systems, which now have to track two sets of logs and filter proxy-relayed requests out of one.

The alternative is to set the LAN DNS to point users at the reverse-proxy port and figure out some way to avoid forwarding loops, either by bypassing Squid as above or by disabling the loop detection.

Both alternatives have the same problems at best. In the worst case, the second opens security vulnerabilities by ignoring loops.

In Squid-3.1 we have trialled two possible ways to fix this whole situation.

The first attempt was to simply not relay CONNECT to peers with origin type configured. This failed with a few unwanted side effects. One was that Squid would look up the DNS and go to that server, which is fine for most, but not all Squid have split-DNS available. Or Squid could relay it to a non-origin peer instead, possibly halfway round the world, with worse lag effects than a little extra calculation handling the logs.

The second attempt, which we are currently running with in 3.1.12 and later, is to strip the CONNECT header and connect the tunnel straight to the peer, but only when the peer port matches the intended destination of that tunnel and your access controls permit it for selection.

  • The port restriction is there as a simple check that the service is likely to match protocols, even if we can’t be sure which.
  • Traffic to that internal service does go through the proxy and traffic accounting only has to handle the proxy logs.
  • Requests from LAN clients use the client’s SSL certificates instead of the cache_peer configured ones.
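The selection rule from the second attempt can be modelled in a few lines of Python. This is a sketch only: the function names are invented, the hostname is a placeholder, and the access-control decision is reduced to a boolean passed in by the caller.

```python
def parse_connect_target(request_line):
    """Extract (host, port) from a line like 'CONNECT host:443 HTTP/1.1'."""
    method, target, _version = request_line.split()
    if method != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, _, port = target.rpartition(":")
    return host, int(port)


def tunnel_straight_to_peer(request_line, peer_port, acl_allows_peer):
    """Model of the rule: strip CONNECT and tunnel to the origin peer only
    when the ports match and access controls allow selecting that peer."""
    _host, port = parse_connect_target(request_line)
    return acl_allows_peer and port == peer_port


# What the browser sends to a known proxy, in plain text, before any SSL:
line = "CONNECT intranet.example.com:443 HTTP/1.1"
print(tunnel_straight_to_peer(line, peer_port=443, acl_allows_peer=True))
```

A CONNECT to any other port, or one the access controls do not route to the peer, falls back to normal handling.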

This last point is one which can bite or confuse. If you have LAN users in this type of scenario and require all contact with the internal service to use the proxy configured certificates you will still need to configure those clients with the old methods.

 

Enjoy. And as always, if you have better ideas or problems please let us know.

Squid Proxy Server 3.1: Beginner’s Guide

February 25, 2011

For those who have been waiting and asking, there is now a beginner’s guide to Squid-3.1 available for sale from Packt Publishing, authored by Kulbir Saini.

This book seeks to be an introductory guide to Squid and specifically to the features available in the Squid-3 series so far. It covers both the basics and the tricky details admins need to be aware of and understand when working with Squid. From configuring access controls through deployment scenarios to managing and monitoring the proxy operation, this book has it.

It does not seek to be an update to the O’Reilly book, so there are many fine details and advanced technical descriptions missing. Even so, experienced Squid admins may find new topics and features mentioned here that they were unaware of.

Disabling Authentication

February 10, 2011

As you may know Squid proxy builds and runs in some pretty interesting places. Amongst these is the WRT operating system. Although admittedly getting Squid small and light enough to run there is a major miniaturization job all by itself and its size still prevents other deserving applications from being used.

One of the problems that keeps popping up in this miniaturization effort is the --disable-auth option. One would naturally expect such an option to disable authentication in Squid. It does not. So far all it does is prevent the additional helpers being built, and that only in the 3.2 series.

This month my spare-time task has been to make that option work: to actually omit from the Squid binary as much as possible of the code which processes and manipulates authentication information.

So far progress is looking good: around 900KB of the default binary size is removed, and 100KB or so of run-time footprint. There are some not so obvious side effects of removing authentication which I’m going to cover here.

These changes have already landed in 3.HEAD and will be present in the 3.2 betas shortly.

In the security industry there are three terms: identification (of who or what something is), authentication (that some identification is true and correct) and authorization (that someone or thing is allowed to perform an action). Almost all access through Squid has a number of features acting to test these in various combinations. Luckily most of it takes the form of simple identification and authorization.

Authentication and ACLs

Naturally, this is the goal of the project.

--disable-auth removes auth_param directives and also all ACL types which process usernames, other than the ident username ACL, which does not involve actual authentication.

IDENT

This protocol is all about identification and authorization. No authentication involved. As such its use has not been impacted in any visible way in the current Squid. However I’m noting it here since another side project underway is unifying the username details together.

--disable-auth removes the shared username objects and may in future affect storage of IDENT results.

External ACLs

External ACL helpers are permitted to perform what is called side-band authentication and return to Squid the username and password for an authenticated user. Since this is a form of authentication it dumps results into the username structures at various points.

--disable-auth will disable and remove the storage of these usernames as a possibility. Effectively disabling this type of authentication despite any support which may remain in the helper.

FTP

FTP protocol requires authentication credentials to be passed to the FTP server. Squid has several ways of doing this. Anonymous credentials are set in squid.conf and tried first. The URL may be sent containing an unsecured username and password in clear text.

Alternatively recent versions of Squid will look for HTTP Basic authentication credentials to use. Lacking any working credentials Squid will normally now reply with an HTTP authentication request, resulting in the user being able to enter their FTP login into a popup box or browser password manager.

--disable-auth or even just --disable-auth-basic will now prevent the HTTP authentication method from working in Squid’s FTP gateway.

Cache Manager

Squid’s HTTP management interface, used by cachemgr.cgi, squidclient and other tools to fetch reports and perform administrative actions, relies on authentication for certain actions.

Despite the fancy action@password style its URLs use, this is converted into a real HTTP login header by the tools. The management interface relies on this authentication for all actions marked as protected.

--disable-auth or even just --disable-auth-basic will now prevent the Squid cache manager interface from receiving these credentials, effectively blocking all access to the protected actions.

Delay Pools

The fancy new delay pools in Squid-3 rely on user credentials for some of their bucket types. If one thinks about it closely it becomes clear that without any authentication such pools are useless.

--disable-auth will prevent the class 4 delay pool from being built and available.

Peering

The cache_peer login directive which may normally pass on credentials to a peer via HTTP authentication headers has a complex and mixed relationship with authentication.

--disable-auth prevents login= from generating any HTTP headers, since the auth encoding is disabled. However PASSTHRU will still work. PASS is reduced to an exact equivalent of PASSTHRU since it may not generate a header for external ACL authenticated users.

--disable-auth-basic prevents login=username:password from working, since the basic authentication header creation is disabled. However NEGOTIATE and PASSTHRU remain working. PASS is reduced to an equivalent of PASSTHRU since the external ACL authenticated users require Basic encoding.

--disable-auth-negotiate prevents login=NEGOTIATE from working, since the negotiate authentication header creation is disabled. However all the other forms still work.
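Since that is a lot of combinations to keep in your head, here is the same information restated as a small Python lookup. It covers only the login= forms named in this post, it reads "prevents login= from generating any HTTP headers" as meaning username:password and NEGOTIATE both stop working under --disable-auth, and the names used are illustrative rather than anything from the Squid source.

```python
ALL_FORMS = {"username:password", "PASS", "PASSTHRU", "NEGOTIATE"}

# Which cache_peer login= forms still do something under each build option.
WORKING_FORMS = {
    "--disable-auth": {"PASSTHRU", "PASS"},
    "--disable-auth-basic": {"NEGOTIATE", "PASSTHRU", "PASS"},
    "--disable-auth-negotiate": ALL_FORMS - {"NEGOTIATE"},
}

# Build options under which PASS quietly behaves like PASSTHRU.
PASS_DEGRADED = {"--disable-auth", "--disable-auth-basic"}


def effective_login(build_option, form):
    """Return the behaviour actually obtained, or None if the form stops working."""
    if form not in WORKING_FORMS[build_option]:
        return None
    if form == "PASS" and build_option in PASS_DEGRADED:
        return "PASSTHRU"
    return form


for option in sorted(WORKING_FORMS):
    print(option, {form: effective_login(option, form) for form in sorted(ALL_FORMS)})
```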

Project Update: Bugs, bugs, betas and breakages

October 24, 2010

So much for a monthly update with hints and tips. Been a year and more! So I guess this post should start again by covering what has been going on over this past year. “Not much and far too much” just about covers it …

 

Bugs …

Business as usual has seen a steady parade of upgrades from earlier versions up to 3.0 and 3.1, bringing with it a steady stream of bug reports for all sorts of varied things overlooked or not yet fixed. The increasing number of reports people are giving of old bugs that have annoyed them without being reported is pleasing in a way. I hope this means less overall and display stuff is relatively easy.

Some people have had shocks; bugs reported over a year ago being fixed suddenly and some new reports not lasting the day. I can’t stress enough how important your help in researching the problems is.

The oldest bugs are simple “Squid is doing X”. Um, yeah well, we can assume what the expected behaviour was (sometimes). After prodding (a day or so later) up comes a version number and build options.

For example, I was looking at a typical old bug today, still open from 2007, that never got any further than build options and behaviour description despite several requests for more info. One day somebody is going to have to set up several machines, build 3.0 a couple of times with multiple features on or off, and try to replicate the described problem. Then build 3.HEAD and try to replicate X all over again starting with the same tests. Then try to figure out how it’s happening and whether a fix might be possible. Well, thanks for the report anyway, at least we know to watch for it meanwhile.

Every few months like today, I trawl through these, just over 100 bugs now, looking for ones which I could maybe replicate in less than a few days. Today I’m slightly happy. One died as invalid, and two died as no longer fixable. The whole area of code they might have been in is dead.

 

… bugs …

The shorter-lived bugs, by comparison, are reports that version a.b.c is doing X when it should be doing Y. In the case of crashes and segmentation faults they include a backtrace of the code (with symbol names). They include header or packet traces if the problem was communicating with some other software. Some even came with information tracking down which feature and piece of the code was going wrong. Found, tracked, patched, tested, and fixed in the main code in mere hours of work. A HUGE thank you to everyone who reports bugs like this.

 

… betas …

Squid-3.2 has started its beta releases this year. Following the fluffy “grand plan” of being a bit more predictable:  branch in Jun with monthly betas through until a stable series probably appears Dec or Jan.

The fluffy part is caused by some more structural cleanup projects we really, really want in order to make the code easier to work with. Well, it is late Oct and about half of the cleanups have been done. The stable part is today looking more like next Jan/Feb provided the rest are done and working soon.

Another post closer to the 3.2 stable release will cover what’s going on there in the features. Some cool stuff. If you want a sneak preview the release notes are already available.

 

… and breakages.

Ah, well. Some may recall 3.1.5.1 and friends. The background there for the inquisitive is that two feature changes clashed head-on and killed stability for two months.

I have irregular but available contact with nine of the maintainers packaging Squid for popular systems. It started slowly, growing until half of them had informed me that important sections of their users were being hit by the lack of IPv6 split-stack support in 3.1.

It surprised me a bit how many people, and in what situations, are insisting that IPv6 be turned off, but only partially. This was a big problem for the early 3.1 series since the build process actively probed IPv6 capabilities and turned various capabilities on or off inside Squid. So an essentially simple change was made, moving the active testing into the Squid startup process. Squid builds with IPv6 anywhere and runs it when available. No such luck on simple: enter all the little quiet bits assuming IPv4-only or IPv6-only sockets.

The other feature change that made the recovery even messier was a version bump to libtool in our package bundling system. In short a very simple basic upgrade omission on our part broke the tarball package. A veritable storm of hacks appeared to work around the broken tarballs and extended the fix time as testers disappeared with “it works for me now” leaving the systemic problem unresolved.

3.1.8 finally got some pedantic methodical testing before release and a fix for both problems.

 

In the background to all this drama we have had an increase in focus on HTTP/1.1 support over the year from The Measurement Factory. Many of the minor behaviour bug fixes made it into the 3.1 series, with the last of the big changes being held back to 3.2.

Language Negotiation and the world-wide-Squid

September 30, 2009

From 3.1 Squid now supports Automatic Language Negotiation.  There seems to be a little bit of confusion over what this means and what should be configured.

Obviously we would like people to enable and use the automatics. For some very good reasons which you shall understand at the end of this post. I would hope you agree by then too.

Most software you and the rest of the world will be familiar with comes in two forms: English, or translated into your own language. You might have your computer set to a non-English language, and all the software that can changes its text so you can more easily read it.

All of this is very you-centric and only affects whatever machine you are using. The www is a very different beast altogether. It has to deal with everyone. At the same time too.

The best example is search engine results. You may have noticed when you do a search that some results have little tags: cached, similar pages, more, … and sometimes one called ‘translate’. This is nice, because it means the search engine has noticed that the page is in a language you may not know and it’s offering a link that will translate the page to one you can read.

Ever wondered ‘how does it know’? and more importantly;  what does all this have to do with Squid?

Let’s start with the second one: what does this have to do with Squid? Well, Squid (the one I run, the one you probably run, and many others around the world) generates error pages. You are sure to have seen the “404 Not Found” at some point. Probably “Access Denied” and “Connection Failed” as well.

Until now Squid has been set up and managed by someone for a specific purpose. That person sets the language those pages are displayed in to something they can read, so they can see what the problems are. And here is where the confusion seems to start.

One admin who set up the new Squid promptly changed the error_directory language to German (de). Quite rightly he thought. I’m German, my customers are German, who needs any other languages installed? It will only confuse me to see other language errors. And the server is set to German so it won’t show any others anyway.

At this point I’m guessing you might agree with some or all of that assumption. For your language in the same situation, you would probably do the same yes?

Let’s take a look at that search engine question. We found a website. It is written strangely in Persian. We do not have a clue what it’s about. We click on the ‘translate’ link and read the page.

But wait, …

… we only saw one single ‘translate’ link and surely the engine knows many languages. We should see a whole bunch, one for every language the page might be translated into.

This is where we get closer to Squid again. The HTTP protocol has a header where the browser says what languages its current user would like things displayed in. The search engine is reading that header and only showing the translate link for the most preferred language it can cope with.

This is precisely what Squid now does for the error pages it creates. The language displayed depends on the visitor doing the reading when the automatics are allowed to run.  The server Squid runs on has nothing to do with the language.
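The header in question is Accept-Language. A minimal Python sketch of the negotiation idea follows; it is not Squid's code, the function names are made up, the fallback to "en" is an assumption for the example, and the parsing ignores some corners of the HTTP grammar.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0                      # no q= means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:                    # q=0 means "not acceptable"
            ranked.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(ranked)]


def pick_error_language(header, available, default="en"):
    """Choose the first preferred language we have error templates for."""
    for tag in parse_accept_language(header):
        if tag in available:
            return tag
        primary = tag.split("-")[0]        # fall back from 'de-at' to 'de'
        if primary in available:
            return primary
    return default


installed = {"en", "de", "fa", "ko"}       # hypothetical set of template languages
print(pick_error_language("de-AT,de;q=0.9,en;q=0.5", installed))
print(pick_error_language("ko,en;q=0.3", installed))
```

The same proxy answers the first visitor in German and the second in Korean, without anyone touching the configuration.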

Our German admin if you recall set the error_directory to German so he could read it.

Too bad for us if you or I, as non-German readers, had a problem getting to one of his customers’ websites. Or if we were visiting one of his customers and using their Internet access from our laptop.

What he should have done was leave error_directory unset. When he visits the proxy to test a problem it shows German, because his browser says to. The user who reported the problem might be reading the same message in Chinese, or Korean.

Squid provides error pages for two reasons: to explain what’s gone wrong, and to explain to someone what to do about the problem. In this world of many international people, your visitors and users could be coming from any kind of background with any kind of language needs. To help reduce the number of strange-language, half-understood complaints we all receive, the Squid team have made Squid explain things in a language the visitor can read, so you don’t have to. All you have to do is turn it on.

http://wiki.squid-cache.org/Translations#What_has_been_done.3F

Squid now speaks in over 130 national languages and dialects. 100 more than this same time just last year. Some are more complete than others, improving all the time.

Kia Ora koe.

