Archive for the ‘Uncategorized’ Category

Squid-3.4 Transaction Annotations

October 20, 2015

Transaction Annotations is a feature added in Squid-3.4 which is being used solve some of the more annoying and difficult old problems with Squid configuration and performance. But it seems as yet has not made much of a splash in general usage.

The basic concept of these annotations grows out of the old external ACL helpers tag= feature. Originally the ACL helpers could add one tag to the client request state data and later ACLs could test for its value even in the “fast” type ACL checks without having to repeat any complex work the helper already did or risking unreliable match results.

With Squid-3.4 we took this nice little feature and extended it to the extreme.

  • the helper protocols got altered so any helper could produce key=value details and send them back to Squid. But not only that; they can send any key name not just ‘tag’, they can send multiple keys and even repeat one key multiple times.
  • the note directive was added so admin could configure some requests to always be marked with certain annotations.
  • logging codes were added to record annotations individually, or in groups to the log files.
  • a note ACL type to match these helper annotations. Replacing the original tag ACL type.

With Squid-4 external ACL have also been updated to accept any logformat code. Many of these are useful in themselves, but for this article we shall focus on the note format code.

Use Case #1: Re-checking authentication without 407 or 401

It is easy to find oneself writing access permissions that require testing the username but not wanting the client to be presented with a 407/401 or popup if the test fails.

In the past the only way to do this was to use a proxy_auth ACL with explicit username(s) listed. Followed by a non-authentication ACL test or the ‘all hack’.

With annotations there is now a third option. After an initial authentication ACL check has been passed the user= annotation has been added to the request. Simply checking the note ACL to test for the user key having been supplied by an authentication (or external ACL) helper with value being whatever username(s) to match.

Use Case #2: User based outgoing IP (or TOS)

Making Squid use a particular outgoing IP or TOS for one user but not for others has been very difficult for almost the whole exitence of Squid. If each user had an identifiable source IP it was not too bad, but once they used a downstream proxy all bets were off. The tcp_outgoing_* directives simply do not support helper lookups reliably.

With annotations, we can use the note ACL in a tcp_outgoing_addr or tcp_outgoing_tos access list to test for the user key having been supplied by an authentication (or external ACL) helper with value being whatever username we want to match.

Use Case #2: Fast group ACLs

In complex networks with many user groups being assigned and controlling different Squid functionality one may find oneself trying to optimize performance for a large number of separate external ACLs which only check for the users membership of a group.

Remembering that these are slow async lookups, and the resulting restriction to asynchronous (slow group) access controls can lead to administrative problems and some annoying workarounds in squid.conf.

With annotations, we can reduce the group lookup to a single helper query which returns a list of group=X annotations. Then use the note ACL again to test without any fast/slow group access control restrcictions.

If you are using a custom authenticator you could then even have it produce that list of groups alongside the user= credentials. Completely avoiding the need for an external ACL helper.

squid-cache.org outage

September 7, 2014

We are currently weathering both a PSU meltdown and disk failure (with full compliment of side effects) on the main squid-cache.org server. The Squid sysadmin and Foundation are all working on it as fast as possible.

Donations towards the purchase of a new server are greatly appreciated and will allow us to buy better hardware.

UPDATE: 2014-10-08: after weeks of late nights and very early mornings by the volunteer sysadmin team we are nearly all back up to full operational state again. The latest status of each affected major service is outlined below.

Mail and Mailing lists

The mail server for squid-cache.org was worst hit. Email has been down across most of September since the initial event. Any email sent to a squid-cache.org email address will have been held up and/or bounced.

Mail services are now back, but some spam control upgrades were forced on us that are still having fallout. Mailing lists are being migrated to a new domain name lists.squid-cache.org. Existing subscriptions have been automatically moved to the new list domain. You can expect to see an initial post explaining the change when the list you are subscribed to is recovered.

This change may require updates to mail filtering and rules outside our services. If you are aware of any in your domain or systems please see that they are updated.

IMPORTANT: some subscriptions have subsequently had to be removed due to backscatter spam from relays and corporate ticket logging systems. Posters to the list know who I am talking about. If you find your subscription has gone silent again recently please check the systems you are having mail delivered to and through then try re-subscribing to the new list.

Mail archives are currently split between the old hypermail + swift archival system and new pipermail. This is reflected on the website. If you are maintaining a mirror of the Squid mail archives please subscribe to our new mailing list for mirror operators and get in touch with the sysadmin team to sort out what is going to happen with mail mirrors in future.

DNS

We believe this is recovered. If anyone is still having issues resolving the domains please get in touch with noc @ lists.squid-cache.org.

Code Repository

The repository has been fully recovered and service on bzr.squid-cache.org and rsync is resumed.

FTP

The FTP service has been limping along with access but no updates. The main server is now in the process of being rebuilt from scratch. Please do not be surprised if you are suddenly challenged for login, try a mirror instead. Anonymous access to the main FTP will be resumed ASAP.

Website

The http://www.squid-cache.org site is mostly up and running. Mirrors have remained available for the duration, but were not being updated with daily contents. The updates should now have resumed, but there are still a few kinks to work out in the content. If you find any issues going forward please report it in our bugzilla under Project Services.

Mirror Services  and rsync

If you are running a WWW mirror please ensure you are using rsync access and your server is capable of serving the http://www.squid-cache.org name as outlined in the mirror guidelines. Similar goes for FTP mirrors. We are adding a new mailing list for mirror server contacts. Our database of registered contacts for HTTP and FTP mirrors will be automatically subscribed so please keep an eye on the mailbox you registered with us already. Anyone running a Squid mirror of any kind please subscribe and post your mirror details to the list.

The rsync service itself is running with some data shares temporarily disabled. These will be re-opened as the services are brought back to full functionality. There are no changes to remote configurations provided you have been following the current mirror guidelines. The dynamic website (http-files-dyn) will no longer be publicly available, please mirror the static (http-files) instead.

Apologies for the inconvenience.

… and Murphys Law has not finished with us yet:

Some security vulnerabilities were reported. A new squid-3.4.8 package has been released to resolve those. All users relying on SNMP or the pinger helper are advised to upgrade. The SNMP details can be found here, pinger details can be found here.

HTTP/1.1 update obsoleting RFC2616, is complete

June 7, 2014

If you have not been aware of the IETF HTTPbis Working Group and what we do, it is chartered to improve HTTP. For the last decade and a half  HTTP/1.1 has been defined by the monolithic and sometimes confusing RFC2616 document with a relatively few extensions. The WG has been putting in a lot of effort to simplify the texts and clarify how the protocol actually works.

If you have been putting off reading the HTTP/1.1 specification because of its enormous length now is a good time to dive in. The text has never been simper and easier to read. Changes from the old document have been kept minimal, but there are some listed in the Appendices.

Mark Nottingham the WG chairman made this formal announcement a few hours ago:

The revision of HTTP/1.1’s specification, obsoleting RFC2616, is complete.

See:
 http://tools.ietf.org/html/rfc7230 – Message Syntax and Routing
 http://tools.ietf.org/html/rfc7231 – Semantics and Content
 http://tools.ietf.org/html/rfc7232 – Conditional Requests
 http://tools.ietf.org/html/rfc7233 – Range Requests
 http://tools.ietf.org/html/rfc7234 – Caching
 http://tools.ietf.org/html/rfc7235 – Authentication

Along with the related documents:
 http://tools.ietf.org/html/rfc7236 – Authentication Scheme Registrations
 http://tools.ietf.org/html/rfc7237 – Method Registrations

Thanks to everyone who has commented upon, reviewed and otherwise contributed to them over this nearly seven-year(!) effort.

Special thanks to our Area Directors over the years: Lisa Dusseault, Alexey Melnikov, Peter Saint-Andre and Barry Leiba, along with Yves Lafon, who helped edit Range Requests.

Finally, please warmly thank both Roy Fielding and Julian Reschke the next time you see them (I believe beer would be appreciated); the amount of effort that they put into these documents is far, far more than they originally signed up for, and they’ve done an excellent job.

Now, onwards to HTTP/2

P.S. This document set’s completion also has enabled the publication of these related non-WG documents:
 http://tools.ietf.org/html/rfc7238 – The Hypertext Transfer Protocol Status Code 308 (Permanent Redirect)
 http://tools.ietf.org/html/rfc7239 – Forwarded HTTP Extension
 http://tools.ietf.org/html/rfc7240 – Prefer Header for HTTP

 

Oh! And one more thank you, to Mark Baker for serving as Shepherd for the Caching doc.

 

Squid Software Foundation Board of Directors Position Vacancy

September 10, 2013

The Squid Software Foundation is seeking to expand the board of directors. We currently have three directors and looking for at least one more to join the team. For details about the position and what the directors do please see http://www.squid-cache.org/Foundation/director.html

Being a Squid Software Foundation Director is a serious responsibility, but also a cool gig! Not only can you have an immediate and significant impact on the Squid Project, but you can earn admiration and respect of your peers while doing more than just your usual software development, system administration, or support activities.

Do you want to brag about being more than a successful geek? Exercise the parts of your brain you did not know you had? Resolve real-world conflicts and balance real-world trade-offs? Then how about solving a few difficult Squid Project problems? Want to spice up your resume or simply learn to manage a popular open source project? Consider nominating yourself!

Applicants should contact board@squid-cache.org with nomination for the position of Director. Self-nominations are accepted and encouraged. Please indicate why you think the nominee would be a good Foundation director.

Please submit nominations by October 4th, 2013.
The Squid Software Foundation Board of Directors
Henrik Nordström,
Amos Jeffries,
Alex Rousskov.

Squid-3.2: managing dynamic helpers

May 2, 2013

One of the new features brought in with Squid-3.2 is dynamic helpers. A brief name for a very useful administrative tool and like all tools can be both easy and tricky to use at the same time.

If you have a proxy using helper processes but only a small cache (or none) this is a feature for you.

The good news

Configuration is super easy – just set the initial startup, maximum number of helpers and an idle value for how many to start when new ones are needed to handle the request load.

Dying helpers have a higher threshold before they kill Squid. It is not perfectly tuned yet in the code so improvements will contnue to happen here, but already we see small bursts of helper failures being suppressed by re-started replacements without that all too familiar Squid halt with “helper dying too quickly” messages. Note that this is just higher, not gone.

The bad news

Determining what those values should be is no more easy or straightforward than before. Squid uses fork() to start new helpers. The main side effect of this is that helper instances started while Squid is running will require a virtual memory size equivalent to the Squid worker process memory at the time they are started. If your proxy is pushing the box to its limit on RAM, dynamically started helpers could easily push it over to swapping memory at the worst possible time (peak load starting to arrive). Also on the bad news side is that the helpers are run per-worker. Which has the potential to compound the RAM usage problems.

We do have a proposal put to the development team which will almost completely remove this problem. Having the coordinator or a special spawner kid do the forking instead of the heavy workers. But as of this writing nobody is working on it (volunteers welcome, please contact the squid-dev mailing list).

Practice Guidelines

While it may look like the bad news is worse than the good news it does turn out that most installations are small instances or non-caching worker proxies these days. All of which may need lots of helpers, but are not heavy on the RAM requirements. For all these installations dynamic helpers are ideal and in a lot of cases can even be set with zero helpers on startup for a very speedy delay to first request accepted time.

The caching proxy installations with higher memory requirements in the workers can still make use of the dynamic nature to avoid complete outages in the worst-case situations where peak traffic overloads the planned helpers. But should normally be configured as before with enough helpers to meet most needs started before the RAM requirements become too onerous on the worker.

Until at least the bad news problems above are resolved the default behaviour for Squid will continue to be starting all the maximum helpers on startup. So there are no unexpected surprises for upgrading, and the old advice on calculating helper requirements is still useful for determining that maximum.

Bugs Marathon to 3.2 release

March 24, 2012

The new features for Squid-3.2 are now decided and present, the latest builds seem to be running okay. Operating system distributors are starting to work on producing packages for the upcoming release.

So when do we get to see a stable release?

Yes, well. There is just one little problem. Before 3.2 can be released as stable for widespread production use we need to be sure that there are no serious bugs in the new or updated code. Right now we are aware of a few that have not been fixed.

We need assistance fixing bugs in the 3.2 beta.

The serious bug clearing focus actually began two months ago. The worst bugs have now been squashed and we are down to the last few dozen major bugs blocking a stable release. You can find these marked as major, critical, or blocker in our bugzilla. Any assistance finding the causes or working patches for the remaining bugs is very welcome and will help speed up the release process.

IMPORTANT: please ensure that the bugzilla gets your reports.

 

What is the fuss about Squid-2.7?

Squid 3.2 is a little bit unusual. Being the release where the Squid-3 series finally superceeds the Squid 2.6 and 2.7 fork in both common features and performance. Squid-2 has not been actively maintained for more than a year now. Features available in that alternate series of Squid are almost all available in Squid-3.2, the remaining features are expected to be ported over shortly after 3.2 is released stable and developer time becomes more available.

What this means in terms of bugs is that a lot of the 2.6 and 2.7 series bugs are being closed with target milestone of 3.2 when they are fixed or no longer relevant to 3.2 code. So if you are waiting for a 2.7 series bug to be closed, please do not be alarmed when its closed against 3.2 without a 2.7 fix being available.

We expect 3.2 to be useful wherever Squid 2.7 is currently running, if you find the upgrade not working that is a problem we need to resolve as soon as possible. So please give it a try and report problems. Just remember to read the 3.2 release notes carefully, and possibly the 3.1 release notes as well.

By and large these older squid-2 series bugs are not going to block the 3.2 release any more than old 3.0 and 3.1 bugs will. But identifying and closing bugs no longer relevant will benefit everyone by allowing us to focus more on the bugs which are still biting people.

There are also hundreds of minor bugs which can be worked on as well.

Language Negotiation and the world-wide-Squid

September 30, 2009

From 3.1 Squid now supports Automatic Language Negotiation.  There seems to be a little bit of confusion over what this means and what should be configured.

Obviously we would like people to enable and use the automatics. For some very good reasons which you shall understand at the end of this post. I would hope you agree by then too.

Most software you and the rest of the world will be familiar with comes in two  forms: English, or translated into your own language. You might have your computer set to  non-English language and all the software that can changes text so you can more easily read it.

All of this is very you-centric and only affects whatever machine you are using. The www is a very different beast altogether. It has to deal with everyone. At the same time too.

The best example is search engine results. You may have noticed when you do a search that some results have little tags. cached, similar pages, more, … and sometimes one called ‘translate’.  This is nice, because it means the search engine has noticed that the page is in a language you may not know and its offering a link that will translate the page to one you can read.

Ever wondered ‘how does it know’? and more importantly;  what does all this have to do with Squid?

Lets start with the second one:  What does this have to do with Squid?  well Squid. The one I run, the one you probably run, and many others around the world generate error pages.  You are sure to have seen the “404 Not Found” at some point. Probably “Access Denied” and “Connection Failed” as well.

Until now Squid has been setup and managed by someone for a specific purpose. That person sets the language those pages are displaying to something they can read and see what problems are. And here is where the confusion seems to start.

One admin who setup the new Squid promptly changed the error_directory language to German (de). Quite rightly he thought. I’m German, my customers are German, who needs any other languages installed? It will only confuse me to see other language errors. And the server is set to German so it won’t show any others anyway.

At this point I’m guessing you might agree with some or all of that assumption. For your language in the same situation, you would probably do the same yes?

Lets take a look at that search engine question. We found a website. It is written strangely in Persian. We do not have a clue whats its about. Clicking on the ‘translate’ link and we read the page.

But wait, …

… we only saw one single ‘translate’ link and surely the engine knows many languages. We should see a whole bunch, one for every language the page might be translated into.

This is where we get closer to Squid again. The HTTP protocol has a header where the browser says what languages its current user would like things displayed in. The search engine is reading that header and only showing the translate link for most prefered language it can cope with.

This is precisely what Squid now does for the error pages it creates. The language displayed depends on the visitor doing the reading when the automatics are allowed to run.  The server Squid runs on has nothing to do with the language.

Our German admin if you recall set the error_directory to German so he could read it.

Too bad for us if you or I non-German readers had a problem getting to one of his customers websites. Or if we were visiting one of his customers and using their Internet access from our laptop.

What he should have done was leave error_directory unset. When he visits the proxy to test a problem it shows german language, because has browser says to. The user who reported the problem might be reading the same message in Chinese, or Korean.

Squid provides error pages for two reasons, to explain whats gone wrong, and to explain to someone what to do about the problem.  In this world of many international people your visitors and users could be coming from any kind of background with any kind of language needs. To help reduce the number of strange language half-understood complaints we all receive the Squid team have made Squid explain things in a language the visitor can read, so you don’t have to. All you have to do is turn it on.

http://wiki.squid-cache.org/Translations#What_has_been_done.3F

Squid now speaks in over 130 national languages and dialects. 100 more than this same time just last year. Some are more complete than others, improving all the time.

Kia Ora koe.

Continuous Integration

August 18, 2009

For the last few years there has been a slow growing improvement to the testing and QA Squid is subject to. This last week has seen the construction and rollout  of a full-scale build farm to replace some of our simple internal testing. Robert Collins covers the growth process in his blog.

Here is the initial release notice:

Hi, a few of us dev’s have been working on getting a build-test environment up and running. We’re still doing fine tuning on it but the basic facility is working.

We’d love it if users of squid, both individuals and corporates, would consider contributing a test machine to the buildfarm.

The build farm is at http://build.squid-cache.org/ with docs about it at http://wiki.squid-cache.org/BuildFarm.

What we’d like is to have enough machines that are available to run test builds, that we can avoid having last-minute scrambles to fix things at releases.

If you have some spare bandwidth and CPU cycles you can easily volunteer.

We don’t need test slaves to be on all the time – if they aren’t on they won’t run tests, but they will when the come on. We’d prefer machines that are always on over some-times on.

We only do test builds on volunteer machines after a ‘master’ job has passed on the main server. This avoids using resources up when something is clearly busted in the main source code.

Each version of squid we test takes about 150MB on disk when idle, and when a test is going on up to twice that (because of the build test scripts).

We currently test:

  • 2.HEAD
  • 3.0
  • 3.1
  • 3.HEAD

I suspect we’ll add 2.7 to that list. So I guess we’ll use abut 750MB of disk if a given slave is testing all those versions.

Hudson, our build test software, can balance out the machines though – if we have two identical platforms they will each get some of the builds to test.

So, if your favorite operating system is not currently represented in the build farm, please let us know – drop a mail here or to noc @ squid-cache.org – we’ll be delighted to hear from you, and it will help ensure that squid is building well on your OS!

-Rob

That just about covers everything. Hardware and build software requirements are listed in the build farm page.

Hi, a few of us dev's have been working on getting a build-test
environment up and running. We're still doing fine tuning on it but the
basic facility is working.

We'd love it if users of squid, both individuals and corporates, would
consider contributing a test machine to the buildfarm.

The build farm is at http://build.squid-cache.org/ with docs about it at
http://wiki.squid-cache.org/BuildFarm.

What we'd like is to have enough machines that are available to run test
builds, that we can avoid having last-minute scrambles to fix things at
releases.

If you have some spare bandwidth and CPU cycles you can easily
volunteer. 

We don't need test slaves to be on all the time - if they aren't on they
won't run tests, but they will when the come on. We'd prefer machines
that are always on over some-times on.

We only do test builds on volunteer machines after a 'master' job has
passed on the main server. This avoids using resources up when something
is clearly busted in the main source code.

Each version of squid we test takes about 150MB on disk when idle, and
when a test is going on up to twice that (because of the build test
scripts).

We currently test
2.HEAD
3.0
3.1
3.HEAD

and I suspect we'll add 2.7 to that list. So I guess we'll use abut
750MB of disk if a given slave is testing all those versions.

Hudson, our build test software, can balance out the machines though -
if we have two identical platforms they will each get some of the builds
to test.

So, if your favorite operating system is not currently represented in
the build farm, please let us know - drop a mail here or to noc @
squid-cache.org - we'll be delighted to hear from you, and it will help
ensure that squid is building well on your OS!

-Rob

How cachable is google (part 2) – Youtube content

November 17, 2007

Youtube is (one of) the bane of small-upstream network administrators. The flash files are megabytes in size, and a popular video can be downloaded by half the people in the office or student residential college in one afternoon.

It is, at the present time, very difficult to cache. Lets see why.

There’s actually two different methods employed to serve the actual flash media files that I’ve seen. The first method involves fetching from youtube.com servers; the second involves fetching from IP addresses in Google IP space.

The first method is very simple: the URL form is:

http://XXX-YYY.XXX.youtube.com/get_video?video_id=VIDEO_ID

XXX is the pop name; YYY is I’m guessing either a server or a cluster name.

This is pretty standard stuff – and If-Modified-Since requests seem to also be handled badly too! The query string “?” in the URL makes it uncachable to Squid by default, even though its a flash video. Its probably not going to change very often.

The second method involves a bit more work. First the video is requested from a google server. This server then issues a HTTP 302 reply pointing the content at a changing IP address. This request looks somewhat like this:

http://74.125.15.83/get_video?video_id=HrLFb47QHi0&origin=dal-v37.dal.youtube.com

Again, the “?” query string. Again, the origin, but its encoded in the URL. Finally, not only are If-Modified-Since requests not handled correctly, the replies include ETags and requests with an If-None-Match revalidation still return the whole object! Aiee!

So how to cache it?

Firstly, you have to try and cache replies with a “?” reply. It would be nice if they handled If-Modified-Since and If-None-Match requests correctly when the object hasn’t been modified – revalidation is cheap and its basically free bandwidth. They could set the revalidation to be, say, after even 30 minutes – they’re already handling all the full requests for all the content, so the request rate would stay the same but the bandwidth requirements should drop.

The URLs also have to rewritten, much like they do to cache google maps content. The “canonical” form URL will then reference a “video” regardless of which server the client is asking.

Now, how do you do this in Squid? I’ve got some beta code to do this and its in the Squid-2 development tree. Take a look here for some background information. It works around the multiple-URL-referencing-same-file problem but it won’t unfortunately work around their broken HTTP/1.1 validation code. If they fixed that then Youtube may become something which network administrators stop asking to filter.

(ObNote: the second method uses lighttpd as the serving software; and it replies with a HTTP/1.1 reply regardless of whether the request was HTTP/1.0 or HTTP/1.1. Grr!)

Web Cache Whitepapers/Articles

August 24, 2007

Why bother with Squid as a purely proxy server? Isn’t most of the content on the Internet today dynamic?

Perhaps; perhaps not. A few years ago “media caching” required licenced software to handle WMA and RealMedia streams; today the heavy bandwidth users are flash videos from popular sites such as YouTube. The HTML may not be cachable but all those thumbnail images, all those previews and all those large flash video files are very cachable. The problem isn’t that the Internet is “dynamic”; the problem is that website designers view caching as “evil” – they’re suddenly not 100% in control of their content – and try as hard as possible to dodge caching.

Squid has a few knobs which can be set to cache this so-called “dynamic” content. Squid has to treat everything which may be dynamic as uncacheable – the telltail “?” in the URL identifying the output as being from a script – when in fact the content isn’t all that dynamic. More on that will be covered in a future article.

ISPs who run Squid with a well-tuned configuration have shown web traffic savings of around 30%. Thats 30% of their traffic, not just hits. And thats not with any attempt at caching the “dynamic” content which can actually be cached – Youtube and Windows Updates are two big offenders here.

So Squid isn’t that useless at all!

A couple of articles which give an overview of caching follow. They’re dated – the technology isn’t new after all – and just as applicable today.