How we are using Jenkins and DigitalOcean

May 5, 2021 by

These days my main contribution to the Squid Web Cache project is running the project’s infrastructure, a large part of which is the CI/CD farm.

To run it, we rely on a very kind donation by DigitalOcean. We use a VM hosted there to run the main Jenkins instance and part of the jobs for the x86-64 architecture. We then use the Jenkins DigitalOcean plugin to spin up instances (droplets) on demand when our build jobs need more throughput.

To make the most of our resources, we rely on Docker to run all of our target Linux userlands. This decouples the runtime environment from the machine that’s running it and ensures consistency across builds (also coming up: a proper staging system).

In this post I’ll focus only on how we spin up these instances; the whole setup is a bit more convoluted.

The DigitalOcean plugin is quite well integrated and easy to use. TBH I haven’t tried the plugins for EC2 or GCP, but my other reference point, jclouds, was much harder to configure and set up.

Given our prerequisites, the on-demand instances only need to contain the Docker runtime and Java. Java is needed to run the Jenkins agents because, unlike other setups I’ve found online, ours run outside the Docker containers.

To do that, we supply this config snippet in the “User Data” section:

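#cloud-config
# Upgrade installed packages on first boot, then add what the executor needs.
apt_upgrade: true
package_upgrade: true
packages:
 - openjdk-11-jre-headless
 - docker.io
users:
 # The values in angle brackets are placeholders: fill in your own.
 - name: <name of the jenkins user on the executor machine>
   groups: docker
   shell: /bin/bash
   ssh-authorized-keys:
    - ssh-rsa <ssh public key of the user jenkins runs under>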

These actions run when the droplet is launched and prepare the executor for Jenkins to ssh into it and run the test jobs. To give the droplet time to do that, we wait for it with this init script:

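#!/bin/bash
# Block until cloud-init has finished installing packages and creating the user.

echo "starting init script"
# "cloud-init status" reports "status: done" once it has completed.
while ! cloud-init status | grep -qF 'done'
do
  echo "waiting for cloud-init to be done"
  sleep 10
done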

The next tricky bit is in the Droplet section: in the node Labels field we define a label that triggers instance startup when needed, together with an instance cap. The label can be anything; in our case it is docker-build-host.

Referencing this label in the projects’ configuration matrix will trigger the spinup and imaging. Jenkins will then connect to the droplet via ssh and use docker run commands to test the various runtime environments.
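
As a rough illustration of that last step, the sketch below prints the docker run command for a couple of userland images. The image names, the /src mount point and the test-builds.sh script name are placeholders invented for this example, not the project's actual job configuration; DRY_RUN is likewise just a convenience of the sketch.

```shell
#!/bin/bash
# Hypothetical sketch: image names, mount point and build script are
# placeholders, not the Squid project's real job definitions.
DRY_RUN="${DRY_RUN:-1}"
WORKSPACE="${WORKSPACE:-$PWD}"

# Echo the command in dry-run mode, otherwise execute it.
maybe() {
  if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi
}

# Build and test the checked-out workspace inside one userland image.
run_in_userland() {
  maybe docker run --rm -v "${WORKSPACE}:/src" -w /src "$1" ./test-builds.sh
}

for image in example/debian-build example/fedora-build; do
  run_in_userland "$image"
done
```

With DRY_RUN=0 the same loop would actually start the containers, one userland at a time, on whichever droplet Jenkins picked for the job.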

Converting from MoinMoin to MediaWiki

May 2, 2021 by

The Squid Wiki is hosted on our own instance of MoinMoin. We picked it at the time because it had fewer external dependencies than other engines and it fit the bill.

Over time, and as the number of pages grew, its strengths became limitations, and I’m currently exploring whether to switch to a different engine. MediaWiki is the go-to choice for most people, so that’s what I investigated first.

W3C has developed a tool to convert from one to the other, but it hasn’t been updated in some time, to the point where MediaWiki API changes have left it bit-rotted. It doesn’t help that this tends to be a one-off activity, so it

Open Source to the rescue! I have patched it to support the current API, and it worked for me™. While waiting for the PR to be approved, feel free to use my fork of it.

Project Update

June 4, 2018 by

If one was not following the mailing lists it might seem the Squid project has gone the way of the Dodo. In truth it is quite the opposite. The dev team and Foundation board are working so hard it has been difficult to find time for these additional updates.

 

So, to recap the major projects going on since the last update:

The largest change has been our move to git as our source code repository. That has been a long road, taking up a lot of time over several years. A great big thanks to the various people who worked on it.

Following on from that, we now have GitHub (squid-cache account) as our code repository for public access. The Squid Project’s own repository is no longer directly available for general access. Our code submission process has changed from accepting .patch files to GitHub pull requests. Developers working on code changes, please convert to that (I can still work with patch submissions, but it is significantly more trouble than having your own GitHub account for submission updates/edits). Anyone who forked one of the Squid GitHub repositories prior to the 2017 transition should fork the new repository and convert their code changes.

 

The new code submission process has resolved quite a few issues we had with the old submit and auditing/QA process. There are still a few quirky behaviours caused by github and our automation that cause trouble from time to time – but overall it is a big improvement on what we had before.

The largest issue we face now for QA and code development is manpower. We now have automated change tracking, continuous integration helping out with QA, and a committer bot taking a huge load off my shoulders as maintainer. Our submission process is open and public, so anyone can read the proposals and comment on any bugs they see that have not already been pointed out. Anyone with an interest in the Squid code is encouraged to participate in that process.

 

In the shadow of all those very time-consuming alterations to the Squid Project systems, the dev team has also managed to iron out several of the major bugs blocking the Squid-4 release. Just one of the long-standing bugs remains. A few regressions in recent code have brought up some new major bugs, but those are for the most part already fixed or soon will be. So watch this space for news on further progress there.

What’s going on with Squid 4

July 1, 2016 by

Those that have been paying attention will have noticed that Squid-4 beta cycle has been going on for a very, very long time. A whole year now in fact. There are several reasons for this.

* The ecosystem that Squid is deployed into has a somewhat mixed situation in regards to C++11.

Even 5 years after it was standardized, compiler support is still not readily available in some popular OS distributions. In particular, a large number of people are clinging to the outdated but still supported RHEL 6 and its derived family of OSes, which do not easily provide recent versions of GCC. So we are procrastinating on the deprecation of Squid-3.5 to give more people a chance to move on. The clock is ticking though.

* We had a lack of early adopters testing the early versions of Squid-4.

So bugs that only show up in real usage have been very slow to be found. We don’t make a new version start its beta cycle until the developers are reasonably sure that it’s stable enough to be used. The beta process is supposed to be just a confirmation of that. Indeed, a year ago 4.x looked like it had no bugs at all, which is kind of suspicious, but not impossible. As the betas progressed, though, the bug reports started rolling in. It is something of a testament to how well our modern build systems catch minor and trivial bugs that these tester-reported issues have largely all been difficult to resolve.

* We are experimenting with a new management process for Squid releases.

Previously Squid-5 would have been branched for development and Squid-4 starting to stagnate, er, become “stable”. This time though we have delayed the branching and instead kept Squid-4 as main development branch for minor changes while the beta cycle goes on. That has made it a little more volatile than most of the Squid-3 series betas.

All up, we are down to the last few major bugs to be resolved in the new code. Progress on those is slow but steady. So the Squid-4 production “stable” release should not be too far ahead on the calendar.

Squid-3.4 Transaction Annotations

October 20, 2015 by

Transaction Annotations is a feature added in Squid-3.4 which is being used to solve some of the more annoying and difficult old problems with Squid configuration and performance. But it seems it has not yet made much of a splash in general usage.

The basic concept of these annotations grows out of the old external ACL helpers tag= feature. Originally the ACL helpers could add one tag to the client request state data and later ACLs could test for its value even in the “fast” type ACL checks without having to repeat any complex work the helper already did or risking unreliable match results.

With Squid-3.4 we took this nice little feature and extended it to the extreme.

  • the helper protocols got altered so any helper could produce key=value details and send them back to Squid. But not only that; they can send any key name not just ‘tag’, they can send multiple keys and even repeat one key multiple times.
  • the note directive was added so admin could configure some requests to always be marked with certain annotations.
  • logging codes were added to record annotations individually, or in groups to the log files.
  • a note ACL type was added to match these helper annotations, replacing the original tag ACL type.

With Squid-4, external ACLs have also been updated to accept any logformat code. Many of these are useful in themselves, but for this article we shall focus on the note format code.

Use Case #1: Re-checking authentication without 407 or 401

It is easy to find oneself writing access permissions that require testing the username but not wanting the client to be presented with a 407/401 or popup if the test fails.

In the past the only way to do this was to use a proxy_auth ACL with explicit username(s) listed. Followed by a non-authentication ACL test or the ‘all hack’.

With annotations there is now a third option. Once an initial authentication ACL check has passed, the user= annotation is attached to the request. Simply use the note ACL to test for the user key, as supplied by an authentication (or external ACL) helper, with the value being whatever username(s) you want to match.

Use Case #2: User based outgoing IP (or TOS)

Making Squid use a particular outgoing IP or TOS for one user but not for others has been very difficult for almost the whole existence of Squid. If each user had an identifiable source IP it was not too bad, but once they used a downstream proxy all bets were off. The tcp_outgoing_* directives simply do not support helper lookups reliably.

With annotations, we can use the note ACL in a tcp_outgoing_address or tcp_outgoing_tos access list to test for the user key, as supplied by an authentication (or external ACL) helper, with the value being whatever username we want to match.

Use Case #3: Fast group ACLs

In complex networks where many user groups control different Squid functionality, one may find oneself trying to optimize performance for a large number of separate external ACLs which only check for the user’s membership of a group.

Remember that these are slow async lookups; the resulting restriction to asynchronous (slow group) access controls can lead to administrative problems and some annoying workarounds in squid.conf.

With annotations, we can reduce the group lookup to a single helper query which returns a list of group=X annotations, then use the note ACL again to test them without any fast/slow group access control restrictions.

If you are using a custom authenticator you could even have it produce that list of groups alongside the user= credentials, completely avoiding the need for an external ACL helper.
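
A minimal sketch of such a single group-lookup helper, written as a shell script, might look like the following. It assumes the helper receives one plain username per line and replies without channel IDs (no concurrency). The use of id(1) for the group lookup and the --serve switch are illustrative choices made for this example, not anything Squid ships, and group names containing spaces would need quoting that the sketch does not attempt.

```shell
#!/bin/bash
# Hypothetical external ACL helper sketch: one username per input line,
# one reply per line. Looking groups up with id(1) is an assumption; a real
# deployment would more likely query LDAP or a database.

# Turn a whitespace-separated list of group names into a helper reply line.
format_reply() {
  reply="OK"
  for g in $1; do
    reply="$reply group=$g"
  done
  echo "$reply"
}

# Answer every lookup arriving on stdin until it is closed.
serve() {
  while read -r user; do
    if names="$(id -Gn -- "$user" 2>/dev/null)"; then
      format_reply "$names"
    else
      echo "ERR"
    fi
  done
}

# Only enter the loop when asked to, so the functions can be tried by hand.
if [ "${1:-}" = "--serve" ]; then
  serve
fi
```

Each OK reply carries one group= annotation per group, which is what a note ACL on the group key can then match without another helper lookup.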

Squid’s in the Clouds

January 4, 2015 by

After the incident with the Project’s main server, the Sysadmin team has started looking for ways to improve the reliability and performance of all of our services.

Our sponsor Rackspace has donated a number of virtual machines, and we decided to leverage them for most of our core infrastructure. The main website, official source code repository, bugzilla and other essential services have since been migrated to virtual machines. Some of the benefits we expect to obtain are improved reliability, easier upgrades and simpler backups. We will not be abandoning physical nodes, though: there are some tasks that can’t be easily virtualized.

Our continuous integration testing farm used to run on a number of smaller VMs, but this was quite inefficient: smaller VMs would spend most of their time idle, and wouldn’t have much horsepower when active, resulting in relatively long build times. We currently test each commit on over a dozen different OSes and OS versions, across three different compilers. Each run requires over four and a half hours.

Docker has gained lots of mindshare in recent months as a lightweight, flexible and functional containerization platform, and after giving it a try I proposed to the project sysadmins that we adopt it as the Project’s platform of choice for most of our build farm, on top of a beefier VM.

After a bit of testing, the results are very satisfying. Our main Linux farm infrastructure has changed from 12 single- or dual-core VMs to a single 8-core system running a dozen build-node containers. Build times are down by 25% and we are able to more easily overcommit CPU and memory resources. Disk usage is dramatically down: while a full VM typically requires 40 GB of HDD space, a container only needs 4 GB, half of which is ccache data.

As Docker containers are not visible to the network unless explicitly configured, security is up and we need less work to maintain them. It would be rather trivial to fully automate the deployment of newer versions of several Linux distributions by writing a few Dockerfiles.

Technically, most of this could be obtained with a chroot(2) call and the right setup; we are only scratching the tip of the cloudy containerization iceberg. But being able to migrate a nontrivial setup in only a few hours of work suggests that, in the right hands, the tools are already in a pretty advanced state.

Squid-3.2 mythbusting NAT

December 19, 2014 by

One of the more frequently mentioned “problems” with Squid-3.2 since its release is a change in how it handles NAT failures.

The Myths

“Squid used to work when I NAT traffic to it from my router.”

“Squid used to work with one port when I configure the browser and NAT traffic.”

The Reality

No. Squid up until 3.1 would silence the NAT errors and treat the router as if it were the client browser. Any differences between the Host header and the requested URL were also completely ignored. All of this would be invisible to you the administrator, hidden at debug levels not normally shown, with several of the problems being completely unrecorded as well.

This last part seems to be behind a fair bit of skepticism about whether the problem we solved really exists. Nothing was showing up in the logs before and even a full packet trace from Squid or a gateway server would not reveal how JavaScript hijacks a client browser to pass internal documents to an external attacker.

What Changed?

Squid-3.2 finally added Host header validation to protect against this nasty behaviour and a few 0-day attacks exploiting the CVE-2009-0801 security vulnerability, which has been a known flaw with NAT interception by HTTP proxies for at least 12 years now. This meant two things had to change:

* ignoring NAT errors had to stop. We can’t rely on validation results if the TCP details are already known to be corrupted.

* traffic directly from a browser to a NAT intercept port on Squid had to stop. There is no way to separate the NAT lookup result of error and unknown entry. A pity really, but that is what we are forced to work with.

During testing we uncovered a lot of really quite nasty behaviour by various clients, and indeed some public services set up to take advantage of the NAT bug as if it were a feature. On the whole, a lot of client software sends a Host header with values strangely unrelated to what is being requested of the proxy. All of this had to go, or did it? For most of the year or so of testing it was banned completely and an HTTP error message was returned for any such garbage. But in the end it was clear that we had to let it through somehow or we would be pitting Squid against the biggest players on the Internet … yeah.

(Almost-) Final Results

Squid-3.2 and later will reject traffic where NAT lookups fail on an intercept port. This includes NAT done on external devices and browsers directly sending proxy requests to the Squid intercept port. At the very least, accurate reporting about the traffic and what it contains is the critical factor. If we let this traffic into Squid it would seriously compromise the validation results and also allow malicious clients to attack internal network resources with a large measure of anonymity. Neither of these is acceptable for a proxy trusted with controlling user traffic.

The best practice guideline for some years has been that NAT MUST be done on the Squid device. Squid-3.2 and later now enforce it as a basic requirement. If you are one of the network administrators running into this requirement change, please investigate the Policy Routing functionality of the device you originally had doing the NAT.
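
For a Linux Squid box, doing the NAT locally could be sketched roughly as below. This is an illustration under assumptions, not a drop-in configuration: SQUID_IP and INTERCEPT_PORT are example values, the rules presume an intercept port already configured in squid.conf on that same number, and the policy routing needed on the upstream device is vendor-specific and not shown. By default the script only prints the commands.

```shell
#!/bin/bash
# Sketch only: SQUID_IP and INTERCEPT_PORT are example values, and the rules
# assume Linux iptables on the machine that runs Squid.
SQUID_IP="${SQUID_IP:-192.0.2.1}"
INTERCEPT_PORT="${INTERCEPT_PORT:-3129}"
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise execute it (needs root).
maybe() {
  if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi
}

nat_rules() {
  # Do not re-intercept traffic coming from Squid's own address.
  maybe iptables -t nat -A PREROUTING -s "$SQUID_IP" -p tcp --dport 80 -j ACCEPT
  # Redirect all other port-80 traffic to the local intercept port.
  maybe iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port "$INTERCEPT_PORT"
  # Drop connections made directly to the intercept port.
  maybe iptables -t mangle -A PREROUTING -p tcp --dport "$INTERCEPT_PORT" -j DROP
}

nat_rules
```

The last rule reflects the second change described above: clients are not meant to talk to the intercept port directly, so such connections are dropped before they reach Squid.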

Squid-3.2 and later will accept invalid Host headers and produce a response to the client. But they will not cache the untrustworthy transaction results. They will contact the same server which the client TCP connection would have originally reached had Squid not been there (visible as ORIGINAL_DST in the logs). Notice that this is now deserving of the name “transparent proxy” a lot more than previous intercepting Squid versions were. Even so, NAT interception is still only half-transparent, with the server being able to easily identify the proxy’s existence.

The loss of caching is intended to be a temporary solution until we can properly implement per-client caching of objects. As the saying goes there is nothing so permanent as the temporary – Squid still contains this workaround several series later. Support with this work is very welcome.

The situation when upstream peers are involved is also quite dangerous. For now we have had to permit Squid to pass the traffic to peers, which opens all multi-hop systems of proxies to the same vulnerabilities that were previously possible with a single intercepting proxy. The solution is also going to require substantially more work and is some years away. In the meantime it is a good idea to avoid passing intercepted traffic to cache_peer when possible.

More information on the problems, log entries, when they occur and what can be done about each is detailed in the Squid wiki Host header forgery page.

squid-cache.org outage

September 7, 2014 by

We are currently weathering both a PSU meltdown and disk failure (with a full complement of side effects) on the main squid-cache.org server. The Squid sysadmin team and Foundation are all working on it as fast as possible.

Donations towards the purchase of a new server are greatly appreciated and will allow us to buy better hardware.

UPDATE: 2014-10-08: after weeks of late nights and very early mornings by the volunteer sysadmin team we are nearly all back up to full operational state again. The latest status of each affected major service is outlined below.

Mail and Mailing lists

The mail server for squid-cache.org was worst hit. Email has been down across most of September since the initial event. Any email sent to a squid-cache.org email address will have been held up and/or bounced.

Mail services are now back, but some spam control upgrades were forced on us that are still having fallout. Mailing lists are being migrated to a new domain name lists.squid-cache.org. Existing subscriptions have been automatically moved to the new list domain. You can expect to see an initial post explaining the change when the list you are subscribed to is recovered.

This change may require updates to mail filtering and rules outside our services. If you are aware of any in your domain or systems please see that they are updated.

IMPORTANT: some subscriptions have subsequently had to be removed due to backscatter spam from relays and corporate ticket logging systems. Posters to the list know who I am talking about. If you find your subscription has gone silent again recently please check the systems you are having mail delivered to and through then try re-subscribing to the new list.

Mail archives are currently split between the old hypermail + swift archival system and new pipermail. This is reflected on the website. If you are maintaining a mirror of the Squid mail archives please subscribe to our new mailing list for mirror operators and get in touch with the sysadmin team to sort out what is going to happen with mail mirrors in future.

DNS

We believe this is recovered. If anyone is still having issues resolving the domains please get in touch with noc @ lists.squid-cache.org.

Code Repository

The repository has been fully recovered and service on bzr.squid-cache.org and rsync is resumed.

FTP

The FTP service has been limping along with access but no updates. The main server is now in the process of being rebuilt from scratch. Please do not be surprised if you are suddenly challenged for a login; try a mirror instead. Anonymous access to the main FTP will be resumed ASAP.

Website

The http://www.squid-cache.org site is mostly up and running. Mirrors have remained available for the duration, but were not being updated with daily contents. The updates should now have resumed, but there are still a few kinks to work out in the content. If you find any issues going forward please report them in our bugzilla under Project Services.

Mirror Services and rsync

If you are running a WWW mirror please ensure you are using rsync access and your server is capable of serving the http://www.squid-cache.org name as outlined in the mirror guidelines. The same goes for FTP mirrors. We are adding a new mailing list for mirror server contacts. Our database of registered contacts for HTTP and FTP mirrors will be automatically subscribed, so please keep an eye on the mailbox you registered with us already. Anyone running a Squid mirror of any kind, please subscribe and post your mirror details to the list.

The rsync service itself is running with some data shares temporarily disabled. These will be re-opened as the services are brought back to full functionality. There are no changes to remote configurations provided you have been following the current mirror guidelines. The dynamic website (http-files-dyn) will no longer be publicly available; please mirror the static one (http-files) instead.

Apologies for the inconvenience.

… and Murphy’s Law has not finished with us yet:

Some security vulnerabilities were reported. A new squid-3.4.8 package has been released to resolve those. All users relying on SNMP or the pinger helper are advised to upgrade. The SNMP details can be found here, pinger details can be found here.

HTTP/1.1 update, obsoleting RFC2616, is complete

June 7, 2014 by

If you have not been aware of the IETF HTTPbis Working Group and what we do, it is chartered to improve HTTP. For the last decade and a half HTTP/1.1 has been defined by the monolithic and sometimes confusing RFC2616 document with relatively few extensions. The WG has been putting in a lot of effort to simplify the texts and clarify how the protocol actually works.

If you have been putting off reading the HTTP/1.1 specification because of its enormous length, now is a good time to dive in. The text has never been simpler and easier to read. Changes from the old document have been kept minimal, but there are some listed in the Appendices.

Mark Nottingham, the WG chairman, made this formal announcement a few hours ago:

The revision of HTTP/1.1’s specification, obsoleting RFC2616, is complete.

See:
 http://tools.ietf.org/html/rfc7230 – Message Syntax and Routing
 http://tools.ietf.org/html/rfc7231 – Semantics and Content
 http://tools.ietf.org/html/rfc7232 – Conditional Requests
 http://tools.ietf.org/html/rfc7233 – Range Requests
 http://tools.ietf.org/html/rfc7234 – Caching
 http://tools.ietf.org/html/rfc7235 – Authentication

Along with the related documents:
 http://tools.ietf.org/html/rfc7236 – Authentication Scheme Registrations
 http://tools.ietf.org/html/rfc7237 – Method Registrations

Thanks to everyone who has commented upon, reviewed and otherwise contributed to them over this nearly seven-year(!) effort.

Special thanks to our Area Directors over the years: Lisa Dusseault, Alexey Melnikov, Peter Saint-Andre and Barry Leiba, along with Yves Lafon, who helped edit Range Requests.

Finally, please warmly thank both Roy Fielding and Julian Reschke the next time you see them (I believe beer would be appreciated); the amount of effort that they put into these documents is far, far more than they originally signed up for, and they’ve done an excellent job.

Now, onwards to HTTP/2

P.S. This document set’s completion also has enabled the publication of these related non-WG documents:
 http://tools.ietf.org/html/rfc7238 – The Hypertext Transfer Protocol Status Code 308 (Permanent Redirect)
 http://tools.ietf.org/html/rfc7239 – Forwarded HTTP Extension
 http://tools.ietf.org/html/rfc7240 – Prefer Header for HTTP

 

Oh! And one more thank you, to Mark Baker for serving as Shepherd for the Caching doc.

 

Zero-Sized Replies from Windows Servers

April 30, 2014 by

During the last few months there have again been a number of bug reports and queries from administrators seeing Zero Sized Reply error pages being produced by Squid 3.2 and later.

These “errors” are produced when Squid sends an HTTP request, then something out in the network goes wrong and the TCP connection gets severed while Squid is still waiting for the start of the HTTP response to arrive. As you can imagine this is a little vague, because that “something” is any one of a large set of potential networking problems.

Investigation of the old usual culprits (ECN, Window Scaling, PMTUd, and CONNECT proxying) ruled them out, leaving us mostly in the dark.

Testing without the proxy appeared to work fine, as did small short transactions even through the proxy, leaving us more than a little confused.

The most common theme this time seems to be Windows-based SSL/TLS services with recent but not top-of-the-line software versions. IIS or Sharepoint on Server 2008 and 2010 for example.

Daniel Beschorner has done some investigating and reported this:

Since Squid 3.2 the SSL flag SSL_OP_ALL is no longer enabled by default in Squid. It enables different workarounds in the OpenSSL library.

Windows / IIS seems to get confused by empty packets (to mitigate the BEAST attack) sent from OpenSSL in TLS 1.0.

So the possibilities are:

We have also had remarkably similar problem reports about iTunes servers. That one is still unconfirmed and unresolved.