After the incident to the Project’s main server the Sysadmin team has started looking for ways to improve the reliability and performance of all of our services.
Our sponsor Rackspace has donated us a number of virtual machines, and we decided to leverage that for most of our core infrastructure. The main website, official source code repository, bugzilla and other essential services have since been migrated to virtual machines. Some of the benefits we expect to obtain are improved reliability, easier upgrades, simpler backups; we will not be abandoning physical nodes, there are some tasks that can’t be easily virtualized.
Our continuous integration testing farm used to run on a number of smaller VMs, but this was quite inefficient: smaller VMs would spend most of their time idle, and wouldn’t have much horsepower when active, resulting in relatively long build times. We currently test each commit on over a dozen different OSes and OS versions, across three different compilers. Each run requires over four and a half hours.
Docker has gained lots of mindshare in recent months as a lightweight, flexible and functional containerization platform, and after giving it a try I proposed to the project sysadmins to adopt it as the Project’s platform of choice for most of our build farm, on top of a beefier VM.
After a bit of testing, the results are very satisfying. Our main Linux farm infrastructure has changed from 12 single- or dual-core VMs to 1 8-core system running a dozen build-node containers. Build times are down by 25% and we are able to more easily overcommit CPU and memory resources; disk usage is dramatically down: while a full VM typically reqiures 40 GB of HDD space, a container only needs 4, half of which is ccache data.
As Docker containers are not visible to the network unless explicitly configured, security is up and we need less work to maintain them. It would be rather trivial to fully automate the deployment of newer versions of several Linux distributions by writing a few Dockerfiles.
Technically, most of this could be obtained by chroot(2) call and the right setup; we are only scratching the tip of the cloudy containerization iceberg. But being able to migrate a nontrivial setup in only a few hours of work could say that in the right hands the tools are already in a pretty advanced state.