Docker is all the rage right now, and it was only recently that I understood why. I’ve spent the last month leading the implementation of Docker at SOASTA, and it’s solved some enormous problems for us in a very short amount of time:
- Our developer onboarding process has gone from days to minutes.
- We’ve reduced operational complexity by consolidating three different operating systems with their own unique configurations into one.
- We’ve simplified installing and upgrading our products both internally and externally.
This post will give you an idea of why we chose Docker and where it’s delivered specific outcomes for us. If you’re looking to get started with Docker, then this should also serve as a bit of a Docker intro and hopefully set you on the path to some quick wins with your product.
I had heard about Docker so many times, but I had no clue what it was or how it worked. More importantly, I didn’t understand why everyone else wanted it so badly. Coming from the cloud computing space, I was very familiar with virtual machines, but I wasn’t sure how this world of containers was different from virtualization.
I spent a few days researching Docker at the onset of this project, and a few things jumped out at me as potential benefits:
- Docker is easy to install on any operating system: a simple package on Mac OS X or Windows, or a one-line install command on Linux. Virtual machines, by contrast, mean choosing among many VM players, both open source and commercial, for each host operating system you want to run on.
- Docker images are portable to any operating system, like virtual machines, but they’re much more lightweight. A virtual machine snapshot contains the entire operating system, bloat and all, whereas Docker images are built from extremely lightweight Linux base images that have virtually nothing installed in them. The Ubuntu Docker image is 188 MB, which is remarkably small for an OS these days.
- Images can be built through Dockerfiles in an automated fashion, and this functionality is built into Docker out of the gate.
- Images can easily be pulled from public or private repos on Docker Hub with no extra work needed to set that up in the Docker client. This would quickly provide an easy distribution method for any images that are created.
Another thing became apparent very quickly though…
Docker isn’t just a product, it’s a design pattern
Only a small fraction of the value I’ve seen in Docker is the software itself. At DockerCon this week, the Docker team stated that 50% of Docker is built off of existing open source software plumbing. The idea of walled-off ‘containers’ has been around since Solaris introduced Zones in 2004. Docker was originally built on LXC (Linux Containers), which was publicly released in 2008. The technologies aren’t new, but what is new is how you build with them. The real value that comes from Docker is the way it forces you to think about how you build software.
This is where the term “microservice” comes into play.
Docker wants you to build these microservice containers that are small, lightweight, immutable, and that only do one thing per container. By “only do one thing” I mean not running a web server, app server, and a database all inside the same container.
These concepts aren’t just suggestions from Docker, they’re actually requirements that Docker enforces on you in various ways:
- You can’t build a single image out of more than 127 “layers” (Dockerfile speak for commands).
- Docker wants you to build base images that serve as building blocks for subsequent images wherever possible.
- Docker ties a container to a single main process. When that process exits, Docker stops the container and kills anything else still running inside it.
  “How rude,” I thought when this first happened to me, but I’ve learned to love this about Docker. The design pattern here is to run your web server in a container, your app server in a container, and your database in a container, all communicating with each other between containers. Docker provides all the mechanisms to make this work neatly on its own local network within the Docker daemon.
- Docker image layers are read-only, and whatever a container writes on top of them disappears when the container is removed.
  This forces you to keep any working directories outside of the container and mount them into the container from the host system. When architected properly, you can start, stop, throw away, and replace containers without affecting the working data directories on the host.
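To make the last two constraints concrete, here is a minimal sketch of the one-process-per-container pattern with working data kept on the host. It is an illustration, not our actual setup: the network name, the `example/...` image names, and the host path are hypothetical placeholders, and it assumes a Docker release recent enough to have `docker network create`. The `DOCKER` variable is only there so the sequence can be dry-run.

```shell
#!/bin/sh
# Sketch only: set DOCKER="echo docker" to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Hypothetical placeholders: network name, image names, and host path.
NETWORK="appnet"
DATA_DIR="/srv/example/data"   # lives on the host, so it outlives any container

# start_stack: one process per container, all attached to one private network.
start_stack() {
    # A user-defined network lets the containers reach each other by name.
    $DOCKER network create "$NETWORK" || return 1
    # Database: its data directory is mounted from the host.
    $DOCKER run -d --name db --network "$NETWORK" -v "$DATA_DIR:/data" example/db || return 1
    # App server: no published ports, reachable only from the network.
    $DOCKER run -d --name app --network "$NETWORK" example/app || return 1
    # Web server: the only container that publishes a port on the host.
    $DOCKER run -d --name web --network "$NETWORK" -p 80:80 example/web
}

# replace_app: throw away the app container and start a fresh one.
# The data under DATA_DIR is untouched because it never lived in a container.
replace_app() {
    $DOCKER rm -f app || return 1
    $DOCKER run -d --name app --network "$NETWORK" example/app
}
```

Running `DOCKER="echo docker"` followed by `start_stack` prints the four commands without touching the Docker daemon, which is a handy way to review the sequence first.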
The moment of revelation
The moment I truly realized how Docker was going to rock so hard for us was when I went hands on for the first time.
I went to http://boot2docker.io and installed the boot2docker package on my MacBook Pro. I ran it and was presented with a shell that had the Docker client all loaded up in it. I looked on Docker Hub for a simple Docker image to try out and decided on Ubuntu Linux. I was able to install it with one command: ‘docker pull ubuntu’. This downloads the latest image from Docker Hub and puts it in your local repository.
Since an image itself isn’t running or doing anything, you need to ‘docker run’ an image to get a container going. A simple ‘docker run -it ubuntu’ will start the container with an interactive shell and drop you right inside the container.
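Spelled out, that first session is roughly the sketch below. The two `docker` commands are the ones described above; the `first_run` function and the `DOCKER` variable are illustrative additions that make it possible to dry-run the sequence.

```shell
#!/bin/sh
# Sketch only: set DOCKER="echo docker" to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

# first_run: fetch the Ubuntu image, then start a container from it.
first_run() {
    # Download the latest ubuntu image from Docker Hub to the local machine.
    $DOCKER pull ubuntu || return 1
    # -i keeps stdin open and -t allocates a terminal, so you land in a
    # shell inside the new container.
    $DOCKER run -it ubuntu
}
```

Calling `first_run` in a shell where the Docker client is available should leave you at a prompt inside the container; typing `exit` stops it.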
This was my eureka moment.
I could pull a whole product image down with one command and be running with one more command. Furthermore, I could build these images automatically and allow them to be run anywhere. I went from zero to Ubuntu running in Docker in a few minutes. BOOM.
Now the real fun could start.
Win #1: Developer onboarding
Starting to get pretty stoked about Docker, I decided to pick a product engineering team at SOASTA and work to make their lives easier. Starting with the developers was key anyway, since the dev team would ultimately be the first users of these new containers and the ones to maintain them all the way out to production.
I decided to go with our Data Science Workbench product and team, since it was our newest product at SOASTA and probably a bit easier to containerize. DSWB is built on a modern Django/Python/Tornado/SQLite stack. DSWB is also a self-contained app that gets run as one instance on a single server for a customer, which is another thing that made it a great product to start with — no worrying about wide n-tier deployments for our first go at a Docker deployment.
Having done development work locally on all of our products, I was all too familiar with the onboarding process for a new developer. I’ve found that this process is always painful, no matter what company you work for or product you work on, so I really wanted to make this better for everyone. Our process for onboarding a new developer was essentially to point them to a series of wiki pages and give them a few days to set up their local developer environment by following step after step of copy-and-paste commands for dependency installations, Eclipse configuration, and on and on.
I decided to see if I could take every step in the developer setup wiki and translate each one into a Dockerfile.
A Dockerfile is a single file made up of commands that, when passed into the ‘docker build’ command, will get executed step by step and produce a Docker image at the end. The syntax is made up of a handful of simple commands, the most important of which is RUN, which runs a Linux shell command inside the image being built.
As discussed earlier, you can’t have one monolithic image since Docker won’t let you have too big a Dockerfile, so I ended up making two images — one that is the base OS and all of the packages that the product needs, and one that builds off that image and does all of the app installation, Linux user and group configs, database initialization, environment variables, and so on.
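To illustrate the base-plus-app split, here is a minimal sketch of what such a pair of Dockerfiles might look like. This is not the DSWB Dockerfile: the image names, packages, user name, paths, and `run.py` entry point are all hypothetical placeholders chosen for illustration. The script writes the two files and, separately, builds them in dependency order.

```shell
#!/bin/sh
# Sketch only: set DOCKER="echo docker" to print the build commands instead of running them.
DOCKER="${DOCKER:-docker}"

# write_dockerfiles DIR: create DIR/base/Dockerfile and DIR/app/Dockerfile.
# Every name inside the Dockerfiles is a hypothetical placeholder.
write_dockerfiles() {
    mkdir -p "$1/base" "$1/app" || return 1

    # Base image: the OS plus the packages the product depends on.
    cat > "$1/base/Dockerfile" <<'EOF'
FROM ubuntu
RUN apt-get update && apt-get install -y python3 python3-pip sqlite3
EOF

    # App image: starts from the base image and adds everything app-specific
    # (user account, environment variables, application files).
    cat > "$1/app/Dockerfile" <<'EOF'
FROM example/base
RUN useradd --create-home appuser
ENV APP_HOME=/opt/app
COPY . /opt/app
USER appuser
CMD ["python3", "/opt/app/run.py"]
EOF
}

# build_images DIR: build the base image first, because the app image's
# FROM line refers to it by name.
build_images() {
    $DOCKER build -t example/base "$1/base" || return 1
    $DOCKER build -t example/app "$1/app"
}
```

Because the slow-changing package installs live in the base image, rebuilding the app image after a code change only has to redo the app-specific steps.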
Below you can see our Dockerfile for the DSWB base image.
It took a few days of working step by step through the wiki and converting each step into an equivalent Dockerfile command, and in the end I reduced a multi-day, error-prone process with nearly 100 manual steps to fewer than 10 steps. Even then, half of the new setup steps are simple things like ‘install boot2docker’, ‘create a Docker Hub account’, ‘have Dan add you to the private repo on Docker Hub’, etc.
It took a total of three weeks to go from researching Docker to Dockerizing the developer setup, and then training and converting everyone over to using Docker on their local developer environments. We did the latter in a one-hour lunch-and-learn. We now routinely have new developers up and running in under an hour. All of our new summer interns just went through the new developer setup wiki and were onboarded effortlessly. This is a huge improvement from where we were.
Win #2: One operating system to rule them all
With all of our developers now using Docker locally, I started to focus on our next problem: deployment to pre-prod and prod.
Our developers were using Mac OS X, our pre-production environments were running Ubuntu, and our production environments were running on Amazon Linux in EC2. Just as our developers had a multi-page wiki to set up their environments, we had a similar setup wiki for pre-prod environments and then another wiki for production environments. Each operating system had its own set of commands for installing packages (Homebrew on Mac, apt-get on Ubuntu, yum on Amazon Linux, etc).
To fix this disparity, we ended up making pre-prod and prod environments just be Amazon Linux machines with Docker installed… and that’s all we had to do. We slide the same Docker image that developers use into those environments, with an official version of the build in it of course, start the container, and that’s it! Machine-specific configs are in the local working directory on the host and mounted in the container (things like the SSL cert), and the working directory for data is decoupled from the container.
The benefits of this are enormous:
The container that is built on the build server and used by a developer on their local machine is the same container that gets run in pre-prod and prod! The OS and app are all inside of an immutable box. The container and its contents are always the same, everywhere. If automated tests passed on the container on the build server, they will pass anywhere else. This is a beautiful thing. What I have in my container is always the same as what anyone else has, from OS to application code. The awesomeness of this can’t be overstated.
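As a rough sketch of what deploying to one of those hosts can look like: pull the versioned image, then run it with the machine-specific config and the data directory mounted from the host. The image name, container name, host paths, and port below are hypothetical placeholders, not our real values, and the `DOCKER` variable exists only so the commands can be dry-run.

```shell
#!/bin/sh
# Sketch only: set DOCKER="echo docker" to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Hypothetical placeholders: image name, host paths, and port.
IMAGE="example/app"
CONFIG_DIR="/etc/example/config"   # machine-specific files such as the SSL cert
DATA_DIR="/var/example/data"       # working data, kept on the host

# deploy VERSION: run the given image tag on this host.
deploy() {
    $DOCKER pull "$IMAGE:$1" || return 1
    # Config is mounted read-only; the data directory is mounted read-write.
    $DOCKER run -d --name "app-$1" -p 443:443 \
        -v "$CONFIG_DIR:/config:ro" \
        -v "$DATA_DIR:/data" \
        "$IMAGE:$1"
}
```

The same `deploy 1.2.0` call would work on a pre-prod or a prod host; only the contents of the two host directories differ between machines.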
Win #3: Product upgrade process
This one was the huge surprise to us — an entirely unexpected benefit of switching to Docker.
Since Data Science Workbench is a new SOASTA product, one of the big features on our roadmap was to add in-product updates like our other products have. It occurred to me one day that we could scrap this whole project, because pulling a new Docker image was now the upgrade path for our app. Every OS update and all of the app are inside the image, so if you ‘docker stop’ the old container, ‘docker pull’ the new image, and then ‘docker run’ the new image… then you’re upgraded.
Not only is that your upgrade path, but you can keep old versions of the image in your local repo for as long as you want, which gives you a seamless way to revert to a previous version.
Let’s say the new version of the app doesn’t work for you because of some bugs, or you need to start up an old version of the app for some work on legacy projects you have around — simple! You can maintain any number of versions of the product in your local repo, and as a vendor we can decide how many different point versions to keep up in our Docker Hub repos for customers to download at any time.
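The upgrade and revert steps can be sketched as a pair of small shell functions. This is an illustration under assumptions rather than our shipping upgrade script: the image name, the `app-VERSION` container naming scheme, and the data path are hypothetical, and the `DOCKER` variable is there only to allow a dry run.

```shell
#!/bin/sh
# Sketch only: set DOCKER="echo docker" to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"
IMAGE="example/app"   # hypothetical image name

# upgrade OLD NEW: stop the OLD version's container, then pull and run NEW.
upgrade() {
    $DOCKER stop "app-$1" || return 1
    $DOCKER pull "$IMAGE:$2" || return 1
    # The data directory is mounted from the host, so NEW sees the same data.
    $DOCKER run -d --name "app-$2" -v /var/example/data:/data "$IMAGE:$2"
}

# rollback NEW OLD: stop the NEW container and restart the OLD one,
# which was only stopped (not removed) during the upgrade.
rollback() {
    $DOCKER stop "app-$1" || return 1
    $DOCKER start "app-$2"
}
```

For example, `upgrade 1.0 1.1` moves from version 1.0 to 1.1, and `rollback 1.1 1.0` puts the old container back in service.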
Docker has made our lives so much better in just over a month’s time. Here’s what we’re doing next:
- We’ll be working on converting our other products to use Docker from development to production.
- We’re looking at distributing CloudTest Lite (our free version of CloudTest) as a Docker image instead of a virtual machine.
- In a week, we’re shipping an internal version of the DSWB container along with a data container to all SOASTA delivery engineers so they can run Data Science Workbench locally for demos or customer work.
I love geeking out about Docker and any other product design topics, so feel free to hit me up on Twitter (@PerfDan) if I can share any deeper Docker learnings or help out in any way.
About the Author
I love great products and I love building them. At SOASTA I work in R&D with some of the best engineers and designers I've ever known. Some things I've designed and worked on here over the past six years include the mPulse Globe, the Dynamic Ramp dashboard, Flip Book mode for dashboards, Data Science Workbench, and ushering in Docker.