Dockerize an existing Ruby on Rails application
This article focuses on dockerizing an existing Rails app, because that is both common and challenging.
But since every new app eventually becomes an old one, you may find some of the tips here helpful either way.
Say we have an existing Ruby on Rails project and we decide to run it inside a Docker container so we can enjoy the benefits of the Docker ecosystem.
Ready? Let’s start.
First we can divide the task into several phases:
- Set up a base image with a Ruby environment
- Install gems with bundler
- Compile static assets
- Provision the app before start
- Start web server
Set up a base image with a Ruby environment
Both Ruby and Rails have official images available.
But it’s also not too hard to build one from scratch.
Using Rails official images
We don’t recommend the official Rails images for production use because:
- only Rails 4+ has official images
- we can’t control the Ruby version, which causes trouble when we want to upgrade Ruby to fix a vulnerability
- Rails can simply be installed with bundle install
Using Ruby official images
The official Ruby images turned out to be helpful: they cover versions from Ruby 1.9.3 up to the latest stable release.
If you’re lucky and your target Ruby is one of the official versions, starting from an official Ruby image is probably the best choice.
Build base Ruby image from scratch
But if you’re not so lucky, as in our case, where legacy Rails apps run on ree-1.8.7, there’s no way around building a base Ruby image yourself.
We tried both RVM and ruby-build, and finally chose ruby-build for its simplicity.
Note that ree-1.8.7 needs some patches to work correctly: create a directory called patch and put ree-1.8.7-2011.12 inside it.
Here’s the Dockerfile demonstrating how we build it:
FROM centos:7
# install those basic tools we will use in debugging
RUN yum install -y \
    git \
    tar vim \
    gcc gcc-c++ make patch \
    hostname nmap-ncat readline-devel && \
    yum -y clean all
# install ruby build
RUN git clone https://github.com/sstephenson/ruby-build.git /root/ruby-build && /root/ruby-build/install.sh
# install ree, gcc44 is a must-have
RUN yum install -y compat-gcc-44 && yum -y clean all
ADD ./patch/* /usr/local/share/ruby-build/
RUN CC=gcc44 ruby-build ree-1.8.7-2011.12 /ruby
RUN echo 'export PATH=$PATH:/ruby/bin' > /etc/profile.d/ruby.sh
ENV PATH $PATH:/ruby/bin
# use taobao gem source, as in China rubygems.org has been blocked
RUN gem sources --remove https://rubygems.org/ && gem sources --add https://ruby.taobao.org/
# install bundler
RUN gem install -N bundler
Then build the image with:
docker build -t my-company/ruby-base .
Install gems with bundler
Install System Libraries
Before installing gems, we should first install the system libraries that some gems need in order to compile their native extensions.
For example, on CentOS we need to install the mysql-devel package manually before we install the mysql2 gem.
Bundle Install
Simply running bundle install in the Dockerfile is one solution, but it has a drawback. See this typical example:
FROM my-company/ruby-base:latest
ADD . /my-app
WORKDIR /my-app
RUN bundle install --jobs 3 --retry 3
RUN bundle clean --force
Because we add the project files first, any change to them makes the ADD instruction invalidate the Docker cache, so bundle install runs again on every build. It is an expensive command: in our case, without any optimization, it takes more than 10 minutes.
How could we make it faster?
We can split the cost into two parts:
- Time taken to download gems from remote gem servers
- Time taken to compile gems with native extensions
Let’s deal with them one by one.
Reduce downloading time
We tried many approaches and finally found this one to be both helpful and, more importantly, elegant.
# this is a command line context, it could be a Jenkins Job or our local terminal
# in the same directory we placed the Dockerfile above
bundle package --all
docker build -t my-company/incredible-app .
After bundle package --all, the gems are cached in the vendor/cache directory, and during docker build we spend no time downloading them from a remote server.
The first run of bundle package --all can be slow, but from the second run on it is very fast.
This means we use the same solution locally and in the CI environment, and the Dockerfile itself needs no changes.
Beautiful, just like the progressive enhancement idea that was once famous in the frontend world.
Reduce compiling time
It’s easy to see which extensions take the most time. Since gem versions are usually quite stable, we can introduce a new image layer that installs some gems in advance. Here’s our example:
FROM my-company/ruby-base:latest
# only purpose for this image is to speed up normal build
# save 3 min
RUN gem install -N nokogiri -v 1.6.2.1
# save 1min10s
RUN gem install -N curb -v 0.8.5
# save around 30s
RUN gem install -N poltergeist -v 1.5.1
# save more than 30s
RUN gem install -N unicorn -v 4.8.3
# save 16s
RUN gem install -N therubyracer -v 0.12.1
# save 10s
RUN gem install -N oj -v 2.12.0
# save 10s
RUN gem install -N mysql2 -v 0.3.18
# save 10s
RUN gem install -N ruby-prof -v 0.15.2
A slight modification is then needed in the app’s Dockerfile:
# update FROM section to point to the image created above
FROM my-company/ruby-incredible-app-base:latest
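The versions pinned in the pre-install image have to match Gemfile.lock, otherwise bundle install will compile the gem again anyway. As a minimal sketch, the RUN lines could be generated from the lockfile instead of being maintained by hand. The function names gem_version and emit_run_lines are hypothetical helpers (not part of Bundler), and the sketch assumes the usual Gemfile.lock layout in which resolved gems are indented by four spaces:

```shell
#!/bin/bash
# Hypothetical helpers: not part of Bundler or RubyGems.

# Print the locked version of one gem.
# $1 = gem name, $2 = path to Gemfile.lock
gem_version() {
  # Resolved gems are indented by exactly four spaces: "    nokogiri (1.6.2.1)".
  # Their dependencies are indented by six spaces and are skipped.
  awk -v name="$1" '
    /^    [^ ]/ && $1 == name {
      gsub(/[()]/, "", $2)
      print $2
      exit
    }
  ' "$2"
}

# Print one "RUN gem install" line per requested gem found in the lockfile.
# $1 = path to Gemfile.lock, remaining arguments = gem names
emit_run_lines() {
  lockfile=$1
  shift
  for gem in "$@"; do
    version=$(gem_version "$gem" "$lockfile")
    if [ -n "$version" ]; then
      echo "RUN gem install -N $gem -v $version"
    fi
  done
}
```

Calling emit_run_lines Gemfile.lock nokogiri curb unicorn prints one RUN line per gem, ready to paste into the pre-install Dockerfile. Gems that are not in the lockfile are silently skipped.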
Compile static assets
This step is quite simple: just add RUN rake assets:precompile to the Dockerfile and it will work.
Provision the app before start
Generate Config Files
Config files are a prerequisite for a Rails application.
Passing them in via a volume is an option, but if you follow the Twelve-Factor App methodology you may prefer environment variables.
We want to discuss how to generate config files from injected ENV variables.
Let’s think about it more deeply.
- The generation happens at container runtime, which limits the possible places to do it to the CMD and ENTRYPOINT instructions.
- The main difference between CMD and ENTRYPOINT is that CMD is very likely to be overridden in the docker run command.
- So the choice depends on whether we want the config files to be generated when the command is overridden.
In our case, we want to generate config files even when another command, such as rspec spec, is given.
So we chose ENTRYPOINT as the place to generate config files.
Here’s our sample docker-entrypoint.sh
#!/bin/bash
# stop execution if any commands fail
set -e
# generate database.yml
source /my-app/docker-initializers/generate_database_yml.sh /my-app/config/database.yml
exec "$@"
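The final exec "$@" is worth a note. exec replaces the shell with the given command instead of starting it as a child, so the web server keeps the script’s process ID and receives signals sent to the container directly. A minimal sketch that shows the effect with plain sh (the function name same_pid_after_exec is made up for this illustration):

```shell
#!/bin/bash
# Prints two PIDs: the wrapper shell's own PID, then the PID seen by the
# command it exec's. Because exec replaces the process, both are the same.
same_pid_after_exec() {
  sh -c 'echo $$; exec sh -c "echo \$\$"'
}
```

Running same_pid_after_exec prints the same number twice. Without exec, the second line would show the PID of a new child process.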
docker-initializers/generate_database_yml.sh
#!/bin/bash
# $1 is the path of the database.yml to write
cat << EOF > "$1"
defaults: &defaults
  adapter: mysql2
  reconnect: true
  encoding: utf8
  host: $MYSQL_MY_APP_HOST
  port: $MYSQL_MY_APP_PORT
  username: $MYSQL_MY_APP_USERNAME
  password: $MYSQL_MY_APP_PASSWORD
test:
  <<: *defaults
development:
  <<: *defaults
production:
  <<: *defaults
EOF
As the samples above show, the app now depends on these MYSQL_* parameters.
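If one of these variables is missing, the heredoc silently writes an empty value and the failure only shows up later as a confusing connection error. A minimal sketch of a fail-fast check that could run before the generation step follows. require_env is a hypothetical helper, and it relies on printenv, so it only sees exported variables (which is what docker run -e provides):

```shell
#!/bin/bash
# Hypothetical helper: return non-zero if any named variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # printenv prints the value of an exported variable, or nothing if unset
    if [ -z "$(printenv "$name")" ]; then
      echo "[ERROR] required variable $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

With set -e active in the entrypoint, a line such as require_env MYSQL_MY_APP_HOST MYSQL_MY_APP_PORT MYSQL_MY_APP_USERNAME MYSQL_MY_APP_PASSWORD stops the container start as soon as a variable is missing.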
Link Support
It’s possible that users will link the app container with a mysql container or some other service container.
To support them, we can map the link ENV variables onto the MYSQL_* parameters above.
docker-initializers/link_support.sh
#!/bin/bash
# --link mysql support
if [ -n "$MYSQL_PORT_3306_TCP_ADDR" ]; then
  export MYSQL_MY_APP_HOST=$MYSQL_PORT_3306_TCP_ADDR
  export MYSQL_MY_APP_PORT=$MYSQL_PORT_3306_TCP_PORT
  export MYSQL_MY_APP_USERNAME=default_user
  export MYSQL_MY_APP_PASSWORD=default_password
fi
With link support added, docker-entrypoint.sh now becomes:
#!/bin/bash
# stop execution if any commands fail
set -e
# handling case such as docker run --link mysql-1:mysql
source /my-app/docker-initializers/link_support.sh
# generate database.yml
source /my-app/docker-initializers/generate_database_yml.sh /my-app/config/database.yml
exec "$@"
Wait Support
Now our database.yml is generated automatically on start. But there’s another problem: if mysql is still initializing when Rails tries to connect to it, Rails will throw a database connection error.
The reason is that Rails needs the metadata of the DB tables for ActiveRecord to work correctly.
How do we deal with this problem?
We could ensure the startup order from outside, but adding some protection inside does no harm. Here’s how we do it:
We installed the nmap-ncat package in the ruby-base image, and now it’s time for it to shine.
docker-initializers/wait_support.sh
#!/bin/bash
function wait_for() {
  service=$1
  host=$2
  port=$3
  echo "waiting for $service to be up on $host:$port..."
  if [ -n "$host" ] && [ -n "$port" ]; then
    # ncat: -w 1 sets a 1 second connect timeout, -c runs a command once connected
    while ! nc -w 1 -c echo "$host" "$port"
    do
      echo -n .
      sleep 1
    done
    echo 'ok'
  else
    echo "[ERROR] invalid host=$host or port=$port for $service"
    exit 1
  fi
}
# quote the arguments so that an empty host does not shift the port into its place
wait_for "database connection - $MYSQL_MY_APP_DBNAME" "$MYSQL_MY_APP_HOST" "$MYSQL_MY_APP_PORT"
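The loop above waits forever if the database never comes up. A minimal sketch of a bounded variant follows, assuming a generic helper named retry_until (hypothetical) that runs any command up to a maximum number of attempts. RETRY_INTERVAL is likewise an assumed variable name, not something the original scripts define:

```shell
#!/bin/bash
# Hypothetical helper: run a command until it succeeds, at most $1 times.
# $1 = maximum number of attempts, remaining arguments = command to run
# Returns 0 as soon as the command succeeds, 1 if every attempt failed.
retry_until() {
  max_attempts=$1
  shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    # pause between attempts; defaults to 1 second
    sleep "${RETRY_INTERVAL:-1}"
  done
}
```

For example, retry_until 30 nc -w 1 -c echo "$MYSQL_MY_APP_HOST" "$MYSQL_MY_APP_PORT" gives up after 30 failed attempts, and with set -e in the entrypoint the container then exits instead of hanging.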
Adding wait support, docker-entrypoint.sh now becomes:
#!/bin/bash
# stop execution if any commands fail
set -e
# handling case such as docker run --link mysql-1:mysql
source /my-app/docker-initializers/link_support.sh
# wait for other service ports to be ready; this can be enabled by an environment variable
if [ "$WAIT_FOR_DEPENDED_SERVICES" = "true" ]; then
  source /my-app/docker-initializers/wait_support.sh
fi
# generate database.yml
source /my-app/docker-initializers/generate_database_yml.sh /my-app/config/database.yml
exec "$@"
Default Params Support
In the section above we added a switch for wait support: only if the user sets WAIT_FOR_DEPENDED_SERVICES to true do we get the benefit of wait support.
It’s natural to consider adding a default value for this variable. We can apply the same design to the mysql database name, as follows:
docker-initializers/default_env_params.sh
#!/bin/bash
if [ -z "$MYSQL_MY_APP_DBNAME" ]; then
  export MYSQL_MY_APP_DBNAME=my_app_database
fi
if [ -z "$WAIT_FOR_DEPENDED_SERVICES" ]; then
  export WAIT_FOR_DEPENDED_SERVICES=true
fi
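The same defaults can be written more compactly with shell parameter expansion. This is only an alternative spelling of the script above, not a change in behavior: ${VAR:=value} assigns the value when VAR is unset or empty, and the leading colon is the no-op command that gives the expansion somewhere to run. The wrapper function apply_defaults is a hypothetical name used here to keep the sketch self-contained:

```shell
#!/bin/bash
# Assign defaults only when a variable is unset or empty, then export them
# so that child processes (the Rails app) can see the values.
apply_defaults() {
  : "${MYSQL_MY_APP_DBNAME:=my_app_database}"
  : "${WAIT_FOR_DEPENDED_SERVICES:=true}"
  export MYSQL_MY_APP_DBNAME WAIT_FOR_DEPENDED_SERVICES
}
```

Values passed in by the user, for example with docker run -e WAIT_FOR_DEPENDED_SERVICES=false, are left untouched.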
With some simple cleanups included, docker-entrypoint.sh reaches its final state:
#!/bin/bash
# stop execution if any commands fail
set -e
# handling case such as docker run --link mysql-1:mysql
source /my-app/docker-initializers/link_support.sh
# set ENV params if they're not set by users
source /my-app/docker-initializers/default_env_params.sh
# wait for other service ports to be ready
if [ "$WAIT_FOR_DEPENDED_SERVICES" = "true" ]; then
  source /my-app/docker-initializers/wait_support.sh
fi
# generate database.yml
source /my-app/docker-initializers/generate_database_yml.sh /my-app/config/database.yml
# prepare log and tmp directories
mkdir -p /my-app/log
mkdir -p /my-app/tmp
rm -rf /my-app/tmp/*
exec "$@"
Start web server
Choose a proper web server
It depends on each team’s situation: maybe you have some puma experts, or maybe you prefer EventMachine-based implementations like thin.
You are completely free to choose whatever you are comfortable with to power your Rails app.
In our case, we were already using unicorn as our web server, so we kept using it in the dockerized app.
It’s worth mentioning that unicorn has a fantastic feature: it can adjust its worker count dynamically at runtime, using process signals such as TTIN and TTOU.
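A minimal sketch of how those signals could be sent from inside the container follows. The helper name scale_unicorn and the pid file argument are assumptions: the path must match whatever the pid setting in unicorn.conf.rb points to. TTIN adds one worker and TTOU removes one, and both are sent to the unicorn master process:

```shell
#!/bin/bash
# Hypothetical helper: add or remove one unicorn worker.
# $1 = "up" or "down", $2 = path to the unicorn master pid file
scale_unicorn() {
  case "$1" in
    up)   signal=TTIN ;;  # master forks one more worker
    down) signal=TTOU ;;  # master retires one worker
    *)
      echo "usage: scale_unicorn up|down <pid-file>" >&2
      return 2
      ;;
  esac
  kill -s "$signal" "$(cat "$2")"
}
```

Because the entrypoint script ends with exec, unicorn should be the container’s main process, so docker kill --signal=TTIN followed by the container name is another way to reach the master from the host.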
Say we have unicorn.conf.rb in our project root; then we can start the server in the default CMD instruction:
FROM my-company/ruby-incredible-app-base:latest
ADD . /my-app
WORKDIR /my-app
RUN bundle install --jobs 3 --retry 3
RUN bundle clean --force
EXPOSE 3000
ENTRYPOINT ["/my-app/docker-entrypoint.sh"]
CMD ["unicorn_rails", "-l", "3000", "-c", "/my-app/unicorn.conf.rb"]
Conclusion
Some of you may have noticed there’s still a problem: it’s now quite complex to start the application.
We need to either provide the MYSQL_* ENV variables or link our app container to a mysql container.
This is a great question, and it opens the door to container orchestration, which is another fascinating area.
Maybe in the future I will write a blog post about it.
For now, just some quick advice: if you want to simplify local development or small-scale deployment, you may be interested in docker-compose.
It’s been a long read; sincerely, thank you for reading this far!