Transferring a project from Vagrant + VirtualBox to Docker containers using Ansible
Before containers became a thing, the main tools for creating a local development environment were technologies like Vagrant and VirtualBox. Combined with automation tools such as Ansible and Chef, they allowed us to create working, reproducible environments for applications. However, the development of lightweight virtualization options, pioneered by Docker and steadily simplified by various cloud innovations, has led to the demise of these once very popular developer tools. So much so that when we see them anywhere, we involuntarily start guessing at the age of the code base.
And recently I came across them myself. To be more precise, I got a project that still relies on them – it involves installing a VirtualBox virtual machine running Debian, created using Vagrant and then configured using Ansible. And it all works. Well, for the most part. But when it doesn’t work, figuring out what went wrong is a real pain. Maintaining coordination between Vagrant and VirtualBox was a particularly nasty piece of black magic that got me thinking about cheaper, friendlier virtualization alternatives.
By the end of my introduction to the project, I had developed a three-step plan:
Replace Vagrant and VirtualBox with Docker, i.e. use Ansible to move the project into Docker containers.
Replace Ansible with docker-compose, accompanied by dedicated Dockerfiles for each of the applications involved in the project.
Extend the tooling change to other environments: staging and production.
This article provides documentation for the first phase of this plan.
Why did I decide to write this article?
Using ansible to set up and run docker containers is not the most common scenario. In fact, one of the reasons for writing this article was that I could not find a single resource that collected what I needed. This lack of information resources is understandable, because once you manage to achieve some success in automating a project, adding something else becomes quite easy thanks to the many tools available.
But this, unfortunately, is a different case. Considering that I am not very familiar with the project, it will take a lot of time and effort to track down all the necessary dependencies scattered across several Ansible roles and playbooks, figure out their interaction and correctly reproduce them in isolated containers.
However, since the requirements for these dependencies are already captured in the Ansible playbooks, I can make things much easier for myself by simply pointing Ansible at an empty container and having it configure the container the same way it configures a VirtualBox virtual machine. After all, this is one of the amazing features of Ansible: give it (ssh) access to any machine and it will give you the environment you want. In this case, the machine with ssh access will be the Docker container, and the desired environment will be the development environment described in the various Ansible playbooks.
Project
My project is a fairly large Ruby-on-Rails application with a bunch of typical dependencies such as Sidekiq, Nginx, etc.
For obvious reasons, I can't use the actual project in this article. Instead, we will use a public project, docker-rails. I chose this project because its dependencies are quite similar to those of my own project. They include:
Sidekiq
Postgres
Redis
Opensearch
Having these dependencies will give us the opportunity to solve a number of problems that I would like to consider in this article.
Well, our adventures begin with cloning the project into the workspace.
mkdir -p ansidock
cd ansidock
git clone https://github.com/ledermann/docker-rails
Inventory
The Ansible inventory is responsible for the list of nodes/hosts that Ansible should work with and how exactly it should work with them. Settings ranging from the connection plugin used to the location of the python interpreter on each node are configured using an inventory file. In this case, I'm using it to create a topology for my local development environment.
At a high level, we need to highlight two different nodes:
the parent node where docker containers are created (and managed)
a container node in which the desired application runs
To understand the intent behind this structure, let's look at the current Ansible + Vagrant + VirtualBox setup.
Vagrant creates a VirtualBox-based virtual machine and passes information about the new machine to Ansible, which then configures the new box to run the required application. In other words, Vagrant works with VirtualBox on the real host, the developer's machine, to provision a virtual machine that Ansible then configures to run the target application.
As a result of all this, we will get the following inventory file:
# file: ansidock/dev
[dockerhost]
localhost
[app]
[postgres]
[redis]
[opensearch]
[containers:children]
app
postgres
redis
opensearch
[all:vars]
ansible_python_interpreter=/usr/bin/python3
[dockerhost:vars]
ansible_connection=local
[containers:vars]
ansible_connection=docker
The control node is specified as dockerhost, with the connection going to localhost. The redis, postgres, and opensearch hosts, conveniently grouped under the containers tag, are remote nodes that can be reached via the docker connection plugin.
Note that none of the container hosts contain any connection information (such as URL or IP address). This is because they will be created on the fly, and one of the tricks I'll use moving forward is to dynamically populate this information as the containers are created.
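The mechanism behind this dynamic population is Ansible's add_host module. As a minimal sketch (the container and group names here are illustrative, mirroring what our role will do later):

```yaml
# Hypothetical sketch: after creating a container, register it in the
# matching (currently empty) inventory group so later plays can target it.
- name: add freshly created container to the postgres group
  ansible.builtin.add_host:
    name: ansidock_db        # must match the docker container name
    groups:
      - postgres             # one of the empty groups in the dev inventory
  changed_when: false
```

Because the containers group carries ansible_connection=docker, any host added this way is reached through the docker connection plugin, with the container name serving as the address.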
Network
In the simplified world inside a VirtualBox virtual machine, every process can communicate with any neighboring process. So the first step is to provide an equally wide-open network for our future services.
Docker allows you to set up different types of networks. For my purposes, a basic bridge network is sufficient.
# file: ansidock/network.yml
---
- hosts: dockerhost
  tasks:
    - name: "Create docker network: {{ network_name }}"
      community.docker.docker_network:
        name: "{{ network_name }}"
        driver: bridge
This network will allow any container on it to contact any other container on the network using the container's IP address, name, or alias.
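If you want to verify that claim once the containers exist, a quick check (a sketch; it assumes the community.docker collection and that the image ships a ping binary, as the busybox-based alpine images do) could ping one container from another by name:

```yaml
# Hypothetical connectivity check over the bridge network.
- name: verify name-based connectivity
  hosts: dockerhost
  tasks:
    - name: ping ansidock_db from ansidock_redis by container name
      community.docker.docker_container_exec:
        container: ansidock_redis
        command: ping -c 1 ansidock_db
```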
Dependencies
Our target application requires three main dependencies: Postgres, Redis, and Opensearch.
In the actual project I'm working on, these dependencies are installed directly into the virtual machine, and Ansible scripts for that installation already exist. So we could follow the same line and explicitly install each dependency into an empty container.
However, I have a second phase planned – the phase in which I will abandon Ansible. This way, we can already start deploying dedicated containers for each of these services. What's more, there are ready-to-use container options for these projects, making them fairly easy to implement. With this in mind, we will deploy these dependencies using their official docker images instead of running their installation playbooks.
Now we will need to do some repeatable manipulations with the containers, so now is the best time to create a reusable ansible role to abstract these tasks:
# file: ansidock/roles/container/tasks/main.yml
---
- name: "pull {{ image }}:{{ image_tag }}"
  community.docker.docker_image:
    name: "{{ image }}"
    tag: "{{ image_tag }}"
    source: pull

- name: "create {{ container_name }} container"
  community.docker.docker_container:
    name: "{{ container_name }}"
    image: "{{ image }}:{{ image_tag }}"
    command: "{{ container_command }}"
    auto_remove: yes
    detach: yes
    env: "{{ container_env }}"
    ports: "{{ container_ports }}"
    volumes: "{{ container_volumes }}"
    working_dir: "{{ container_workdir }}"
    networks:
      - name: "{{ network_name }}"

- name: "add {{ container_name }} container to host group: {{ container_host_group }}"
  ansible.builtin.add_host:
    name: "{{ container_name }}"
    groups:
      - "{{ container_host_group }}"
  changed_when: false
  when: container_host_group is defined

- name: "update {{ container_name }} package register"
  ansible.builtin.command:
    cmd: 'docker exec {{ container_name }} /bin/bash -c "apt-get update"'
  when: container_deps is defined

- name: install dependencies
  ansible.builtin.command:
    cmd: 'docker exec {{ container_name }} /bin/bash -c "apt-get install -y {{ container_deps | join(" ") }}"'
  when: container_deps is defined
with the following default variables
# file: ansidock/roles/container/defaults/main.yml
---
container_command:
container_env: {}
container_host_group:
container_ports: []
container_volumes: []
container_workdir:
This role covers pulling an image, creating a container from it, and attaching the container to the docker network. It can also install extra packages inside the container. And, as you may have noticed, there is an add_host task defined to fill the empty host groups in the inventory.
Using this role, we create a playbook for our three dependencies:
# file: ansidock/dependencies.yml
---
- name: Postgres database
  hosts: dockerhost
  vars:
    image: "{{ postgres_image }}"
    image_tag: "{{ postgres_version }}"
    container_name: "{{ postgres_container_name }}"
    container_env: "{{ postgres_env }}"
    container_ports: "{{ postgres_ports }}"
    container_host_group: postgres
  roles:
    - container

- name: Redis cache
  hosts: dockerhost
  vars:
    image: "{{ redis_image }}"
    image_tag: "{{ redis_version }}"
    container_name: "{{ redis_container_name }}"
    container_host_group: redis
  roles:
    - container

- name: Opensearch library
  hosts: dockerhost
  vars:
    image: "{{ opensearch_image }}"
    image_tag: "{{ opensearch_version }}"
    container_name: "{{ opensearch_container_name }}"
    container_env: "{{ opensearch_env }}"
    container_host_group: opensearch
  roles:
    - container
This brings us about 40% closer to our goal.
Here's our progress so far:
inventory that will be dynamically updated as containers are created
docker network so all containers can communicate with each other
application dependencies available in native containers
It would also be a good idea to check our work. First, we supply values for the variables used in the playbooks:
# file: ansidock/group_vars/all.yml
---
network_name: ansidocknet
app_dir: /app/ansidock
app_ruby_version: 3.2.1
app_bundler_version: 2.4.6
# file: ansidock/group_vars/dockerhost.yml
---
postgres_image: postgres
postgres_version: 15-alpine
postgres_container_name: ansidock_db
postgres_ports:
  - "8765:5432"
postgres_env:
  POSTGRES_PASSWORD: password
  POSTGRES_USER: postgres
  POSTGRES_DB: ansidock

opensearch_image: opensearchproject/opensearch
opensearch_version: latest
opensearch_container_name: ansidock_search
opensearch_env:
  discovery.type: single-node
  plugins.security.disabled: "true"

redis_image: redis
redis_version: alpine
redis_container_name: ansidock_redis
then let's run:
ansible-playbook -i dev network.yml dependencies.yml
After the script completes successfully, we perform the following checks:
docker container ls --format "{{.Names}}"
# expect three containers: ansidock_search, ansidock_redis, ansidock_db
docker network inspect --format "{{range .Containers}}{{println .Name}}{{end}}" ansidocknet
# ansidock_search, ansidock_redis, ansidock_db
Great. Let's move on.
Application
Before we dive into the application, take a look at the value of the hosts attribute in the playbooks.
Please note that so far Ansible has worked on dockerhost, that is, on the control node. So in the cases where we needed to perform an operation on a container, we did not run Ansible against the container directly; instead we had Ansible execute a shell command on the control node, which then ran the operation through the docker CLI.
For example, look at the install dependencies task from the container role above:
- name: install dependencies
  ansible.builtin.command:
    cmd: 'docker exec {{ container_name }} /bin/bash -c "apt-get install -y {{ container_deps | join(" ") }}"'
  when: container_deps is defined
If the target at this stage were the container node, the task would be much simpler:

- name: install dependencies
  ansible.builtin.apt:
    name: "{{ container_deps }}"
  when: container_deps is defined
But since ansible for this task runs from the control node, that is, from the host machine with containers, we must explicitly execute the commands using the docker cli.
This difference shows up in the application playbook, where in the first part we will need to create a container (using the management node as the host), and in the second we will need to configure this container (switching to the container node as the host). So watch out for switching.
The first step is to create an empty container analogous to the Debian virtual machine that Vagrant hands to Ansible for project setup. We will take the base Ubuntu image as our starting point. As with the dependencies, we will use the container role for the setup.
However, before we do that, we need to figure out what to do with the container's main process. The life cycle of a Docker container revolves around its main process (PID 1), and handling it correctly has been the subject of many ideas, lessons, and disappointments in container management.
Our problem is that the intended main process, the rails server, will only exist after Ansible has done its work on the container. But for Ansible to reach the container, the container must already be running, and for the container to run it needs a main process, which we'd like to be the rails server… The obvious way out is to hand PID 1 to some other long-lived task (e.g. sleep infinity) and start the rails server later, once it's ready. This is a step in the right direction, with the caveat that whatever runs as the main process should also take charge of the rails processes and any other child processes that might appear.
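The naive version of this idea, applied to our container role, would simply override the container command (a sketch of the approach we are about to improve on, not what we'll end up using; the names are illustrative):

```yaml
# Naive placeholder main process: keeps the container alive so Ansible
# can reach it, but leaves anything started later unsupervised.
- name: create app container with a placeholder PID 1
  community.docker.docker_container:
    name: ansidock_app
    image: ubuntu:23.04
    command: sleep infinity
    detach: yes
```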
Fortunately, this is not such a difficult task. The Linux ecosystem is rich in applications written for just this purpose. Of all the variety of options, we will focus on supervisord. Supervisord, in addition to the desired behavior, allows you to add (and remove) child processes at any time. We'll use this later to start our rails processes.
With that sorted out, the next task is clear: put together a set of tasks that will give us a base image with supervisord and provide the ability to change the supervisord configuration as needed.
# file: ansidock/roles/supervisor/tasks/build.yml
---
- name: create temp directory for build
  ansible.builtin.tempfile:
    state: directory
  register: build_dir

- name: generate dockerfile
  ansible.builtin.template:
    src: dockerfile.j2
    dest: "{{ build_dir.path }}/Dockerfile"

- name: generate supervisord conf
  ansible.builtin.template:
    src: supervisord.conf.j2
    dest: "{{ build_dir.path }}/supervisord.conf"

- name: build supervisord image
  community.docker.docker_image:
    name: "{{ image }}"
    tag: "{{ image_tag }}"
    source: build
    state: present
    force_source: true
    build:
      path: "{{ build_dir.path }}"
      pull: yes
The following two templates are required to complete the task:
simple supervisord configuration
; file: ansidock/roles/supervisor/templates/supervisord.conf.j2
[supervisord]
logfile=/tmp/supervisord.log
loglevel=debug
nodaemon=true
user=root
and the docker image that uses it
# file: ansidock/roles/supervisor/templates/dockerfile.j2
# syntax=docker/dockerfile:1
FROM ubuntu:23.04
RUN apt-get update \
&& apt-get install -y supervisor \
&& mkdir -p /var/log/supervisor
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
CMD ["/usr/bin/supervisord", "-n"]
The end result will be an Ubuntu image with supervisord installed.
And now that we have the image, we can pass it to run our container role. So here is the first part of our application playbook:
# file: ansidock/application.yml
---
- name: Prepare application container image
  hosts: dockerhost
  vars:
    aim: build
    image: "{{ app_image }}"
    image_tag: "{{ app_image_version }}"
    container_name: "{{ app_container_name }}"
    container_env: "{{ app_env }}"
    container_ports: "{{ app_ports }}"
    container_host_group: app
    container_workdir: "{{ app_dir }}"
    container_volumes:
      - "{{ playbook_dir }}/{{ app_src }}:{{ app_dir }}"
    container_deps:
      - python3-apt
      - python3
      - python3-pip
  roles:
    - supervisor
    - container
Note the additional python dependencies we install into the container. They will allow us to use ansible commands directly in the container (which we will use in the next step). Of course, we can (and should) include them when building the base image, but it won't be as interesting.
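With python3 in place, we can sanity-check that Ansible really can drive the container over the docker connection plugin. A minimal smoke test (a sketch; the file name ping.yml is an assumption) might look like:

```yaml
# Hypothetical smoke test, run as: ansible-playbook -i dev ping.yml
# Confirms every container host answers over its connection plugin.
- name: verify container reachability
  hosts: containers
  gather_facts: false
  tasks:
    - name: ping each container through the docker connection plugin
      ansible.builtin.ping:
```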
Also, in case you didn't notice: at this point we have already mounted our project source into the container (via container_volumes).
Now that we have a properly prepared base image, everything is ready for my real project. All that was left was to point the Ansible scripts at the container host, and voila: a working environment equivalent to the existing virtual machine solution. But since you've stuck with me this far, we'll finish this part by running the docker-rails project in our setup.
All that remains is to configure the container to run the rails application. To do this, we need to install Ruby with all its dependencies, the Ruby dependency manager Bundler, Node.js and its package manager Yarn, and then prepare the application database.
# file: ansidock/roles/ruby/tasks/main.yml
---
- name: install rbenv and app dependencies
  ansible.builtin.apt:
    name:
      - autoconf
      - bison
      - build-essential
      - git
      - imagemagick
      - libdb-dev
      - libffi-dev
      - libgdbm-dev
      - libgdbm6
      - libgmp-dev
      - libncurses5-dev
      - libpq-dev
      - libreadline6-dev
      - libssl-dev
      - libyaml-dev
      - patch
      - rbenv
      - ruby-build
      - rustc
      - tzdata
      - uuid-dev
      - zlib1g-dev
    state: present
    update_cache: true

- name: register rbenv root
  ansible.builtin.command:
    cmd: rbenv root
  register: rbenv_root

- name: install ruby-build rbenv plugin
  ansible.builtin.git:
    repo: https://github.com/rbenv/ruby-build.git
    dest: "{{ rbenv_root.stdout }}/plugins/ruby-build"

- name: "install ruby {{ ruby_version }}"
  ansible.builtin.command:
    cmd: "rbenv install {{ ruby_version }}"
    creates: "{{ rbenv_root.stdout }}/versions/{{ ruby_version }}/bin/ruby"
  environment:
    CONFIGURE_OPTS: "--disable-install-doc"
    RBENV_ROOT: "{{ rbenv_root.stdout }}"
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"

- name: install bundler
  ansible.builtin.gem:
    name: bundler
    version: "{{ bundler_version }}"
  environment:
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"

- name: install app gems
  community.general.bundler:
    state: present
    executable: "{{ rbenv_root.stdout }}/shims/bundle"

- name: remove conflicting yarn bin
  ansible.builtin.apt:
    package: cmdtest
    state: absent

- name: add yarn source key
  block:
    - name: yarn | apt key
      ansible.builtin.get_url:
        url: https://dl.yarnpkg.com/debian/pubkey.gpg
        dest: /etc/apt/trusted.gpg.d/yarn.asc
    - name: yarn | apt source
      ansible.builtin.apt_repository:
        repo: "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/yarn.asc] https://dl.yarnpkg.com/debian/ stable main"
        state: present
        update_cache: true

- name: install yarn
  ansible.builtin.apt:
    package: yarn

- name: install javascript packages
  ansible.builtin.command:
    cmd: yarn install --frozen-lockfile
  environment:
    NODE_OPTIONS: "--openssl-legacy-provider"

- name: prepare database
  ansible.builtin.command:
    cmd: bundle exec rails db:prepare
  environment:
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"

- name: precompile assets
  ansible.builtin.command:
    cmd: bundle exec rails assets:precompile
  environment:
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"
    NODE_OPTIONS: "--openssl-legacy-provider"
Nothing fancy, just standard tasks that any good rubyist should have done at some point in their career.
If it looks overly cluttered, that's because it works around a number of common problems: on Ubuntu the yarn name is claimed by cmdtest, which has to be explicitly removed and replaced with the actual Yarn JavaScript package manager; the rbenv and ruby-build versions in the apt repositories are outdated; and so on. In any case, these intricate subtleties don't concern us here, so let's move on.
Now that we're ready to run the application, we need to instruct supervisord to help us do it.
# file: ansidock/roles/supervisor/tasks/reconfigure.yml
---
- name: generate supervisor conf
  ansible.builtin.template:
    src: program.conf.j2
    dest: "/etc/supervisor/conf.d/{{ filename }}"
  vars:
    command: "{{ item.value }}"
    program: "{{ item.key }}"
    filename: "{{ item.key }}.conf"
    workdir: "{{ container_workdir }}"
  with_dict: "{{ programs }}"

- name: restart supervisord
  community.general.supervisorctl:
    name: "{{ item.key }}"
    config: /etc/supervisor/supervisord.conf
    state: present
  with_dict: "{{ programs }}"
The first task takes a map of program names to commands, renders a supervisord config for each program from the template below, and places it in the container.
; file: ansidock/roles/supervisor/templates/program.conf.j2
[program:{{ program }}]
command={{ command }}
directory={{ workdir }}
startretries=10
stdout_logfile={{ workdir }}/log/development.log
user=root
Yes, and the second task makes supervisord pick up and start the new program(s): with state: present, the supervisorctl module adds the program from the freshly generated config rather than performing a full restart.
Since this role serves both to create the base image and to change the configuration of the supervisord process, let's add a parent task that will switch between these two actions:
# file: ansidock/roles/supervisor/tasks/main.yml
---
- include_tasks: build.yml
  when: aim == "build"

- include_tasks: reconfigure.yml
  when: aim == "configure"
As you probably realized, we haven't paid much attention to Sidekiq so far. This is because it runs the same rails application, just through a different process. Therefore, everything we did for the main application applies to it as well. We'll highlight it only now, when we've finished working on our application's playbook:
# file: ansidock/application.yml
---
- name: prepare application container image
  hosts: dockerhost
  vars:
    aim: build
    image: "{{ app_image }}"
    image_tag: "{{ app_image_version }}"
    container_name: "{{ app_container_name }}"
    container_env: "{{ app_env }}"
    container_ports: "{{ app_ports }}"
    container_host_group: app
    container_workdir: "{{ app_dir }}"
    container_volumes:
      - "{{ playbook_dir }}/{{ app_src }}:{{ app_dir }}"
    container_deps:
      - python3-apt
      - python3
      - python3-pip
  roles:
    - supervisor
    - container

- name: setup application container
  hosts: app
  vars:
    aim: configure
    container_workdir: "{{ app_dir }}"
    ruby_version: "{{ app_ruby_version }}"
    bundler_version: "{{ app_bundler_version }}"
    programs:
      app: "/root/.rbenv/shims/bundle exec puma -C config/puma.rb"
      worker: "/root/.rbenv/shims/bundle exec sidekiq"
  roles:
    - ruby
    - supervisor
And our work is finished.
Let's collect everything into one tidy playbook:
# file: ansidock/site.yml
---
- ansible.builtin.import_playbook: network.yml
- ansible.builtin.import_playbook: dependencies.yml
- ansible.builtin.import_playbook: application.yml
and accompanying vars file
# file: ansidock/group_vars/dockerhost.yml
---
postgres_image: postgres
postgres_version: 15-alpine
postgres_container_name: ansidock_db
postgres_ports:
  - "8765:5432"
postgres_env:
  POSTGRES_PASSWORD: password
  POSTGRES_USER: postgres
  POSTGRES_DB: ansidock

opensearch_image: opensearchproject/opensearch
opensearch_version: latest
opensearch_container_name: ansidock_search
opensearch_env:
  discovery.type: single-node
  plugins.security.disabled: "true"

redis_image: redis
redis_version: alpine
redis_container_name: ansidock_redis

app_image: rails_supervisor
app_image_version: 2
app_container_name: ansidock_app
app_src: docker-rails
app_ports:
  - "7000:3000"
app_env:
  DB_HOST: "{{ postgres_container_name }}"
  DB_USER: "{{ postgres_env.POSTGRES_USER }}"
  DB_PASSWORD: "{{ postgres_env.POSTGRES_PASSWORD }}"
  OPENSEARCH_HOST: "{{ opensearch_container_name }}"
  REDIS_SIDEKIQ_URL: "redis://{{ redis_container_name }}:6379/0"
  REDIS_CABLE_URL: "redis://{{ redis_container_name }}:6379/1"
  REDIS_CACHE_URL: "redis://{{ redis_container_name }}:6379/2"
  SECRET_KEY_BASE: some-super-secret-from-ansible-vault
  RAILS_MASTER_KEY: another-super-secret-from-ansible-vault
  APP_ADMIN_EMAIL: admin@example.org
  APP_ADMIN_PASSWORD: secret
  APP_EMAIL: reply@example.org
  PLAUSIBLE_SCRIPT: https://plausible.example.com/js/script.js
Finally, let's take our work for a test drive:
ansible-playbook -i dev site.yml
If everything went well, docker container ls should show our four containers running normally, and visiting localhost:7000 should greet us with the sample application in all its glory.
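For an automated version of that last check, a task using the uri module could wait for the app to answer on the published port (a sketch under the assumption that the app returns HTTP 200 on its root path once ready):

```yaml
# Hypothetical readiness check against the published port 7000.
- name: wait for the rails app to respond
  hosts: dockerhost
  gather_facts: false
  tasks:
    - name: poll http://localhost:7000 until it returns 200
      ansible.builtin.uri:
        url: http://localhost:7000
      register: result
      until: result.status == 200
      retries: 10
      delay: 5
```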
So we did it.
Conclusion
This exercise helped answer the questions:
Can I replace vagrant + virtualBox with docker?
If yes, how can I make it easier?
But this is not the final stop. Everything you need is put into containers, and they are now a suitable target for many modern tools.
To begin with, we can take a snapshot of the application after it is created. By creating an image from a running container, we have a working base that we can use to bypass the entire Ansible creation process we just went through.
And armed with these new capabilities, we will begin the second stage – docker-compose.