Migrating a project from Vagrant + VirtualBox to Docker containers using Ansible

Before containers became a thing, the main tools for creating a local development environment were technologies like Vagrant and VirtualBox. Combined with automation tools such as Ansible and Chef, they let us build working, reproducible environments for our applications. However, the rise of lightweight virtualization, pioneered by Docker and steadily simplified by various cloud innovations, has pushed these once very popular developer tools into decline, so much so that when we run into them anywhere, we involuntarily start wondering about the age of the code base.

And recently I ran into them myself. To be more precise, I inherited a project that still relies on them: it installs a Debian virtual machine in VirtualBox, created with Vagrant and then configured with Ansible. And it all works. Well, for the most part. But when it doesn't work, figuring out what went wrong is a real pain. Keeping Vagrant and VirtualBox coordinated turned out to be a particularly nasty piece of black magic, and it got me thinking about cheaper, friendlier virtualization alternatives.

By the end of my introduction to the project, I had developed a three-step plan:

  1. Replace Vagrant and VirtualBox with Docker. That is, use Ansible to move the project into Docker containers.

  2. Replace Ansible with docker-compose, accompanied by dedicated Dockerfiles for each of the applications involved in the project.

  3. Extend the tooling change to other environments: staging and production.

This article provides documentation for the first phase of this plan.

Why did I decide to write this article?

Using Ansible to set up and run Docker containers is not the most common scenario. In fact, one of the reasons for writing this article was that I could not find a single resource that pulled together everything I needed. The lack of such resources is understandable: once you manage to automate a project to some degree, adding one more piece becomes fairly easy thanks to the many tools available.

But this, unfortunately, is a different case. Given that I am not very familiar with the project, it would take a lot of time and effort to track down all the necessary dependencies scattered across several Ansible roles and playbooks, figure out how they interact, and reproduce them correctly in isolated containers.

However, since the requirements for these dependencies are already captured in the Ansible playbooks, I can make things much easier for myself by simply pointing Ansible at an empty container and having it configure that container the same way it configures a VirtualBox virtual machine. After all, this is one of the amazing features of Ansible: give it (ssh) access to any machine and it will give you the environment you want. In this case, the machine with such access will be a Docker container, and the desired environment will be the development environment described in the various Ansible playbooks.
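To make that concrete, here is the kind of ad-hoc call the docker connection plugin enables (just a sketch: web is a hypothetical running container, and the community.docker collection must be installed):

# run a command inside a running container without ssh or python,
# using Ansible's docker connection plugin and an inline host list
ansible all -i 'web,' -c docker -m raw -a 'uname -a'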

Project

My project is a fairly large Ruby-on-Rails application with a bunch of typical dependencies such as Sidekiq, Nginx, etc.

For obvious reasons, I can't use the actual project in this article. Instead, we will use a public project, docker-rails. I chose it because its dependencies are quite similar to those of my own project. They include:

  • Sidekiq

  • Postgres

  • Redis

  • Opensearch

Having these dependencies on board will let us work through a number of problems I would like to cover in this article.

Well, our adventures begin with cloning the project into the workspace.

mkdir -p ansidock
cd ansidock
git clone https://github.com/ledermann/docker-rails

Inventory

The Ansible inventory defines the list of nodes/hosts Ansible should work with and how exactly it should work with them. Everything from the connection plugin in use to the location of the Python interpreter on each node is configured through an inventory file. In this case, I'm using it to describe the topology of my local development environment.

At a high level, we need to highlight two different nodes:

  1. the parent node where docker containers are created (and managed)

  2. a container node in which the desired application runs

To understand the intent behind this structure, let's look at the current Ansible + Vagrant + VirtualBox setup.

Vagrant creates a VirtualBox-backed virtual machine and then hands information about the new machine over to Ansible. Ansible then configures the new box to run the required application. In other words, Vagrant works with VirtualBox on the real host, the developer's machine, to provision a virtual machine, which Ansible then configures to run the target application.

As a result of all this, we will get the following inventory file:

# file: ansidock/dev
[dockerhost]
localhost

[app]
[postgres]
[redis]
[opensearch]

[containers:children]
app
postgres
redis
opensearch

[all:vars]
ansible_python_interpreter=/usr/bin/python3

[dockerhost:vars]
ansible_connection=local

[containers:vars]
ansible_connection=docker

The control node is defined as dockerhost with a connection to localhost. The app, postgres, redis and opensearch hosts, conveniently grouped under containers, are remote nodes reached over the docker connection.

Note that none of the container hosts contain any connection information (such as URL or IP address). This is because they will be created on the fly, and one of the tricks I'll use moving forward is to dynamically populate this information as the containers are created.
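You can inspect this static topology at any point with ansible-inventory; until the playbooks run, the container groups simply show up empty:

ansible-inventory -i dev --graph
# expect localhost under @dockerhost, and empty app/postgres/redis/opensearch
# groups under @containers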

Network

In the simplified world inside a VirtualBox virtual machine, every process can talk to any neighboring process. So the first step is to provide an equally wide-open network for the future services.

Docker allows you to set up different types of networks. For my purposes, a basic bridge network is sufficient.

# file: ansidock/network.yml
---
- hosts: dockerhost
  tasks:
    - name: "Create docker network: {{ network_name }}"
      community.docker.docker_network:
        name: "{{ network_name }}"
        driver: bridge

This network will allow any container on it to contact any other container on the network using the container's IP address, name, or alias.

Dependencies

Our target application requires three main dependencies: Postgres, Redis and Opensearch.

In the actual project I'm working on, these dependencies are installed directly into the virtual machine, and the Ansible scripts for that installation already exist. So we could take the same route and explicitly install each dependency into an empty container.

However, I have a second phase planned, the one in which I abandon Ansible. So we might as well start deploying dedicated containers for each of these services now. What's more, ready-made official containers exist for all of these projects, which makes them fairly easy to adopt. With that in mind, we will deploy these dependencies from their official Docker images instead of running their installation playbooks.

We will need to repeat the same container manipulations several times, so now is the best time to create a reusable Ansible role that abstracts these tasks:

# file: ansidock/roles/container/tasks/main.yml
---
- name: "pull {{ image }}:{{ image_tag }}"
  community.docker.docker_image:
    name: "{{ image }}"
    tag: "{{ image_tag }}"
    source: pull

- name: "create {{ container_name }} container"
  community.docker.docker_container:
    name: "{{ container_name }}"
    image: "{{ image }}:{{ image_tag }}"
    command: "{{ container_command }}"
    auto_remove: yes
    detach: yes
    env: "{{ container_env }}"
    ports: "{{ container_ports }}"
    volumes: "{{ container_volumes }}"
    working_dir: "{{ container_workdir }}"
    networks:
      - name: "{{ network_name }}"

- name: "add {{ container_name }} container to host group: {{ container_host_group }}"
  ansible.builtin.add_host:
    name: "{{ container_name }}"
    groups:
      - "{{ container_host_group }}"
  changed_when: false
  when: container_host_group is defined

- name: "update {{ container_name }} package register"
  ansible.builtin.command:
    cmd: 'docker exec {{ container_name }} /bin/bash -c "apt-get update"'
  when: container_deps is defined

- name: install dependencies
  ansible.builtin.command:
    cmd: 'docker exec {{ container_name }} /bin/bash -c "apt-get install -y {{ container_deps | join(" ") }}"'
  when: container_deps is defined

with the following default variables

# file: ansidock/roles/container/defaults/main.yml
---
container_command:
container_env: {}
container_host_group:
container_ports: []
container_volumes: []
container_workdir:

This role covers pulling an image, creating a container from it and attaching the container to the Docker network. It can also install packages inside the container. And, as you may have noticed, there is an add_host task defined here; it is what fills the empty host groups in the inventory.

Using this role, we create a playbook for our three dependencies:

# file: ansidock/dependencies.yml
---
- name: Postgres database
  hosts: dockerhost
  vars:
    image: "{{ postgres_image }}"
    image_tag: "{{ postgres_version }}"
    container_name: "{{ postgres_container_name }}"
    container_env: "{{ postgres_env }}"
    container_ports: "{{ postgres_ports }}"
    container_host_group: postgres
  roles:
    - container

- name: Redis cache
  hosts: dockerhost
  vars:
    image: "{{ redis_image }}"
    image_tag: "{{ redis_version }}"
    container_name: "{{ redis_container_name }}"
    container_host_group: redis
  roles:
    - container

- name: Opensearch search engine
  hosts: dockerhost
  vars:
    image: "{{ opensearch_image }}"
    image_tag: "{{ opensearch_version }}"
    container_name: "{{ opensearch_container_name }}"
    container_env: "{{ opensearch_env }}"
    container_host_group: opensearch
  roles:
    - container

This brings us about 40% closer to our goal.

Here's our progress so far:

  • inventory that will be dynamically updated as containers are created

  • docker network so all containers can communicate with each other

  • application dependencies available in native containers

It would also be a good idea to check our work. For the variable values used in the playbooks, we will provide:

# file: ansidock/group_vars/all.yml
---
network_name: ansidocknet
app_dir: /app/ansidock
app_ruby_version: 3.2.1
app_bundler_version: 2.4.6

# file: ansidock/group_vars/dockerhost.yml
---
postgres_image: postgres
postgres_version: 15-alpine
postgres_container_name: ansidock_db
postgres_ports:
  - 8765:5432
postgres_env:
  POSTGRES_PASSWORD: password
  POSTGRES_USER: postgres
  POSTGRES_DB: ansidock

opensearch_image: opensearchproject/opensearch
opensearch_version: latest
opensearch_container_name: ansidock_search
opensearch_env:
  discovery.type: single-node
  plugins.security.disabled: "true"

redis_image: redis
redis_version: alpine
redis_container_name: ansidock_redis

then let's run:

ansible-playbook -i dev network.yml dependencies.yml

After the script completes successfully, we perform the following checks:

docker container ls --format "{{.Names}}" 
# expect three containers: ansidock_search, ansidock_redis, ansidock_db
docker network inspect --format "{{range .Containers}}{{println .Name}}{{end}}" ansidocknet
# ansidock_search, ansidock_redis, ansidock_db
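For intuition, what the container role just did for Postgres is roughly equivalent to the following docker CLI call (a hand-written equivalent based on the variables above, not something the playbook outputs):

docker run --rm -d \
  --name ansidock_db \
  --network ansidocknet \
  -p 8765:5432 \
  -e POSTGRES_PASSWORD=password -e POSTGRES_USER=postgres -e POSTGRES_DB=ansidock \
  postgres:15-alpine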

Great. Let's move on.

Application

Before we dive into the application, take a look at the value of the hosts attribute in the playbooks.

Note that so far Ansible has operated on dockerhost, that is, on the control node. So whenever we needed to perform an operation on a container, we did not run Ansible against the container directly; instead we used Ansible to execute a shell command on the control node, which in turn ran the command through the docker CLI.

For example, look at the install dependencies task from the container role above.

- name: install dependencies
  ansible.builtin.command:
    cmd: 'docker exec {{ container_name }} /bin/bash -c "apt-get install -y {{ container_deps | join(" ") }}"'
  when: container_deps is defined

If the target at this stage were the container node, the task would be much simpler:

- name: install dependencies
  ansible.builtin.apt:
    package: "{{ container_deps }}"
  when: container_deps is defined

But since Ansible runs this task from the control node, that is, from the machine that hosts the containers, we have to execute the commands explicitly through the docker CLI.

This difference shows up in the application playbook: in its first part we create the container (using the control node as the host), and in the second we configure that container (switching to the container node as the host). So keep an eye on the hosts value as it switches.

The first step is to create an empty container analogous to the Debian virtual machine that Vagrant provisions for Ansible to configure. We will use a base Ubuntu image for this. As with the dependencies, we will use the container role for the setup.

However, before we do that, we need to decide what to do about the container's main process. The life cycle of a Docker container revolves around its main process (PID 1), and handling it correctly is the subject of many ideas, findings and frustrations in container management.

Our problem is that the intended main process, the Rails server, will only exist after Ansible has done its work on the container. But for Ansible to reach the container, the container must already be running. And for the container to be running, we would like it to be running... a Rails server. The obvious way out is to hand PID 1 to some other long-lived task (e.g. sleep infinity) and start the Rails server later, once it's ready. That's a step in the right direction, with the caveat that whatever runs as the main process should also take charge of the Rails processes and any other child processes that appear.

Fortunately, this is not a hard problem. The Linux ecosystem is full of programs written for exactly this purpose. Out of all the options we will settle on supervisord: besides the desired behavior, it lets us add (and remove) child processes at any time, which we will use later to start our Rails processes.
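As a preview of what that will look like, adding a program to a running supervisord boils down to dropping a config file into its conf.d directory and asking supervisorctl to pick it up, roughly like this (a manual sketch; the supervisor role below automates this via the supervisorctl module):

# inside the container, after placing app.conf under /etc/supervisor/conf.d/
supervisorctl -c /etc/supervisor/supervisord.conf reread
supervisorctl -c /etc/supervisor/supervisord.conf add app
supervisorctl -c /etc/supervisor/supervisord.conf status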

With that sorted out, the next task is clear: put together a set of tasks that will give us a base image with supervisord and provide the ability to change the supervisord configuration as needed.

# file: ansidock/roles/supervisor/tasks/build.yml
---
- name: create temp directory for build
  ansible.builtin.tempfile:
    state: directory
  register: build_dir

- name: generate dockerfile
  ansible.builtin.template:
    src: dockerfile.j2
    dest: '{{ build_dir.path }}/Dockerfile'

- name: generate supervisord conf
  ansible.builtin.template:
    src: supervisord.conf.j2
    dest: '{{ build_dir.path }}/supervisord.conf'

- name: build supervisord image
  community.docker.docker_image:
    name: "{{ image }}"
    tag: "{{ image_tag }}"
    source: build
    state: present
    force_source: true
    build:
      path: "{{ build_dir.path }}"
      pull: yes

The following two templates are required to complete the task:

simple supervisord configuration

; file: ansidock/roles/supervisor/templates/supervisord.conf.j2
[supervisord]
logfile=/tmp/supervisord.log
loglevel=debug
nodaemon=true
user=root

and the Dockerfile template that uses it

# file: ansidock/roles/supervisor/templates/dockerfile.j2
# syntax=docker/dockerfile:1
FROM ubuntu:23.04

RUN apt-get update \
    && apt-get install -y supervisor \
    && mkdir -p /var/log/supervisor

COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf

CMD ["/usr/bin/supervisord", "-n"]

The end result will be an Ubuntu image with supervisord installed.
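Once the application playbook later in this article has been run, a quick sanity check on the built image looks like this (the image name and tag come from the vars file shown further down):

docker image inspect rails_supervisor:2 --format '{{ .Config.Cmd }}'
# expect: [/usr/bin/supervisord -n]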

And now that we have the image, we can feed it to our container role. So here is the first part of our application playbook:

# file: ansidock/application.yml
---
- name: Prepare application container image
  hosts: dockerhost
  vars:
    aim: build
    image: "{{ app_image }}"
    image_tag: "{{ app_image_version }}"
    container_name: "{{ app_container_name }}"
    container_env: "{{ app_env }}"
    container_ports: "{{ app_ports }}"
    container_host_group: app
    container_workdir: "{{app_dir}}"
    container_volumes:
      - "{{playbook_dir}}/{{app_src}}:{{app_dir}}"
    container_deps:
      - python3-apt
      - python3
      - python3-pip
  roles:
    - supervisor
    - container

Note the additional Python packages we install into the container. They are what will allow Ansible modules to run directly inside the container (something we rely on in the next step). Of course, we could (and should) bake them into the base image, but that wouldn't be as interesting.
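Once the container is up, a quick way to confirm that Ansible can now talk to it directly is an ad-hoc ping over the docker connection (the container name comes from the vars file shown later; the inline host list bypasses the static inventory):

ansible all -i 'ansidock_app,' -c docker -m ping \
  -e ansible_python_interpreter=/usr/bin/python3
# expect a SUCCESS / "pong" response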

Also, in case you missed it: this is the point where our project's source code gets mounted into the container (via container_volumes).

Now that we have a properly prepared base image, everything is ready for my real project. All that was left was to point the Ansible scripts at the container host, and voila, I had a working environment equivalent to my existing virtual machine setup. But since you've stuck with me this far, we'll finish this part by getting the docker-rails project running in our new setup.

All that remains is to configure the container to run the Rails application. To do this, we need to install Ruby with all its dependencies, the Ruby dependency manager Bundler, Node.js and its package manager Yarn, and also prepare the application's database.

# file: ansidock/roles/ruby/tasks/main.yml
---
- name: install rbenv and app dependencies
  ansible.builtin.apt:
    name:
      - autoconf
      - bison
      - build-essential
      - git
      - imagemagick
      - libdb-dev
      - libffi-dev
      - libgdbm-dev
      - libgdbm6
      - libgmp-dev
      - libncurses5-dev
      - libpq-dev
      - libreadline6-dev
      - libssl-dev
      - libyaml-dev
      - patch
      - rbenv
      - ruby-build
      - rustc
      - tzdata
      - uuid-dev
      - zlib1g-dev
    state: present
    update_cache: true

- name: register rbenv root
  ansible.builtin.command:
    cmd: rbenv root
  register: rbenv_root

- name: install ruby-build rbenv plugin
  ansible.builtin.git:
    repo: https://github.com/rbenv/ruby-build.git
    dest: "{{ rbenv_root.stdout }}/plugins/ruby-build"

- name: "install ruby {{ ruby_version }}"
  ansible.builtin.command:
    cmd: "rbenv install {{ ruby_version }}"
  args:
    creates: "{{ rbenv_root.stdout }}/versions/{{ ruby_version }}/bin/ruby"
  environment:
    CONFIGURE_OPTS: "--disable-install-doc"
    RBENV_ROOT: "{{ rbenv_root.stdout }}"
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"

- name: install bundler
  community.general.gem:
    name: bundler
    version: "{{ bundler_version }}"
  environment:
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"

- name: install app gems
  community.general.bundler:
    state: present
    executable: "{{ rbenv_root.stdout }}/shims/bundle"

- name: remove conflicting yarn bin
  ansible.builtin.apt:
    package: cmdtest
    state: absent

- name: add yarn source key
  block:
    - name: yarn | apt key
      ansible.builtin.get_url:
        url: https://dl.yarnpkg.com/debian/pubkey.gpg
        dest: /etc/apt/trusted.gpg.d/yarn.asc

    - name: yarn | apt source
      ansible.builtin.apt_repository:
        repo: "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/yarn.asc] https://dl.yarnpkg.com/debian/ stable main"
        state: present
        update_cache: true

- name: install yarn
  ansible.builtin.apt:
    package: yarn

- name: install javascript packages
  ansible.builtin.command:
    cmd: yarn install --frozen-lockfile
  environment:
    NODE_OPTIONS: "--openssl-legacy-provider"

- name: prepare database
  ansible.builtin.command:
    cmd: bundle exec rails db:prepare
  environment:
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"

- name: precompile assets
  ansible.builtin.command:
    cmd: bundle exec rails assets:precompile
  environment:
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"
    NODE_OPTIONS: "--openssl-legacy-provider"

Nothing fancy, just standard tasks that any good rubyist should have done at some point in their career.

If it looks overly cluttered, that's because it has to work around some common annoyances: the yarn command on Ubuntu pointing at cmdtest, which has to be explicitly replaced with Yarn, the JavaScript package manager; the outdated ruby-build version for rbenv in the apt repositories; and so on. In any case, these intricate details don't concern us right now, so let's move on.

Now that we're ready to run the application, we need to instruct supervisord to help us do it.

# file: ansidock/roles/supervisor/tasks/reconfigure.yml
---
- name: generate supervisor conf
  ansible.builtin.template:
    src: program.conf.j2
    dest: "/etc/supervisor/conf.d/{{ filename }}"
  vars:
    command: "{{ item.value }}"
    program: "{{ item.key }}"
    filename: "{{ item.key }}.conf"
    workdir: "{{ container_workdir }}"
  with_dict: "{{ programs }}"

- name: register programs with supervisord
  community.general.supervisorctl:
    name: '{{ item.key }}'
    config: /etc/supervisor/supervisord.conf
    state: present
  with_dict: "{{ programs }}"

The task takes a map of program names to run commands, generates a supervisord config for each from the template below, and copies it into the container.

; file: ansidock/roles/supervisor/templates/program.conf.j2
[program:{{ program }}]
command={{ command }}
directory={{ workdir }}
startretries=10
stdout_logfile={{ workdir }}/log/development.log
user=root

Yes, the task also tells supervisord to pick up the new program configuration(s).

Since this role serves both to create the base image and to change the configuration of the supervisord process, let's add a parent task that will switch between these two actions:

# file: ansidock/roles/supervisor/tasks/main.yml
---
- include_tasks: build.yml
  when: aim == "build"
- include_tasks: reconfigure.yml
  when: aim == "configure"

As you've probably noticed, we haven't paid much attention to Sidekiq so far. That's because it runs the same Rails application, just as a different process, so everything we did for the main application applies to it as well. It only gets singled out now, in the finished application playbook:

# file: ansidock/application.yml
---
- name: prepare application container image
  hosts: dockerhost
  vars:
    aim: build
    image: "{{ app_image }}"
    image_tag: "{{ app_image_version }}"
    container_name: "{{ app_container_name }}"
    container_env: "{{ app_env }}"
    container_ports: "{{ app_ports }}"
    container_host_group: app
    container_workdir: "{{app_dir}}"
    container_volumes:
      - "{{ playbook_dir }}/{{ app_src }}:{{ app_dir }}"
    container_deps:
      - python3-apt
      - python3
      - python3-pip
  roles:
    - supervisor
    - container

- name: setup application container
  hosts: app
  vars:
    aim: configure
    container_workdir: "{{ app_dir }}"
    ruby_version: "{{ app_ruby_version }}"
    bundler_version: "{{ app_bundler_version }}"
    programs:
      app: "/root/.rbenv/shims/bundle exec puma -C config/puma.rb"
      worker: "/root/.rbenv/shims/bundle exec sidekiq"
  roles:
    - ruby
    - supervisor

And our work is finished.

Let's collect everything into one tidy playbook:

# file: ansidock/site.yml
---
- ansible.builtin.import_playbook: network.yml
- ansible.builtin.import_playbook: dependencies.yml
- ansible.builtin.import_playbook: application.yml

and accompanying vars file

# file: ansidock/group_vars/dockerhost.yml
---
postgres_image: postgres
postgres_version: 15-alpine
postgres_container_name: ansidock_db
postgres_ports:
  - 8765:5432
postgres_env:
  POSTGRES_PASSWORD: password
  POSTGRES_USER: postgres
  POSTGRES_DB: ansidock

opensearch_image: opensearchproject/opensearch
opensearch_version: latest
opensearch_container_name: ansidock_search
opensearch_env:
  discovery.type: single-node
  plugins.security.disabled: "true"

redis_image: redis
redis_version: alpine
redis_container_name: ansidock_redis

app_image: rails_supervisor
app_image_version: 2
app_container_name: ansidock_app
app_src: docker-rails
app_ports:
  - 7000:3000
app_env:
  DB_HOST: "{{ postgres_container_name }}"
  DB_USER: "{{ postgres_env.POSTGRES_USER }}"
  DB_PASSWORD: "{{ postgres_env.POSTGRES_PASSWORD }}"
  OPENSEARCH_HOST: "{{ opensearch_container_name }}"
  REDIS_SIDEKIQ_URL: "redis://{{ redis_container_name }}:6379/0"
  REDIS_CABLE_URL: "redis://{{ redis_container_name }}:6379/1"
  REDIS_CACHE_URL: "redis://{{ redis_container_name }}:6379/2"
  SECRET_KEY_BASE: some-super-secret-from-ansible-vault
  RAILS_MASTER_KEY: another-super-secret-from-ansible-vault
  APP_ADMIN_EMAIL: admin@example.org
  APP_ADMIN_PASSWORD: secret
  APP_EMAIL: reply@example.org
  PLAUSIBLE_SCRIPT: https://plausible.example.com/js/script.js

Let's also take a test drive of our work

ansible-playbook -i dev site.yml

If everything went well, docker container ls should show our four containers running normally, and visiting localhost:7000 should greet us with the sample application running in all its glory.
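A couple of command-line checks back this up (the container name and port come from the vars file above):

curl -sI http://localhost:7000 | head -n 1
# expect an HTTP status line from puma
docker exec ansidock_app supervisorctl -c /etc/supervisor/supervisord.conf status
# expect the app and worker programs in RUNNING state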

So we did it.

Conclusion

This exercise helped answer the questions:

  • Can I replace Vagrant + VirtualBox with Docker?

  • If yes, how can I make it easier?

But this is not the final stop. Everything we need is now in containers, which makes them a suitable target for many modern tools.

To begin with, we can take a snapshot of the application once it is set up. By creating an image from the running container, we get a working base that lets us skip the whole Ansible provisioning process we just went through.
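For example, something as simple as docker commit gives us that reusable snapshot (the image name and tag here are purely illustrative):

# snapshot the fully configured application container as a new image
docker commit ansidock_app ansidock-rails:snapshot
# new containers can now start from ansidock-rails:snapshot without re-running Ansible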

And armed with these new capabilities, we will move on to the second stage: docker-compose.

