Turning the complex into the simple

When you use Ansible as an automation tool, you often need to process and filter structured data. Typically, this is a set of facts gathered from managed servers, or a response from an external API that returns standard JSON. In such cases, many inexperienced engineers fall back on familiar console commands and end up creating what specialists call “bashsible”. It reminds me of a famous meme.

In this article, we’ll show you how easy it is to process data directly in Ansible using its own features. We are talking about Jinja2 filters, which provide a powerful yet intuitive tool for data transformation. These filters let you efficiently sort, select, and transform data without resorting to complex external commands and scripts.

In conventional programming languages, data processing is usually done with loops (for, while, for_each, etc.) and various functions that convert between object types (arrays, collections, closures, etc.). Ansible uses a simplified data model: there are essentially only two kinds of data object, a list and a dictionary. If you are not sure what these are and how they differ, I recommend first reading this short article.
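As a quick illustration of these two object types, here is a minimal sketch in YAML. The variable names packages and service are made up for this example and do not come from any real role or playbook.

```yaml
# A list: an ordered sequence of values, addressed by position
packages:
  - nginx
  - git
  - htop

# A dictionary: a set of key-value pairs, addressed by key
service:
  name: nginx
  port: 80
  enabled: true
```

In a template, packages[0] refers to the first element of the list, while service.port or service['port'] refers to the value stored under the port key.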

Let’s start with basic examples of using filters to work with lists and dictionaries. This will allow you to see how powerful and flexible these tools can be in the right hands.

In practice, you often have to deal with facts or with the results of certain modules, which are structures in JSON/YAML form. As an example, let’s take the ansible_mounts fact and try to work with it.

Try running the following playbook, which displays the value of ansible_mounts:

---
- name: Test ansible_mounts
  hosts: localhost
  gather_facts: true
  connection: local
  tasks:
    - name: Show ansible_mounts
      debug:
        var: ansible_mounts

The output will be something like this:

TASK [Show ansible_mounts] *********************************************
ok: [localhost] => 
  ansible_mounts:
  - block_available: 9906494
    block_size: 4096
    block_total: 16305043
    block_used: 6398549
    device: /dev/sda3
    fstype: ext4
    inode_available: 3666700
    inode_total: 4161536
    inode_used: 494836
    mount: /
    options: rw,relatime,errors=remount-ro
    size_available: 40576999424
    size_total: 66785456128
    uuid: e24606ee-2b07-4de0-a3c3-63c605f627ff
  - block_available: 0
    block_size: 131072
    block_total: 1
    block_used: 1
    device: /dev/loop0
    fstype: squashfs
    inode_available: 0
    inode_total: 29
    inode_used: 29
    mount: /snap/bare/5
    options: ro,nodev,relatime,errors=continue,threads=single
    size_available: 0
    size_total: 131072
    uuid: N/A
  - block_available: 0
    block_size: 131072
    block_total: 507
    block_used: 507
    device: /dev/loop1
    fstype: squashfs
    inode_available: 0
    inode_total: 11906
    inode_used: 11906
    mount: /snap/core20/1822
    options: ro,nodev,relatime,errors=continue,threads=single
    size_available: 0
    size_total: 66453504
    uuid: N/A
...

Let’s keep from this long list only those elements whose device value begins with /dev/sda. This can be done using the selectattr filter and the match test:

---
- name: Show ansible_mounts filtered
  hosts: localhost
  gather_facts: true
  connection: local
  tasks:
    - name: Show ansible_mounts
      debug:
        var: ansible_mounts | selectattr('device', 'match', '/dev/sda')

We get something like this:

TASK [Show ansible_mounts] *********************************
ok: [localhost] => 
  ansible_mounts | selectattr('device', 'match', '/dev/sda'):
  - block_available: 9906486
    block_size: 4096
    block_total: 16305043
    block_used: 6398557
    device: /dev/sda3
    fstype: ext4
    inode_available: 3666696
    inode_total: 4161536
    inode_used: 494840
    mount: /
    options: rw,relatime,errors=remount-ro
    size_available: 40576966656
    size_total: 66785456128
    uuid: e24606ee-2b07-4de0-a3c3-63c605f627ff
  - block_available: 9906486
    block_size: 4096
    block_total: 16305043
    block_used: 6398557
    device: /dev/sda3
    fstype: ext4
    inode_available: 3666696
    inode_total: 4161536
    inode_used: 494840
    mount: /var/snap/firefox/common/host-hunspell
    options: ro,noexec,noatime,errors=remount-ro,bind
    size_available: 40576966656
    size_total: 66785456128
    uuid: e24606ee-2b07-4de0-a3c3-63c605f627ff
  - block_available: 129508
    block_size: 4096
    block_total: 131063
    block_used: 1555
    device: /dev/sda2
    fstype: vfat
    inode_available: 0
    inode_total: 0
    inode_used: 0
    mount: /boot/efi
    options: rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro
    size_available: 530464768
    size_total: 536834048
    uuid: 7041-A883
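One detail of the match test is worth keeping in mind: it anchors the pattern at the beginning of the string, whereas the related search test looks for the pattern anywhere in the string. The sketch below compares the two on a small hand-written list. The sample_mounts variable and its values are invented for this illustration and are not real facts.

```yaml
---
- name: Compare the match and search tests
  hosts: localhost
  gather_facts: false
  vars:
    # Hand-written sample data (an assumption for this example)
    sample_mounts:
      - device: /dev/sda3
        mount: /
      - device: /dev/mapper/vg0-home
        mount: /home
  tasks:
    - name: match anchors the pattern at the start of the string
      ansible.builtin.debug:
        msg: "{{ sample_mounts | selectattr('device', 'match', '/dev/sd') | list }}"

    - name: search finds the pattern anywhere in the string
      ansible.builtin.debug:
        msg: "{{ sample_mounts | selectattr('device', 'search', 'home') | list }}"
```

The first task should keep only the /dev/sda3 entry and the second only the /dev/mapper/vg0-home entry. Replacing search with match in the second task would select nothing, because none of the device values begins with home.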

Now suppose that, out of all the keys in each dictionary, we only need the value of the mount key. How do we select the values of one specific key from the resulting list? This is where the map filter comes in. It is a fairly powerful filter: it takes the name of another filter, together with that filter’s arguments, and applies it to each element of the input list. The simplest case, getting the value of a specific key from each element of a list of dictionaries, needs no inner filter at all: just pass the key name in the attribute argument. In our case that is map(attribute="mount"). As a result, we get the following code:

---
- name: Show ansible_mounts filtered
  hosts: localhost
  gather_facts: true
  connection: local
  tasks:
    - name: Show mounts
      debug:
        var: >-
          ansible_mounts
            | selectattr('device', 'match', '/dev/sda')
            | map(attribute="mount")

The output is the following:

TASK [Show mounts] ***********************************************
ok: [localhost] => 
  ? |-
    ansible_mounts
      | selectattr('device', 'match', '/dev/sda')
      | map(attribute="mount")
  : - /
    - /var/snap/firefox/common/host-hunspell
    - /boot/efi
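Besides the attribute argument, map also accepts the name of a filter as its first argument, followed by any arguments for that filter. The following sketch applies this to two hand-written lists, which are assumptions made for the example rather than real facts.

```yaml
---
- name: Use map with a filter name instead of an attribute
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Apply the upper filter to every element of the list
      ansible.builtin.debug:
        msg: "{{ ['ext4', 'vfat', 'squashfs'] | map('upper') | list }}"

    - name: Pass extra arguments through map to the inner filter
      ansible.builtin.debug:
        msg: "{{ ['/dev/sda2', '/dev/sda3'] | map('regex_replace', '^/dev/', '') | list }}"
```

The first task should print the file system names in upper case, and the second should print the device names with the leading /dev/ removed. In both cases map does the looping: the inner filter is called once for each element.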

As you can see, obtaining the data set we need is quite simple, even without programming and loops. Let’s make it more difficult. Say we need to output not only the values of the mount key, but also the values of the size_available and size_total keys. The filters we have used so far cannot do this directly: they can filter lists and lists of dictionaries, but they cannot pick individual keys out of a dictionary. The solution is simple: turn the dictionary into a list and filter it with the tools we already have. Fortunately, Ansible has suitable filters for this task. They turn dictionaries into lists and lists back into dictionaries, and they are called dict2items and items2dict, respectively.

For example, we have the following dictionary:

server_config:
  apache:
    version: "2.4"
    modules: ["mod_ssl", "mod_rewrite"]
  php:
    version: "7.4"
    extensions: ["curl", "json", "pdo"]
  mysql:
    version: "5.7"
    databases: ["db1", "db2", "db3"]
  system:
    os: "Ubuntu"
    os_version: "20.04"

Suppose we want to keep only those dictionary elements that have a version key. We can do it this way. First, we turn the dictionary into a list using the dict2items filter:

server_config | dict2items

We get:

ok: [localhost] =>
  server_config | dict2items:
  - key: apache
    value:
      modules:
      - mod_ssl
      - mod_rewrite
      version: '2.4'
  - key: php
    value:
      extensions:
      - curl
      - json
      - pdo
      version: '7.4'
  - key: mysql
    value:
      databases:
      - db1
      - db2
      - db3
      version: '5.7'
  - key: system
    value:
      os: Ubuntu
      os_version: '20.04'

Now let’s keep only those list elements whose value contains a nested version key:

server_config | dict2items | selectattr('value.version', 'defined')

We get:

ok: [localhost] =>
  server_config | dict2items | selectattr('value.version', 'defined'):
  - key: apache
    value:
      modules:
      - mod_ssl
      - mod_rewrite
      version: '2.4'
  - key: php
    value:
      extensions:
      - curl
      - json
      - pdo
      version: '7.4'
  - key: mysql
    value:
      databases:
      - db1
      - db2
      - db3
      version: '5.7'

Now let’s turn the list back into a dictionary of the original form. To do this, add the items2dict filter to the end of the pipeline. We get:

ok: [localhost] =>
  server_config | dict2items | selectattr('value.version', 'defined') | items2dict:
    apache:
      modules:
      - mod_ssl
      - mod_rewrite
      version: '2.4'
    mysql:
      databases:
      - db1
      - db2
      - db3
      version: '5.7'
    php:
      extensions:
      - curl
      - json
      - pdo
      version: '7.4'
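To try the whole chain yourself, the expressions above can be wrapped in a small playbook along these lines. This is only a sketch: the play and task names are arbitrary, and server_config is trimmed to two entries to keep the example short.

```yaml
---
- name: Filter a dictionary by a nested key
  hosts: localhost
  gather_facts: false
  vars:
    # Shortened copy of the dictionary from the example above
    server_config:
      apache:
        version: "2.4"
        modules: ["mod_ssl", "mod_rewrite"]
      system:
        os: "Ubuntu"
        os_version: "20.04"
  tasks:
    - name: Step 1, turn the dictionary into a list of key/value pairs
      ansible.builtin.debug:
        var: server_config | dict2items

    - name: Step 2, keep only the pairs whose value defines a version
      ansible.builtin.debug:
        var: server_config | dict2items | selectattr('value.version', 'defined')

    - name: Step 3, turn the filtered list back into a dictionary
      ansible.builtin.debug:
        var: server_config | dict2items | selectattr('value.version', 'defined') | items2dict
```

With this shortened dictionary, the last task should leave only the apache entry, since system has no version key.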

Now let’s return to our original task: from the list of dictionaries ansible_mounts we want to extract only some keys. We have already filtered the list itself by the condition we need; now we need to keep only certain keys in each element. So we have to apply the dict2items filter to each element of the list of dictionaries, filter each resulting list by a list of keys, and then turn each of these child lists back into a dictionary. Difficult? Not really. We have already seen how to do this with a single dictionary in the example above. Now we need to do the same with every dictionary in the list. The map filter mentioned above will help us here, only in a more advanced form. Here is the final result right away. Code:

- name: Show mounts data
  debug:
    var: >-
      ansible_mounts
        | selectattr('device', 'match', '/dev/sda')
        | map('dict2items')
        | map('selectattr', 'key', 'in', ['mount', 'size_available', 'size_total'])
        | map('items2dict')

And the result:

ok: [localhost] => 
  ? |-
    ansible_mounts
      | selectattr('device', 'match', '/dev/sda')
      | map('dict2items')
      | map('selectattr', 'key', 'in', ['mount', 'size_available', 'size_total'])
      | map('items2dict')
  : - mount: /
      size_available: 40576851968
      size_total: 66785456128
    - mount: /var/snap/firefox/common/host-hunspell
      size_available: 40576851968
      size_total: 66785456128
    - mount: /boot/efi
      size_available: 530464768
      size_total: 536834048

Here is what the code in this example consists of:

Selecting disk partitions:
- ansible_mounts | selectattr('device', 'match', '/dev/sda')
- The selectattr filter selects those items of the ansible_mounts list whose device attribute matches the regular expression /dev/sda. This keeps the mount information for one specific device only.
Converting dictionaries to lists of elements:
- | map('dict2items')
- The dict2items filter is applied to each list element (each dictionary), converting it into a list of key-value pairs.
Selecting specific keys:
- | map('selectattr', 'key', 'in', ['mount', 'size_available', 'size_total'])
- Next, for each list of key-value pairs, map is used together with selectattr to keep only those pairs whose key is one of 'mount', 'size_available' or 'size_total'.
Converting back to dictionaries:
- | map('items2dict')
- Finally, after the desired keys have been selected, each list of key-value pairs is converted back into a dictionary using map and the items2dict filter.
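The same pipeline can also feed later tasks rather than just a debug output. The sketch below stores the reduced list in a variable and loops over it. The variable name sda_mounts and the message format are assumptions made for this example, and the trailing list filter is added only as a precaution.

```yaml
---
- name: Reuse the filtered mount data
  hosts: localhost
  gather_facts: true
  connection: local
  tasks:
    - name: Store the reduced list of mounts in a variable
      ansible.builtin.set_fact:
        sda_mounts: >-
          {{
            ansible_mounts
              | selectattr('device', 'match', '/dev/sda')
              | map('dict2items')
              | map('selectattr', 'key', 'in', ['mount', 'size_available', 'size_total'])
              | map('items2dict')
              | list
          }}

    - name: Print one line per mount point
      ansible.builtin.debug:
        msg: "{{ item.mount }}: {{ item.size_available }} of {{ item.size_total }} bytes available"
      loop: "{{ sda_mounts }}"
```

Keeping the transformation in one set_fact task means the following tasks work with a small, predictable structure instead of the full fact.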

Conclusion

We’ve seen some clear examples of using Ansible filters, and hopefully they no longer seem so complicated. After all, many things seem difficult until you try them. It’s the same with Ansible filters: try them, experiment, and see how they can make your life easier. Also, write in the comments if you would like to see a continuation of this article with other interesting filter examples. If the topic is of interest, in the following articles I will try to cover writing your own custom filters.