Consistent snapshot groups and a RESTful API for AERODISK storage

Hey Habr.

Let’s talk about automation in the world of storage systems. At AERODISK we regularly survey our own customers, and others, about the features they want in our products. A single customer’s wish may not be law for us yet, but a wish shared by several dozen customers always makes it onto the roadmap. Two such features arrived in the new A-CORE 5.1.0 software for VOSTOK-5 and ENGINE-5 storage systems: storage management via a REST API, and consistency groups for taking data snapshots of a group of interdependent volumes. In this article we use a clear (albeit artificial) example to show why these features are needed and how to use them on AERODISK storage systems. You can also hear about them at the “Near IT” webinar, which will be held on June 6, 2023 at 15:00; register using the link.

So, some theory

The REST API is a set of web services that lets developers interact with the storage system through HTTP requests. It gives automated systems a standard way to perform the usual storage management tasks: creating and deleting pools and volumes, mapping volumes, orchestrating data snapshots, managing replication, QoS, file systems, and so on. A REST API makes it possible to unify the management of storage from different vendors and to integrate the required functions into external systems and applications: virtualization platforms, backup and recovery systems, self-service portals in cloud infrastructures, etc.

Consistency groups

Modern IT landscapes consist of many information systems tightly interconnected by data flows. In retail, for example, a purchase at a store checkout is reflected in many IT systems: CRM, warehouse, financial reporting and planning, and so on. Because these systems depend on each other at every moment, ensuring data integrity when restoring from backups becomes harder. Imagine that the systems were backed up at different times and then restored after a failure: a product is sold in one system but not in another, balance sheets do not reconcile, unaccounted goods sit in warehouses, logistics break down, and other unpleasant things happen.

If we are dealing with a modern storage system, consistency groups for volumes (LUNs) come to the rescue. By combining volumes into a logical group, the storage system keeps them consistent with each other during local or remote replication operations. When a data snapshot is created, the snapshots of all volumes in the group are taken at the same point in time, so the systems stay synchronized.
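The effect can be illustrated with a toy model. The sketch below is not AERODISK code: the function name, the two in-memory "volumes" and the step numbers are all made up for illustration. It simulates an application that writes every record first to volume A and then to volume B, and compares snapshots taken at different moments with snapshots taken at the same moment.

```python
def run_workload(snapshot_plan, total_writes=10):
    """Simulate an app that writes each record to volume A, then to volume B.

    snapshot_plan maps a step number to the list of volumes to snapshot
    right after that step. Returns {volume_name: snapshot_contents}.
    """
    volumes = {"A": [], "B": []}
    snapshots = {}
    step = 0
    for record_id in range(total_writes):
        for name in ("A", "B"):
            volumes[name].append(record_id)
            step += 1
            for target in snapshot_plan.get(step, []):
                snapshots[target] = list(volumes[target])
    return snapshots


# Independent snapshots: A is captured at step 7, B only at step 14.
independent = run_workload({7: ["A"], 14: ["B"]})
# Group snapshot: both volumes are captured at the same step.
grouped = run_workload({14: ["A", "B"]})

print(len(independent["A"]), len(independent["B"]))  # 4 7
print(len(grouped["A"]), len(grouped["B"]))          # 7 7
```

With independent snapshots, volume B contains records that volume A's snapshot has never seen; with the group snapshot both copies describe the same moment.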

Setting up

To demonstrate how the functionality works, we organized the following test bench:

To emulate interdependent IT systems, we will create a PostgreSQL database instance and write timestamped test data into it. Based on the data generated in the table, we will create identical text files on two different file systems. For the experiment to make sense, the database and the file systems are placed on three different storage volumes (for example, 200 GB each).

So, let’s begin

Let’s create an RDG RAID 6/60 (4+2) group on the AERODISK disk array:

Let’s create three 200 GB logical volumes in the R00 group:

Let’s map 200GB volumes to the Ubuntu server via FC:

Scan for new LUNs on the Ubuntu server:

root@tester:~# echo 1 > /sys/class/fc_host/host11/issue_lip
root@tester:~# echo 1 > /sys/class/fc_host/host0/issue_lip
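On a server with many FC adapters the same rescan can be scripted. This is a minimal sketch, assuming the standard Linux sysfs layout used in the commands above; the function name and the `fc_host_root` parameter are made up for illustration, and the script has to run as root.

```python
from pathlib import Path


def issue_lip_all(fc_host_root="/sys/class/fc_host"):
    """Write "1" to issue_lip for every FC host adapter; return the hosts touched."""
    rescanned = []
    for host_dir in sorted(Path(fc_host_root).glob("host*")):
        lip_file = host_dir / "issue_lip"
        if lip_file.exists():
            lip_file.write_text("1")
            rescanned.append(host_dir.name)
    return rescanned
```

Calling `issue_lip_all()` is then equivalent to repeating the `echo 1 > .../issue_lip` command for each adapter found.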

Let’s check for three 200GB block devices:

Let’s mount the created file systems at the /vol6, /vol7 and /vol8 mount points.

Let’s create a tablespace and a PostgreSQL DB6 database instance. To generate the test data set, we use a Python script that inserts test records with IDs 100 to 20000, together with current_timestamp, into the db_test table. After inserting the data, it creates text files named after the ID, with minimal text, at the /vol7 and /vol8 mount points.
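The original script is not reproduced in this article, so here is only a rough sketch of what such a load generator might look like. To keep it self-contained it uses the standard-library `sqlite3` module as a stand-in for PostgreSQL; the function name, the table columns, the file naming and the choice to write files right after each insert are assumptions.

```python
import sqlite3
from pathlib import Path


def generate_test_data(db_path, file_dirs, first_id=100, last_id=20000):
    """Insert one timestamped row per ID, then create a same-named file in each directory."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS db_test "
        "(id INTEGER PRIMARY KEY, created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    for record_id in range(first_id, last_id + 1):
        conn.execute("INSERT INTO db_test (id) VALUES (?)", (record_id,))
        conn.commit()  # commit per row so the table and the files advance together
        for directory in file_dirs:
            # Minimal file content; the file name carries the record ID.
            Path(directory, f"{record_id}.txt").write_text(f"record {record_id}\n")
    conn.close()
```

In the test bench described here the call would be something like `generate_test_data("/vol6/test.db", ["/vol7", "/vol8"])`, where the database path is hypothetical.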

The script runs for several minutes and emulates the work of interconnected systems. Real-world examples include design systems that store drawings on a file system and keep the catalog in a DBMS, or media content management systems that likewise combine unstructured data on file systems with structured data in a DBMS.

So, while the load emulator is running, let’s take a regular hardware snapshot of the /vol7 and /vol8 file system volumes:

Now let’s try to restore the hardware snapshots of two volumes and map the restored volumes:

You can see that, because the snapshots were not taken at the same moment, the restored file systems contain a different number of files:

This problem is relevant both when creating snapshots manually and when using scripts. To solve this problem, we implemented the “Consistency Groups” functionality, which provides data consistency for all volumes combined into a group.

Let’s do it in a new way

Let’s create a SAVEMYDATA linked clones consistency group and add volumes VOL_1 and VOL_3:

While the test data is being generated, we will create linked clones for the SAVEMYDATA group:

After creating clones, delete file data from /vol7 and /vol8 and restore data from snapshots created in the consistency group:

Let’s check the recovered data: as you can see, the /vol7 and /vol8 file systems are in sync:
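Such a check is easy to repeat with a few lines of Python. This is a minimal sketch that compares only file names, not file contents; the function name and the shape of the result are made up for illustration.

```python
from pathlib import Path


def compare_mounts(first, second):
    """Return the file names that are present under only one of the two directories."""
    first_names = {p.name for p in Path(first).iterdir() if p.is_file()}
    second_names = {p.name for p in Path(second).iterdir() if p.is_file()}
    return {
        "only_in_first": sorted(first_names - second_names),
        "only_in_second": sorted(second_names - first_names),
    }
```

For the test bench, `compare_mounts("/vol7", "/vol8")` returning two empty lists means both file systems hold the same set of files.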

And now about automation

As you can see, it took a lot of steps to demonstrate just a small part of the storage system’s functionality. Storage administrators deal with hundreds of hosts and several hundred volumes every day, and in such an environment configuring everything from the graphical interface is laborious. The administrator’s traditional helper here is the CLI or a full-fledged REST API, which automates routine operations and fully integrates storage into the application ecosystem.

In A-CORE software version 5.1.0, we introduced a full-fledged REST API for our storage systems, which helps you set up and use storage without a lot of GUI operations. The typical REST API use cases we see are:

  • allocation of capacity, presentation to hosts in large IT landscapes;

  • creation of standard configurations for rapid deployment;

  • integration into self-service portals;

  • integration with hardware snapshots of data for backup;

  • integration with configuration monitoring systems.

All available REST API methods are documented in the storage interface at https://controller_ip/api/docs. There you can also find example requests, and compose and execute requests in real time.

Automation example

As an example, let’s repeat the snapshot work we did at the beginning, this time with REST requests. We will write the scripts in our favorite language, Python.

So, we log in to the storage system and get a token:

import requests

URL_auth = "https://my_ip/api/auth/token"
headers = {
    "accept": "application/json",
    "Content-Type": "application/x-www-form-urlencoded",
}
# Credentials are sent as a form-encoded body; the password is masked here
resp = requests.post(URL_auth, headers=headers, data="grant_type=&username=admin&password=*****&scope=&client_id=&client_secret=")
# The token is returned in the access_token field of the JSON response
tk = resp.json()["access_token"]
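For repeated use, the login step can be wrapped in a small helper. This is a sketch under assumptions: the endpoint and form fields are the ones shown above, while the function names, the `base_url` and `verify` parameters, and the idea that a failed login returns an HTTP error status are mine. The helper takes any object with a requests-style `post()` method, such as `requests.Session()`.

```python
def get_token(session, base_url, username, password, verify=True):
    """Request an access token and return it; raise on an HTTP error status.

    session is any object with a requests-style post() method,
    for example requests.Session().
    """
    form = {
        "grant_type": "", "username": username, "password": password,
        "scope": "", "client_id": "", "client_secret": "",
    }
    resp = session.post(
        f"{base_url}/api/auth/token",
        headers={"accept": "application/json"},
        data=form,  # requests form-encodes a dict and sets Content-Type itself
        verify=verify,  # pass a CA bundle path if the controller uses a private CA
    )
    resp.raise_for_status()
    return resp.json()["access_token"]


def bearer_headers(token):
    """Build the headers used by the JSON requests that follow the login."""
    return {
        "accept": "application/json",
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

A typical call would be `tk = get_token(requests.Session(), "https://my_ip", "admin", password)`, with the password read from an environment variable or a secrets store rather than written into the script.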

Let’s take a snapshot of volume VOL_1 in group R00:

URL_snaplun = "https://my_ip/api/rdg/snapshot"
# The token obtained at login is passed as a Bearer token
auth_str = "Bearer " + str(tk)
headers = {
    "accept": "application/json",
    "Authorization": auth_str,
    "Content-Type": "application/json",
}
json_data = {
    "pool": "R00",
    "lun": "VOL_1",
}
response = requests.post(URL_snaplun, headers=headers, json=json_data)

Let’s take a snapshot of volume VOL_3 in group R00:

json_data = {
    "pool": "R00",
    "lun": "VOL_3",
}
response = requests.post(URL_snaplun, headers=headers, json=json_data)
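With more than a couple of volumes, the two requests above turn naturally into a loop. The sketch below reuses the endpoint and payload fields from the example; the function name and the assumption that the API signals failure with an HTTP error status are mine. Note that this still sends one request per volume, so the snapshots are taken one after another; point-in-time consistency across volumes is what the consistency group described earlier is for.

```python
def snapshot_volumes(session, base_url, headers, pool, luns):
    """POST one snapshot request per volume; return the volumes in request order."""
    done = []
    for lun in luns:
        resp = session.post(
            f"{base_url}/api/rdg/snapshot",
            headers=headers,
            json={"pool": pool, "lun": lun},
        )
        resp.raise_for_status()  # stop at the first failed request
        done.append(lun)
    return done
```

Here `session` can be `requests.Session()` and `headers` the Bearer headers built after login, for example `snapshot_volumes(session, "https://my_ip", headers, "R00", ["VOL_1", "VOL_3"])`.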

After restoring volumes from data snapshots, we also need to add mappings, which we will again do with REST requests. Authorization and obtaining a token are identical to the previous step, and a mapping can be created with the following code:

url = "https://my_ip/api/fc/mapping"
auth_str = "Bearer " + str(tk)
headers = {
    "accept": "application/json",
    "Authorization": auth_str,
    "Content-Type": "application/json",
}
# Map VOL_1 to the host2_79 group with LUN ID 1
json_data = {
    "group": "host2_79",
    "pool": "R00",
    "lun": "VOL_1",
    "lun_id": "1",
}
response = requests.post(url, headers=headers, json=json_data)
# Map VOL_3 to the host2_79 group with LUN ID 3
json_data = {
    "group": "host2_79",
    "pool": "R00",
    "lun": "VOL_3",
    "lun_id": "3",
}
response = requests.post(url, headers=headers, json=json_data)
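When many volumes have to be mapped, the payloads can be generated instead of typed by hand. This is only a sketch: the payload fields are the ones used above, while the function name and the duplicate check are assumptions (LUN IDs normally have to be unique per host group, but the storage API may well enforce that itself).

```python
def build_mappings(group, pool, lun_ids):
    """Build one mapping payload per volume from a {lun_name: lun_id} dict."""
    ids = list(lun_ids.values())
    if len(ids) != len(set(ids)):
        raise ValueError("each volume needs a distinct lun_id within a host group")
    return [
        # lun_id is sent as a string, as in the requests shown above
        {"group": group, "pool": pool, "lun": lun, "lun_id": str(lun_id)}
        for lun, lun_id in lun_ids.items()
    ]
```

Each element of `build_mappings("host2_79", "R00", {"VOL_1": 1, "VOL_3": 3})` can then be sent to the mapping endpoint with `requests.post(url, headers=headers, json=payload)`.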

The management of data snapshots can be further integrated with the RMS system or with applications. In this simple way, creating snapshots and restoring from them can be reduced to a single command. Using REST requests for administration saves a lot of mouse clicks in routine operations and, I hope, will make our users’ lives much easier.

To learn more about the REST API and storage snapshots, sign up for our “About IT” webinar on June 6 at the link. There we will demonstrate these technologies in action, talk about how new AERODISK features are developed, and, of course, answer any questions, on topic or off 😊
