Secure Continuous Deployment of a database based on DB-First principles

The practice of Continuous Deployment implies complete automation of the delivery of changes to production when they are published in the main branch of the repository. It results in a high frequency of releases, which means they need to be done seamlessly or they will reduce the availability of the service.

Seamless releases are a non-trivial task and require careful adherence to a certain approach. Fortunately, such an approach has already been proven on the familiar REST API, and for context we will look at it in more detail. Then we will look at applying this approach to database schema releases, the accompanying tools, and how this affects processes in the organization.

The aforementioned requirement of caution, combined with the human factor, inevitably leads to failures, so this factor must be minimized. As a rule, it is minimized in only two ways: automating processes and reorganizing the system so that some processes are eliminated altogether. In this article we will touch on both methods, with an emphasis on the first.

The main ideas in this article are based on the principle of the database's primacy over applications. If you are not familiar with this topic, expect a serious paradigm shift.

The examples assume PostgreSQL as the database, but the main ideas of the article are universal.

Seamless releases using a REST API as an example

Let's imagine a system consisting of three services. Two of them use the third. This means that backwards-incompatible changes to the API of the third service can lead to breakdowns in the other two.

This system is already in production, and we need to make a release that requires changing the API of the third service. How do we do this without breaking anything and without having to simultaneously roll out updates for all three services with downtime?

An approach

The established approach for such situations is to release the service with support for both the old and new APIs, followed by a gradual transition of consumers to the new API.

That is, in addition to the /v1 resource with the old API, we get a /v2 resource with the new one. The old API is kept working as an adapter over the updated service's functionality. Making such a release without downtime is not a problem.

Obviously, maintaining different versions in this way is cumbersome and is a constraining factor for the development of the service. Therefore, after such a release, the stage of transferring all service consumers to the new API begins. Releasing consumers without downtime is also not a problem.

Eventually we reach the stage where all consumers of /v1 have disappeared, and we can remove this burden from our service by deleting all the code that provides this version of the API. We make another deployment of the service that supports only /v2, again with no downtime.

Why did I tell you this? Because nothing in this tactic binds us to REST specifically. It is universal and applicable to anything that has an interface for interaction. Next, we will look at using it to release database schema updates.

Applying the approach to the database

Let's imagine that in a database already deployed in production, we need to rename one of the tables. How do we carry out the release so that running applications do not break, with no downtime for either the database or the applications?

To do this, you will again need to split the transformation into several releases. For example, like this:

  1. Add support for a table with a new name in the schema by creating a projection onto the old table:

begin;
create view new_table as select * from old_table;
commit;
  2. Convert all applications to use only the new table name.

  3. Delete the projection and rename the old table to the new one:

begin;
drop view new_table;
alter table old_table rename to new_table;
commit;
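A note on the first step: in PostgreSQL, a simple single-table view like this one is automatically updatable, so not only reads but also writes keep working through the new name while both names coexist. A minimal sanity check, using a hypothetical old_table with made-up columns purely for illustration:

begin;
-- hypothetical table and columns, for illustration only
create table old_table ("id" int4 not null primary key, "name" text not null);
create view new_table as select * from old_table;

-- both reads and writes work through the projection
insert into new_table ("id", "name") values (1, 'works through the view');
select * from new_table;
rollback;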

As you can see, we have essentially carried out the same steps as in the REST case. In the first step, we added support for the new functionality while maintaining backward compatibility with the old. Next, we moved all users over to the new functionality only. Finally, we got rid of the old functionality.

We used migrations to transform the database. Updating a production database via migrations can be automated by integrating them into a CI/CD pipeline using tools such as Liquibase, Flyway, or Sqitch. This is an established practice and you will find many articles on the topic, so we will not focus on it.

The real difficulty is hidden in the second step. To implement it, you need to go through all the queries in the applications and change the table name. If we have only one application, and all queries live in an isolated layer that is already covered by tests running against the migrated schema, this may go smoothly; in other cases we face difficulties and the risk of going to production with broken database integration.

Database integration testing

So, we have identified the need to thoroughly test the applications' database integration layer whenever backward-incompatible changes are made to the schema.

An ideal set of tests will allow us to ensure that queries:

  • execute without errors from the database

  • operate on the data types expected by the application in parameters and results

  • use indexes effectively

  • produce the expected result contents

  • lead to the expected data changes, if they are mutating

That sounds like a lot of work, so in practice we will probably end up cutting corners. Roughly speaking, this list is sorted by decreasing probability of being implemented.

It is critically important that there are no errors from the database and that the application does not halt because, say, it receives int4 where it expects uuid. Ensuring there are no errors eliminates the most likely causes of failures, from typos to references to non-existent tables and columns in queries.

A missing index is often survivable: most likely the application, although it will run slower, will not stop completely, and you can react quickly if you have monitoring configured.
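If you rely on monitoring for this, PostgreSQL's standard statistics views already give a usable signal. A possible query, not tied to any particular tool, that surfaces tables read mostly via sequential scans:

-- tables with many sequential scans and few index scans are candidates
-- for a missing or unused index
select relname, seq_scan, idx_scan
from pg_stat_user_tables
order by seq_scan desc
limit 10;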

To make sure that queries behave as we expect content-wise, we need elaborate tests simulating many scenarios along the lines of “if such and such a record in the database has such and such a value, then calling this query leads to such and such consequences.” Such tests can certainly be important, but they are painstaking to write and often overlap with functional tests at the application or component API level.

As it turns out, the most important of the listed checks can be fully automated without writing any tests, and these checks run in a fraction of a second.

Automation of checks using the DB-First approach

Let's think about it: what is the minimum information we would need to pass to a colleague if we wanted to ask them to check SQL queries for correctness and compatibility with the database schema? The answer is obvious: you need the queries themselves and a description of the database schema. Since this information is enough for a person, it should be enough for a program automating this task.

Using the migrations, we can recreate a database with the appropriate structure; then, having at hand all the queries that are planned to be run against it, we can check each one for compliance with the schema, obtain metadata about its parameters and results, and carry out analysis using statements like EXPLAIN.
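To make this less abstract, here is a minimal sketch of such a check using nothing but plain PostgreSQL, run against a scratch database recreated from the migrations; the table and query names here are hypothetical placeholders:

-- PREPARE parses, analyzes and plans the query without executing it,
-- so any mismatch with the schema surfaces immediately as an error
prepare check_select_document (int4) as
  select "title" from "documents" where "id" = $1;  -- hypothetical query

-- metadata about the inferred parameter types
select name, parameter_types from pg_prepared_statements;

deallocate check_select_document;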

As a result, you can fully automatically make sure that queries:

  • execute without errors from the database

  • operate on the data types expected by applications in parameters and results

  • use indexes effectively

Great! But now we are dealing with a set of queries that lives completely separately from the application code. The question arises: how do we integrate all this, and how do we structure the work?

Code generation

The information obtained from the migrations and the set of queries is sufficient to generate an SDK that executes these queries and maps their parameters and results to the appropriate types.

And here are just some of the opportunities that this opens up:

  • An SDK can be generated for any programming language that has a PostgreSQL driver. That is, by providing SQL files, you can end up with ready-made integrations for a plethora of languages that may be used in your company: from the ubiquitous Java to the exotic Haskell.

  • Potentially, working with such an SDK should be much simpler than with any general-purpose framework, since by definition it is tailored to the domain of your database and does not have to serve anyone else's, which means the domain is narrower and the complexity lower.

  • For the same reason, it can fundamentally reduce the opportunity for users to shoot themselves in the foot, leaving only domain-appropriate use cases possible.

  • The Object-relational Impedance Mismatch problem eliminates itself, since queries are projected one-to-one into methods or procedures of the target language – after all, what is a query if not a parameterizable procedure that returns results?

We connect the generated SDK to our project as a library and from then on communicate with the database from the application only through it. When we change queries, we change the SQL files and regenerate the SDK. When we add a migration, we also regenerate the SDK. We never edit the SDK by hand.

Thus, we get a clear boundary between the database code and the application code. SQL queries live alongside the schema migrations rather than being scattered throughout the application code. The application relies on the artifacts of a code generator that works from the SQL code.

As a result, we get the following change delivery process.

Change delivery process by role

¹ “Push to master” means a PR/MR procedure with its own review and CI checks, after which, if successful, the changes land in master. It could also be a direct push to this branch. It all depends on the policies of the organization.

² The library repository is Artifactory, Maven, NuGet, Hackage, S3, a Git repository, or whatever is customary for your language and organization.

The diagram above shows a clear delineation of the area of responsibility for the database at the process level. We have also already noted that this division is just as clear at the code level.

Of course, both developer roles can be filled by the same people. However, we now have the opportunity to hand database development over to specialists who understand its intricacies in depth, rather than superficially, as most application developers do. This one step simultaneously speeds up development and reduces the frequency of incidents caused by incorrect use of the database. At the same time, database specialists are not required to know anything about the programming language for which the SDK is generated.

I will also note that the SDK can be generated for several application programming languages at the same time, which makes it possible to optimize the work of several departments at once if they use the same database.

Getting down to specifics

Up to this point, the reasoning in this article has taken place in the hypothetical space of “what if we worked according to such an approach and had the necessary tools.” The thing is, I represent dbfirst.ru, a company where such tools are being developed. Now let's look at a concrete solution to the problem under discussion using them.

Examples of using dbFirst for checks in CI

In this section we will create a project and look at how the dbfirst compiler identifies errors in various situations, and how these checks allow us to build a reliable CI/CD pipeline.

Creating a Project

Let's create a new repository and, in it, a project configuration file at the path ./project.dbfirst-v1.yaml with the following content:

# Namespace for your projects
space: my-space
# Name of this project
name: music-catalogue
# Version of this project
version: 1.0.0

Also, at the path ./migrations/1.sql, let's create the first migration:

create table "artists" (
  "id" int4 not null generated always as identity primary key,
  "name" text not null,
  "genre" text null
);

And a couple of queries:

./queries/insert-artist.sql:

insert into artists (name, genre)
values ($name, $genre)
returning id

./queries/select-artist-by-id.sql:

select name, genre
from artist
where id = $id

Let's run the dbfirst compiler and look at the result:

> dbfirst
relation "artist" does not exist

Context:
  queries/select-artist-by-id

The compiler detected an error in one of the queries. Let's correct the query ./queries/select-artist-by-id.sql:

select name, genre
from artists
where id = $id

Let's run the compiler again and look at the result:

> dbfirst
Ok

Thus, we have seen the first benefit: the compiler helps avoid errors while writing queries.

Code generation

Let's bring the configuration file ./project.dbfirst-v1.yaml to the following state:

space: my-space
name: music-catalogue
version: 1.0.0
# Which SDKs to generate
artifacts:
  - java-jdbc-v1
  - haskell-hasql-v1

And let's run the compiler again:

> dbfirst
Ok

Let's take a look at the resulting artifacts folder and find ready-made SDKs for Java and Haskell in it.

For example, this is what one of the generated methods in the Java SDK will look like:

public InsertArtistResultRow insertArtist(Optional<String> genreParam, String nameParam) throws SQLException {
  insertArtistStatement.setString(1, nameParam);
  if (genreParam.isPresent()) {
    String genreParamPresent = genreParam.get();
    insertArtistStatement.setString(2, genreParamPresent);
  } else {
    insertArtistStatement.setNull(2, Types.VARCHAR);
  }
  insertArtistStatement.execute();

  try (ResultSet resultSet = insertArtistStatement.getGeneratedKeys()) {
    if (resultSet.next()) {
      int idCol = resultSet.getInt(1);
      return new InsertArtistResultRow(idCol);
    } else {
      throw new SQLException("Not a single row produced");
    }
  }
}

And for Haskell:

insertArtist :: Statement Model.InsertArtistParams Model.InsertArtistResultRow
insertArtist =
  Statement sql encoder decoder True
  where
    sql =
      "insert into artists (name, genre)\n\
      \values ($1, $2)\n\
      \returning id"
    encoder =
      (Model.insertArtistParamsName >$< (Encoders.param (Encoders.nonNullable Encoders.text))) <> (Model.insertArtistParamsGenre >$< (Encoders.param (Encoders.nullable Encoders.text)))
    decoder =
      Decoders.singleRow $
        Model.InsertArtistResultRow <$> (Decoders.column (Decoders.nonNullable Decoders.int4))

Verifying that migration changes do not break existing clients

Let's imagine that we named the table artists by mistake, and now we want to rename it to artist, but the database is already in production, and we need a seamless release. This is the same situation we described above.

Let's imagine what will happen if we begin to solve this problem head-on, simply by renaming the table. Let's add the following migration in the file ./migrations/2.sql:

alter table artists rename to artist;

Let's launch the compiler:

> dbfirst
relation "artists" does not exist

Context:
  queries/insert-artist

In other words, the compiler tells us that we have violated the contract expected by the queries and, as a result, by the previously generated SDK. So, in addition to helping during development, this gives us the opportunity, during Continuous Deployment, to make sure we are not breaking clients before rolling a migration out to production.

Let's change the migration ./migrations/2.sql to:

create view artist as select * from artists;

And let's run the compiler:

> dbfirst
Ok

Let's imagine that at this point our CD has rolled out the migration.

Next, let's update the clients. To do this, let's change the queries:

./queries/insert-artist.sql:

insert into artist (name, genre)
values ($name, $genre)
returning id

./queries/select-artist-by-id.sql:

select name, genre
from artist
where id = $id

And let's run the compiler:

> dbfirst
Ok

At this point, CD has published our SDKs, and we can go and update the client applications. Once that procedure is complete, we clean up the database:

./migrations/3.sql:

begin;
drop view artist;
alter table artists rename to artist;
commit;

Let's run the compiler:

> dbfirst
Ok

And then another automatic rollout of the schema takes place.

Type checking

As you may have noticed, in the generated code the data types of the parameters and results were inferred automatically: we only provided migrations and untyped queries as input. On the one hand, this is convenient. On the other hand, it means the compiler cannot help us detect type changes in the schema. Let's look at this problem in detail and find a solution.

Let's add another deliberately breaking migration that changes the type of the genre column from text to int8:

./migrations/4.sql:

begin;
alter table artist
drop column genre;

alter table artist
add column genre int8;
commit;

We run the compiler and see:

> dbfirst
Ok

But for us this is not OK at all. If we roll such a migration out to production, all running clients will start getting errors when trying to execute the insert-artist query, since they will be passing a text value for the genre column where the database now expects int8. Similar problems await us with the result of the select-artist-by-id query.
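To see why the compiler still reported Ok, here is an illustration in plain psql of what type inference does in this situation (assuming migrations 1 through 4 have been applied): the same untyped query prepares successfully before and after the change, and only the inferred parameter type silently shifts from text to bigint.

prepare insert_artist_inferred as
  insert into artist (name, genre) values ($1, $2) returning id;

select parameter_types from pg_prepared_statements
where name = 'insert_artist_inferred';
-- before migration 4: {text,text}
-- after migration 4:  {text,bigint}

deallocate insert_artist_inferred;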

How can we force the compiler to detect such errors?

Explicit signatures

One method that already works today is to pin the data types of parameters and results explicitly in the queries.

For example, if our query ./queries/insert-artist.sql looked like this:

insert into artist (name, genre)
values ($name :: text, $genre :: text)
returning id :: int8

then the compiler would give us the following error:

column "genre" is of type integer but expression is of type text

Context:
  queries/insert-artist

However, due to implicit conversions in Postgres, the strictness of such checks is not absolute. For example, if we wrote id :: int2 instead of id :: int8, the compiler would not see any problem with it. To be fair, the application would not get runtime errors from the database in that case either, so the primary task is still solved; nevertheless, this is worth keeping in mind, since such conversions can lead to truncated values.

One more nuance: the compiler cannot force you to specify signatures, so whether we specify them or not remains a matter of discipline.

Contract generation

We are considering developing functionality for generating contract files, which would store information about query signatures, including nullability, and possibly additional meta-information about the project. The idea is that this would allow strict checks of the schema against the expected contract, which could be used in CI/CD in particular.

If you have an opinion or ideas on this matter, please speak up in the comments.

Performance check

This functionality is currently planned. The idea is to run all queries through EXPLAIN and identify sequential scans. Looking further ahead, we are also exploring the possibility of filling tables with generated data for more complex performance analysis.
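As a rough illustration of what such a check could amount to against a scratch database built from the migrations (this is a sketch of the idea, not a description of the actual implementation):

explain select name, genre from artist where id = 1;
-- a plan line like "Seq Scan on artist" would be flagged; note that on tiny
-- or empty tables the planner may prefer a sequential scan even when a
-- suitable index exists, which is one reason generated data would help here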

If you have ideas on this matter, I also encourage you to discuss it in the comments.

Checking compatibility at startup, and readiness probes

This functionality is also planned. The idea is to build into the SDK a procedure that will check the database for compatibility, namely, for compliance with the expectations of all the queries in the SDK.

Such a check can be carried out at the application's initialization stage and prevent traffic from being routed to it if the check fails, for example by plugging it into the readiness probe of Kubernetes or its analogue, if you use one. If you don't, such a check will at least let you learn about impending problems at the earliest stage.

This check can also be performed when the application is already receiving traffic to protect against unexpected schema changes while the application is running.
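A minimal sketch of what such a compatibility probe could boil down to, assuming it simply re-prepares every query shipped in the SDK against the live database:

-- if the PREPARE fails, the schema no longer matches what the SDK expects,
-- and the readiness probe should report failure
prepare probe_select_artist_by_id (int4) as
  select name, genre from artist where id = $1;
deallocate probe_select_artist_by_id;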

Please also rate the importance of this improvement in the comments.

An evolved change delivery process

Finally, let's take a look at what our CI/CD pipeline could evolve into using the dbFirst toolkit. This is just an example; its purpose is to highlight the breadth of possibilities and the flexibility available for automating processes and increasing the reliability of your system.

About dbfirst

You can start using the described tools right now. To do this, go to dbfirst.ru.

We also provide consulting services for the development of a database integration layer and the implementation of the DB-First approach in the company. If your company has several services interacting with one database, we can dramatically speed up your development.

To contact me directly, write to Telegram: @wormholio.
