Monolith Test Data Management

Testing a monolithic application can be challenging, especially when the service is under active development. Each new feature takes more and more resources to test, and there is little time left for optimization. What can you do?

Today I want to share the experience of our IT specialists Evgenia, Dmitry, and Alexander, who reduced the testing time on one of our projects from an hour to several minutes.

Navigation
Project Features
What's the problem?
Solution
Results
What else can you try?


Project Features

Our team is developing a B2B service that hotels use to manage rooms and prices. As the project grows, it becomes increasingly difficult to keep the test suite up to date, and there is less and less time to optimize it. This is what we ran into.

On this project, more time goes to unit and module testing than to integration testing: there are 2–4 module test suites for every integration test suite.

If a module (hotel, user, etc.) has no unit tests, it has at least one integration test suite. So integration tests are few, but they are a priority.

Here are some parameters that needed to be taken into account for our application:

What's the problem?

Early in the project, each test relied on the setUp() method, which made tests quick to write. However, setUp() runs again before every test method, so as the number of tests increased, the time spent on testing increased proportionally.

Eventually, we reached a critical point where a full automated test run of the project took more than an hour, which hurt the productivity of the development team.

Test duration hurt productivity most at two points in development: when finishing work on a new feature, when final testing was carried out and new tests for the feature were written; and when deploying to test environments (stands), especially on release days, when the tests were run on each environment.

That's why we decided to optimize all the tests of the project.

Solution

We used a custom test class, which helps prepare the test environment efficiently. We chose the django-test-plus library, which provides a number of useful mixins for testing various aspects of a project.

Using the custom test class turned out to be a double-edged sword. Its apparent ease of use let us ignore, for a long time, the need for optimization and for changes in our approach to testing.

We encountered problems with running tests in random order and with parallelizing them. We could only control parallelization locally, and running in random order took more time, which we had not taken into account at the very beginning.
Combining tools such as unittest, nose, and coverage with our own factories also created challenges.

Practical improvement of test efficiency

Optimizing automated tests means making them both faster and more accurate.

Use setUpTestData() instead of setUp()

One way to improve efficiency is to use the setUpTestData() class method instead of setUp(). It lets you create data once per test class and share it across all test methods in that class, instead of rebuilding it before every test.

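Note the difference between the setUpTestData() class method and setUpClass(). Both run once per test class, before any of its test methods.

setUpTestData() is the Django TestCase hook for creating database objects. TestCase calls it from setUpClass(), inside a transaction that wraps the whole class, so the data is rolled back when the class finishes.

setUpClass() is the generic unittest hook for setup that only needs to happen once for the entire test class. If you override it in a Django test, call super().setUpClass().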
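To make the per-test versus per-class cost concrete, here is a minimal sketch that runs on the standard library alone. Plain unittest has no setUpTestData(), so ClassDataTestCase below is a hypothetical stand-in that calls the hook from setUpClass(); it leaves out the transaction handling that Django adds. The build_hotel helper and the call counter are also made up for illustration.

```python
import unittest

# Counts how many times the (pretend) expensive fixture is built.
build_calls = {"per_test": 0, "per_class": 0}


def build_hotel(counter_key):
    """Hypothetical stand-in for costly setup such as ORM inserts."""
    build_calls[counter_key] += 1
    return {"name": "Test Hotel", "rooms": 10}


class PerTestSetup(unittest.TestCase):
    def setUp(self):
        # Runs again before every test method.
        self.hotel = build_hotel("per_test")

    def test_name(self):
        self.assertEqual(self.hotel["name"], "Test Hotel")

    def test_rooms(self):
        self.assertEqual(self.hotel["rooms"], 10)

    def test_is_dict(self):
        self.assertIsInstance(self.hotel, dict)


class ClassDataTestCase(unittest.TestCase):
    """Stand-in base class (assumption): calls setUpTestData() once per class."""

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls.setUpTestData()

    @classmethod
    def setUpTestData(cls):
        """Override in subclasses to create shared, read-only data."""


class PerClassSetup(ClassDataTestCase):
    @classmethod
    def setUpTestData(cls):
        # Runs once for the whole class.
        cls.hotel = build_hotel("per_class")

    def test_name(self):
        self.assertEqual(self.hotel["name"], "Test Hotel")

    def test_rooms(self):
        self.assertEqual(self.hotel["rooms"], 10)

    def test_is_dict(self):
        self.assertIsInstance(self.hotel, dict)


def count_builds():
    """Run both classes and return how often each fixture was built."""
    for key in build_calls:
        build_calls[key] = 0
    loader = unittest.defaultTestLoader
    for case in (PerTestSetup, PerClassSetup):
        result = unittest.TestResult()
        loader.loadTestsFromTestCase(case).run(result)
        assert result.wasSuccessful()
    return dict(build_calls)


if __name__ == "__main__":
    print(count_builds())  # {'per_test': 3, 'per_class': 1}
```

With three test methods per class, the per-test fixture is built three times and the per-class fixture once. The gap widens with every test method added to a class, which is why shared data must be treated as read-only by the tests.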

Resolve conflicts between TransactionTestCase and setUpClass()

In our case, the conflict arose because TransactionTestCase with the reset_sequences = True flag closes the test database after each test throughout the coverage run, which may leave setUpClass() without a connection to the database.
To avoid these conflicts, we used the setUpTestData() method of Django's TestCase class, which does not cause such problems.

In addition, all helper functions of the custom test class should be rewritten as class methods, except for the user authentication methods, which remain in setUp(). This allows the helpers to be called from the setUpTestData() class method.

Profiling showed that the tests written with TransactionTestCase took up more than half of the total run time, so we decided to rewrite them using a custom test class that inherits from TestCase. This required changing the logic of the commands under test. It cost additional effort, but it allowed us to complete the optimization of the tests.
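Part of the reason TestCase is cheaper is that Django wraps each test in a transaction and rolls it back, while TransactionTestCase resets the database by truncating tables. The sketch below only illustrates the rollback idea with the standard sqlite3 module; it is not Django's implementation, and the hotel table and helper names are hypothetical.

```python
import sqlite3


def make_database():
    """Create an in-memory database with one committed 'fixture' row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hotel (name TEXT)")
    conn.execute("INSERT INTO hotel (name) VALUES ('shared fixture')")
    conn.commit()
    return conn


def count_hotels(conn):
    return conn.execute("SELECT COUNT(*) FROM hotel").fetchone()[0]


def run_with_rollback(conn, test_body):
    """Run test_body, then undo its writes instead of rebuilding the tables."""
    try:
        test_body(conn)
    finally:
        conn.rollback()


def sample_test(conn):
    # The INSERT implicitly opens a transaction that is never committed.
    conn.execute("INSERT INTO hotel (name) VALUES ('created by test')")
    assert count_hotels(conn) == 2


if __name__ == "__main__":
    connection = make_database()
    run_with_rollback(connection, sample_test)
    # The row written by the test is gone; the committed fixture remains.
    print(count_hotels(connection))
```

Rolling back leaves the committed fixture in place, so nothing has to be recreated before the next test. Code that itself depends on commits behaves differently under this scheme, which may be why the commands under test had to change.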

Add correct timing

We include timers in our automated tests to track how long each test takes to complete, so that we can find the problematic ones.

Instead of measuring time “inside” the Python testing process, switch to counting “outside” using the shell timing command.

On Unix systems this is the time command; on Windows (PowerShell) it is Measure-Command.

We are interested in the real value, that is, the wall-clock time. The other two, user and sys, show the CPU time spent in the program's own code and in the operating system kernel on its behalf. Real time will exceed the time reported inside the test runner, because it also includes startup costs such as Python and Django initialization. The difference can be significant if you have libraries that are slow to import.
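If you want the same "outside" measurement in a script rather than from the shell, a child process can be timed from the parent. This is a minimal sketch: time_command is a hypothetical helper, and the command in the usage block is a placeholder (on a Django project it might be the manage.py test invocation, which is an assumption about your layout).

```python
import subprocess
import sys
import time


def time_command(command):
    """Return (wall_seconds, exit_code) for a command run as a child process."""
    started = time.perf_counter()
    completed = subprocess.run(command, check=False)
    elapsed = time.perf_counter() - started
    return elapsed, completed.returncode


if __name__ == "__main__":
    # Placeholder command: start a fresh interpreter and import unittest.
    wall, code = time_command([sys.executable, "-c", "import unittest"])
    print(f"real: {wall:.3f}s, exit code: {code}")
```

Because the clock starts before the interpreter is launched, the result includes startup and import time, which a timer inside the test runner never sees.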

Exclude folders from coverage so that the report reflects real test coverage. Excluded folders and files are not taken into account when the code coverage percentage is calculated. This can be useful if there are parts of the code that do not need to be tested or that are difficult to test automatically. For example, legacy code with no tests can distort the overall coverage picture.

Specify a default .env state for coverage. This eliminates false negatives.

Let's say you have an environment variable DB_HOST, which is used in your code and should be checked by tests. By default, however, this variable is not set, so coverage will not count the parts of the code that depend on DB_HOST, since they were never executed. Setting a default value for DB_HOST in the .env file allows coverage to count those parts of the code as covered, even if the specific tests do not set a different value for DB_HOST.
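The effect of a default value can be sketched without any third-party loader. Here apply_env_defaults is a hypothetical, simplified parser for .env-style text, and database_host stands in for application code that branches on DB_HOST; neither comes from the project described here.

```python
import os


def apply_env_defaults(env_text, environ=os.environ):
    """Set KEY=VALUE pairs from .env-style text, keeping existing values."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        environ.setdefault(key.strip(), value.strip())
    return environ


def database_host(environ=os.environ):
    """Hypothetical application code whose main branch needs DB_HOST."""
    host = environ.get("DB_HOST")
    if host is None:
        return None  # the only branch exercised when the variable is unset
    return f"postgres://{host}"


if __name__ == "__main__":
    env = {}  # a plain dict stands in for os.environ in this demo
    print(database_host(env))
    apply_env_defaults("# defaults for test runs\nDB_HOST=localhost", env)
    print(database_host(env))
```

Using setdefault means a value already exported in the environment wins over the .env default, so individual runs can still point at a different host.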

Add the --shuffle flag directly to the coverage command along with the other flags, and be sure to debug any tests that fail. When tests always run in the same order, it is easy to miss logic errors, especially in a large project.
And remember that tests serve not only as automated checks but also as additional documentation of functionality. This means that an incorrectly written test can mislead a new employee in the future.

In our case, running the tests in random order allowed us to find and fix some errors in the test logic. To do this, we ran coverage via a script directly in PyCharm.
With coverage and --shuffle, tests can fail because they are launched in an unusual order. A regular debugger run cannot reproduce this behavior, because it launches the tests sequentially. The screenshot shows an example of setting up coverage with --shuffle to launch via the debugger, which allowed us to diagnose why the tests were failing and to fix all of them.
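The kind of bug that shuffling exposes can be reproduced with the standard library. In this sketch, the two tests share class-level state, so one of them passes only when the other has already run. The test class and both helpers are hypothetical, and the seeded shuffle only imitates the idea of a reproducible random order.

```python
import random
import unittest


class OrderDependentTests(unittest.TestCase):
    shared = []  # class-level state that leaks between tests (the bug)

    def test_a_adds_item(self):
        self.shared.append("item")
        self.assertEqual(len(self.shared), 1)

    def test_b_expects_item(self):
        # Passes only if test_a_adds_item has already run.
        self.assertIn("item", self.shared)


def shuffled(names, seed):
    """Return names in a pseudo-random order that the seed reproduces."""
    order = list(names)
    random.Random(seed).shuffle(order)
    return order


def run_in_order(names):
    """Run the named tests in exactly this order and return the result."""
    OrderDependentTests.shared = []  # start every run from a clean state
    suite = unittest.TestSuite(OrderDependentTests(name) for name in names)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    names = ["test_a_adds_item", "test_b_expects_item"]
    for seed in range(4):
        order = shuffled(names, seed)
        outcome = "ok" if run_in_order(order).wasSuccessful() else "FAILED"
        print(seed, order, outcome)
```

Recording the seed matters: once a failing order is found, rerunning with the same seed reproduces it, which is what makes the failure debuggable.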

Results

The time spent on automated tests was reduced tenfold.

Before

After

This shortened the CI/CD pipeline and made it faster to deploy features and bug fixes.

What else can you try?

Test profiling to identify tests that take the longest to run and then optimize them.

For example, suppose we notice that a test is running slowly, but we don't yet know what exactly is slowing it down. To track this down, we can use py-spy or the built-in cProfile module.
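As a starting point, cProfile can be wrapped around a single suspicious call. This is a minimal sketch: slow_fixture is a hypothetical stand-in for whatever the slow test does, and profile_call is an illustrative helper rather than part of any library.

```python
import cProfile
import io
import pstats


def slow_fixture():
    """Hypothetical stand-in for the slow part of a test."""
    return sum(i * i for i in range(200_000))


def profile_call(func, limit=5):
    """Profile one call and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_call(slow_fixture))
```

Sorting by cumulative time puts the functions that account for most of the run, including their callees, at the top of the report, which is usually the quickest way to see where a slow test spends its time.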

Adding TestRunner will not speed up the tests themselves, but will reduce the time spent testing.

Parallelizing tests with the testing framework or pytest-django. We do not use this method on the project, since the tests run on a virtual machine, so we do not cover it in this article. More details can be found in the book Speed Up Your Django Tests by Adam Johnson.

This solution was experimental for us, since it was difficult to find information for our case in open sources. We hope our example helps you optimize your tests or gives you new ideas.

If you have any questions, let's discuss them in the comments.
