How We Work with QA Metrics: Runiti's Experience

Basic concepts

Defect metrics

Autotest Metrics

Traffic light

Automation of metrics collection

Metrics Analysis

Conclusions

Basic concepts

Testing metrics are specific indicators that let us evaluate software quality, team productivity, and progress toward any other goals we set. We tried different metrics and settled on those that suit us and our product.

We divided our metrics into three blocks: defect metrics, autotest metrics, and the traffic light.

Defect metrics

We have 7 defect metrics:

  1. Bugs in old code;

  2. Bugs before release;

  3. Bugs after release;

  4. Percentage of filtered defects;

  5. Found by auto tests;

  6. Critical bugs;

  7. Incidents (type, team, service).

Bugs in old code. This metric is needed to measure overall quality and to understand how many bugs we find in old code. It is collected automatically from Jira: tasks of the type “Bug” with the label “prod” are taken into account.
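As an illustration of this kind of label-based collection, here is a minimal Python sketch against the Jira REST API; the URL, credentials, and date window are placeholders rather than our real configuration, and the other label-based metrics below are gathered the same way.

```python
import requests

JIRA_URL = "https://jira.example.com"  # placeholder, not our real instance
AUTH = ("metrics-bot", "api-token")    # hypothetical service account

def count_issues(jql: str) -> int:
    """Return the number of issues matching a JQL query (maxResults=0 returns only the total)."""
    resp = requests.get(
        f"{JIRA_URL}/rest/api/2/search",
        params={"jql": jql, "maxResults": 0},
        auth=AUTH,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["total"]

# Bugs in old code: tasks of type "Bug" with the "prod" label over the last quarter
bugs_in_old_code = count_issues('type = Bug AND labels = "prod" AND created >= -90d')
print(bugs_in_old_code)
```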

Bugs before release. These are found by testers. Thanks to this metric, we know how many defects were found in a new feature before we released it. It is collected automatically from Jira: we count the checkboxes in the “only_for_qa” checklist or the number specified in the “Internal Bugs” field in tickets of the “Bug” type.

Bugs after release are bugs found by testers, managers, other employees, or users after a feature has been released. They are collected automatically from Jira: tasks of the type “Bug” with the label “bug_in_new_code” are taken into account.

The second and third metrics are needed to form the fourth one: the percentage of filtered defects. This is an indicator of how effectively we detect bugs: we look at what share of defects we managed to filter out before release and what share still made it to production. There is a certain threshold here that we should not exceed; we will come back to it later. It is calculated by the formula: the number of bugs after the release divided by the number of all bugs (before and after the release).
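In code the calculation is a one-liner; a minimal sketch, with made-up numbers purely for illustration:

```python
def escaped_defect_share(bugs_before_release: int, bugs_after_release: int) -> float:
    """Share of defects that reached production: bugs after release / all bugs."""
    total = bugs_before_release + bugs_after_release
    return bugs_after_release / total if total else 0.0

# Made-up example: 27 bugs caught before release, 2 found after release
print(escaped_defect_share(27, 2))  # ~0.069, below the 0.1 threshold mentioned later
```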

Bugs found by automated tests. This metric is mostly needed to evaluate the effectiveness of our autotests. It is collected automatically from Jira. Bugs found in production with the “autotesting” label are taken into account.

Critical bugs. This metric helps teams understand what functionality was affected. Of course, there should be as few critical bugs as possible; if many of them slip through, it means there are problems in testing. It is collected automatically from Jira: bugs found in production with the “crit” label are taken into account.

Incidents are situations that affected more than 10% of users. We look at the type of incident (an infrastructure problem or a development error), the team in which it occurred, and that team's service, and then process all these indicators. The metric is collected automatically from Jira, from the crash board; the type, team, and service are taken from the ticket fields.
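Once the incident tickets are exported, grouping them by these dimensions is straightforward; a minimal sketch in which the field names and example values are illustrative, not our real data:

```python
from collections import Counter

# Incident tickets as they might come from the Jira export; the keys
# ("type", "team", "service") and the values are illustrative placeholders.
incidents = [
    {"type": "infrastructure", "team": "billing", "service": "payments-api"},
    {"type": "development", "team": "billing", "service": "payments-api"},
    {"type": "development", "team": "domains", "service": "dns-manager"},
]

by_type = Counter(i["type"] for i in incidents)
by_team_service = Counter((i["team"], i["service"]) for i in incidents)

print(by_type)          # Counter({'development': 2, 'infrastructure': 1})
print(by_team_service)  # Counter({('billing', 'payments-api'): 2, ('domains', 'dns-manager'): 1})
```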

If we imagine our team's work and labeling of defects in Jira as a diagram, it would look like this:

Autotest Metrics

We use only four autotest metrics:

  1. Total requirements coverage by automated tests, in %;

  2. Current autotest coverage, in absolute numbers of test cases;

  3. Build flakiness percentage;

  4. Average build duration in minutes.

Total requirements coverage by automated tests, in %, is the first metric we started collecting. All our requirements are written up as test cases in Testrail, so we can see how many of them are covered by automated tests and how many are not yet. Everything is broken down by team, and we can clearly see each team's coverage percentage.
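A minimal sketch of how such a coverage percentage could be pulled from the Testrail API; the URL, credentials, project ID, and the “custom_automated” field marking automated cases are all assumptions for illustration:

```python
import requests

TESTRAIL_URL = "https://example.testrail.io"   # placeholder instance
AUTH = ("metrics-bot@example.com", "api-key")  # hypothetical credentials
PROJECT_ID = 1                                 # illustrative project ID

# Fetch the project's test cases; newer TestRail versions paginate and wrap
# the list in a "cases" key, which is handled only minimally here.
resp = requests.get(
    f"{TESTRAIL_URL}/index.php?/api/v2/get_cases/{PROJECT_ID}",
    auth=AUTH,
    headers={"Content-Type": "application/json"},
    timeout=30,
)
resp.raise_for_status()
data = resp.json()
cases = data["cases"] if isinstance(data, dict) else data

# "custom_automated" is a hypothetical custom field flagging automated cases
automated = sum(1 for case in cases if case.get("custom_automated"))
coverage_pct = 100.0 * automated / len(cases) if cases else 0.0
print(f"{automated}/{len(cases)} cases automated ({coverage_pct:.1f}%)")
```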

Current autotest coverage in absolute numbers. Our next metric is almost the same thing from a slightly different angle: we look at how many cases are covered in each team. In some teams this is more than three thousand cases, in others only 30, so percentages alone are not always informative.

Build flakiness percentage is collected automatically from the autotest logs. We look at each team and calculate an average value; in the example below it is averaged over a week, and every team has a different flakiness percentage. It is influenced by various factors: for example, infrastructure problems, or the rollout of new functionality when the tester did not have time to update the autotests and everything failed against production. If a bug affects a lot of functionality, the weekly flakiness percentage will be high.

We also take the average build duration from the autotest logs. We need this metric to understand which builds are slow: first, to estimate the time needed for testing and for running builds; second, to spot problematic builds where tests run for more than an hour. In that case we need to do something: add threads, optimize the tests, or review how the tests are written and whether we need all of them.
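Both metrics can be derived from standard test-run reports; a minimal sketch assuming JUnit-style XML output (our real logs may be shaped differently), where flakiness is taken as the share of failed test runs:

```python
import glob
import xml.etree.ElementTree as ET

# Walk over JUnit-style XML reports and compute the share of failed runs
# plus the average build duration across reports.
total_runs = failed_runs = 0
durations = []

for report in glob.glob("reports/*.xml"):  # illustrative path: one XML per build
    root = ET.parse(report).getroot()
    build_time = 0.0
    # A report may have a <testsuites> wrapper or a single <testsuite> root.
    for suite in root.iter("testsuite"):
        total_runs += int(suite.get("tests", 0))
        failed_runs += int(suite.get("failures", 0)) + int(suite.get("errors", 0))
        build_time += float(suite.get("time", 0.0))
    durations.append(build_time)

flakiness_pct = 100.0 * failed_runs / total_runs if total_runs else 0.0
avg_duration_min = sum(durations) / len(durations) / 60 if durations else 0.0
print(f"flakiness: {flakiness_pct:.1f}%, average build duration: {avg_duration_min:.1f} min")
```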

Traffic light

With the help of the third block of metrics — traffic lights — we track the mental state of testers. We believe that this is an important point in our work.

In this block we have two metrics:

  1. Number of developers and testers by teams;

  2. The general condition of the employee in the current month.

Number of developers and testers by teams is a metric that roughly shows the workload per employee. Without additional data it is not very objective: teams may have different kinds of tasks, for example purely technical ones with little testing, or product ones with a high QA load. The metric shows which employees may be overloaded and whom the manager should contact first to ask if everything is okay. We collect it from the employee list on the corporate portal.

General condition of the employee in the current month — the only metric we collect manually. Testers fill out the document themselves, indicating their workload level, and optionally leave a comment. We collect the metric at the end of each month and see how we can reduce the workload of our testers.

Automation of metrics collection

We have tried to automate metrics collection as much as possible. We currently take information from four sources: Jira, Testrail, the autotest logs, and the corporate portal.

We pull information from these sources with a custom script via their APIs. The results are then written to a database (MySQL in our case). From there they are pulled into Grafana, where they are turned into informative graphs and tables.
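A minimal sketch of the storage step, assuming the PyMySQL driver and an illustrative table layout, credentials, and metric names rather than our real schema; Grafana then charts the results through a MySQL data source:

```python
import datetime
import pymysql  # assumes the PyMySQL driver; any MySQL client would work

def store_metric(team: str, metric: str, value: float) -> None:
    """Write one metric value into a table that Grafana later queries."""
    conn = pymysql.connect(host="metrics-db", user="qa", password="secret", database="qa_metrics")
    try:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO metrics (team, metric, value, collected_at) VALUES (%s, %s, %s, %s)",
                (team, metric, value, datetime.datetime.now()),
            )
        conn.commit()
    finally:
        conn.close()

# Example: a value pulled earlier from Jira goes into the database; a Grafana
# MySQL data source then builds graphs and tables from SELECTs over this table.
store_metric("billing", "bugs_after_release", 3)
```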

Metrics Analysis

Now let's figure out what to do with the collected data. We have arrived at certain threshold values; when they are exceeded, we understand that something is going wrong and we need to take action. At the moment we have 5 such values:

  1. Number of bugs after release in old code: no more than 10 per quarter

  2. Number of bugs after release: no more than 5 per month

  3. Percentage of filtered defects (bugs after release divided by all bugs): no more than 0.1

  4. Autotest coverage of requirements in %: the current value should not fall by more than 10%

  5. Flakiness of automated tests: no more than 5%

We have also partially automated the metrics analysis. When threshold values are exceeded, the script sends a report to the corporate messenger. You can also view the dynamics in Grafana and compare the values: how it was before and what the situation is now. After analyzing all the data, we go to the team of testers, developers, or managers with clarifying questions or suggestions for improving the process.
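As an illustration of the alerting part, a minimal sketch that checks a few of the thresholds above and posts to a messenger webhook; the webhook URL, metric names, and current values are placeholders:

```python
import requests

WEBHOOK_URL = "https://messenger.example.com/hooks/qa-metrics"  # placeholder webhook

# Threshold values mirroring the list above; the metric keys are illustrative names.
THRESHOLDS = {
    "bugs_after_release_per_month": 5,
    "escaped_defect_share": 0.1,
    "autotest_flakiness_pct": 5.0,
}

def check_and_alert(current: dict) -> None:
    """Compare current values against the thresholds and post a report if any are exceeded."""
    violations = [
        f"{name}: {current[name]} > {limit}"
        for name, limit in THRESHOLDS.items()
        if current.get(name, 0) > limit
    ]
    if violations:
        report = "QA metric thresholds exceeded:\n" + "\n".join(violations)
        requests.post(WEBHOOK_URL, json={"text": report}, timeout=30)

# Example run with made-up current values
check_and_alert({"bugs_after_release_per_month": 7, "escaped_defect_share": 0.04, "autotest_flakiness_pct": 6.2})
```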

Conclusions: Why do we need QA metrics?

By implementing testing metrics, we understand our efficiency and productivity and see our strengths and weaknesses. Although the metrics are tailored primarily to our goals and needs, I hope other teams will find our approach useful. We have almost completely automated data collection and processing: this not only simplified monitoring, but also lets us respond promptly to emerging issues through the configured alerts. This step-by-step approach keeps our metrics relevant and useful, improves the team's work and efficiency, frees up time from routine tasks, and allows us to focus on important goals.
