Decomposition of test cases in theory and practice

Foreword

We continue our series of articles about testing. Earlier, we looked at how widespread unit testing is among developers, and at whether we, as developers, should test our own code (spoiler: most likely, yes). Today we will touch on a more applied part of a sound testing process, namely the creation of test cases. At first, selecting cases may seem like a trivial task to a developer, but, as we will see shortly, the process can be guided by a set of rules.

What is it and why is it needed

First, let’s imagine some component that we would like to test. A textbook example in today’s context is checking user passwords during registration. Anyone who has tried to write their own authorization framework can imagine the complexity of the problem. But I propose to simplify everything to a single password “verifier”. Let’s take a standard configuration: the requirements specify a minimum and maximum length, allowed and prohibited characters, and a strict requirement to have at least one character from each set (capital letters, lowercase letters, digits, special characters, Chinese characters, etc.). Physically, this “checker” can be a class or function in our favorite programming language, or even a ready-made form. This is not so important, so let’s treat it as a function whose internals are unknown to us. The function has only inputs (the password string) and outputs (OK, or not OK and why), and we know little about its internal structure. As you might guess, we are talking about the so-called “black box” approach.

Now, with this example in mind, we can try to break it down into cases. At first it seems simple: we need to show that the system under test produces the expected result. That means a test case on “positive” data. In other words, given a “correct” password, we expect an abstract “OK” as the answer. Should we write more? Let’s say that we also want to test that the “checker” answers “not OK” to a wrong password. But a password can be wrong for several reasons. Do we need to check each of them? And if so, in how much detail?

In the testing field there is no consensus on a single, comprehensive theory of how many test cases to write and when. In real projects, engineers very often write as many as they see fit, and that usually works out well. But questions still arise in more complex cases, for example when the code under test implements a nontrivial algorithm, or simply when there are more input parameters and states. In such situations, having a method for finding test cases helps us understand how many tests are worth having. That way we can arrive at some criterion of correctness, non-redundancy, or even completeness for our set of test cases.

But that’s not all. As engineers, we are often burdened by the desire to do everything correctly and completely. During development, at the points where several developers interact (say, in a code review), this noble desire can give rise to long discussions about whether what was done is correct. Having rules that formalize the construction of test cases gives these discussions a shared foundation. Something like: “in this case, exactly as much as needed was done, and here’s why.”

Methodology

Equivalence classes

Let’s go back to the password example.

Initially, building test cases for this example can seem difficult precisely because of the large number of values a password can take, both correct and incorrect. However, the password “verifier” itself has a finite and very limited set of behaviors. Namely, the system can answer “OK” if the password is correct, and for each potential reason why the password might be incorrect, it can answer “not OK” along with an error message.

If we look at all possible inputs and outputs through the prism of this finite set of behaviors, we see that each password corresponds to exactly one behavior of the “verifier”. Thus, the set of all possible input and output values can be divided into subsets according to which behavior of the “checker” they trigger. An example of such a subset is the passwords that contain no digit: for them, we expect the result “not OK” with a message that the password is missing at least one digit. Since all values within a subset cause the same behavior in the program under test, to check that behavior it is enough to run the program on just one value from the subset, because for the other values we expect exactly the same behavior. Here we make an important assumption: we are reasonably confident that the same set of instructions will be executed for every input within a subset.

Thus, for each behavior of the “checker” there is a set of all inputs that trigger it, and we need only one test case to check this behavior. Each of these subsets is usually called an equivalence class, and the method of selecting test cases is accordingly called equivalence class partitioning. The term comes from mathematics, where, as you might guess, it means something roughly similar to what is described above. But what interests us now is not the mathematics but our “verifier”. Applying the theory to it, we get the following test cases:

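| No. | Correct length | Has capital letters | Has lowercase letters | Has numbers | Has a special symbol | Has a Chinese character | Has no illegal characters | Expected output |
|-----|----------------|---------------------|-----------------------|-------------|----------------------|-------------------------|---------------------------|-----------------|
| 1   | Yes            | Yes                 | Yes                   | Yes         | Yes                  | Yes                     | Yes                       | OK              |
| 2   | No             | Yes                 | Yes                   | Yes         | Yes                  | Yes                     | Yes                       | Not OK, cause   |
| 3   | Yes            | No                  | Yes                   | Yes         | Yes                  | Yes                     | Yes                       | Not OK, cause   |
| 4   | Yes            | Yes                 | No                    | Yes         | Yes                  | Yes                     | Yes                       | Not OK, cause   |
| 5   | Yes            | Yes                 | Yes                   | No          | Yes                  | Yes                     | Yes                       | Not OK, cause   |
| 6   | Yes            | Yes                 | Yes                   | Yes         | No                   | Yes                     | Yes                       | Not OK, cause   |
| 7   | Yes            | Yes                 | Yes                   | Yes         | Yes                  | No                      | Yes                       | Not OK, cause   |
| 8   | Yes            | Yes                 | Yes                   | Yes         | Yes                  | Yes                     | No                        | Not OK, cause   |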

As you can see, everything is quite simple.

Combinatorics of classes

Here a question may immediately arise: why can’t two error classes be combined? Let’s say a password has two errors at once, for example, the length is wrong and there are no digits. The theory is somewhat silent on this, but it can be explained as follows. Suppose the “checker” reports only one reason for the error. Perhaps the code checks the conditions one by one and stops after the first check that fails. Then such an input still belongs to exactly one error class. Things are a little more complicated when the “checker” can combine errors and report all of them at once. Fortunately, in this case we only need to check one part of the system that we have not checked before: the combining of errors. This can be done with exactly one test case in which more than one condition is violated. It can be a test case for two violated conditions or for all of them at once; that’s not the point. It will still be, as you might guess, one class.

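| No. | Correct length | Has capital letters | Has lowercase letters | Has numbers | Has a special symbol | Has a Chinese character | Has no illegal characters | Expected output           |
|-----|----------------|---------------------|-----------------------|-------------|----------------------|-------------------------|---------------------------|---------------------------|
| 9   | No             | No                  | No                    | No          | No                   | No                      | No                        | Not OK, everything is bad |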

Several correct classes

In the example above, all the correct values can be placed in one class, and that looks logical. But are there cases when it makes sense to single out several correct classes?

Let’s start from why we write test cases in the first place. One of the goals is to guard against regressions, that is, misbehavior after something changes. This means that if we have a finite list of correct values, it makes sense to test them all, so that we can be sure this list does not suddenly change unnoticed.

The simplest example of this is a test where the parameter is an enum. For example, we have enum class PASSWORD_COMPLEXITY { INSECURE, AVERAGE, COMPLEX, VERY_COMPLEX }. Let’s say that we have a predicate whose only task is to accept all passwords with complexity COMPLEX or higher. We get the following test cases:

| No. | Variable value | Predicate response |
|-----|----------------|--------------------|
| 1   | INSECURE       | false              |
| 2   | AVERAGE        | false              |
| 3   | COMPLEX        | true               |
| 4   | VERY_COMPLEX   | true               |

This is a very primitive example, but in real code it comes up quite often. Fortunately, writing tests for such a function takes only a few minutes, because most modern testing libraries support parameterized test cases, and the test cases themselves are easy to find. We will not dwell on this for long.
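As a rough illustration, the enum example might look like this in Python. The names `PasswordComplexity` and `is_acceptable` are hypothetical, and the plain loop stands in for whatever parameterization mechanism your test library offers (in pytest, for instance, `pytest.mark.parametrize`).

```python
from enum import IntEnum


class PasswordComplexity(IntEnum):
    # Python counterpart of the PASSWORD_COMPLEXITY enum from the text.
    INSECURE = 0
    AVERAGE = 1
    COMPLEX = 2
    VERY_COMPLEX = 3


def is_acceptable(complexity):
    """Hypothetical predicate: accept COMPLEX or higher."""
    return complexity >= PasswordComplexity.COMPLEX


# One row per enum member, mirroring the four test cases listed above.
EXPECTED = {
    PasswordComplexity.INSECURE: False,
    PasswordComplexity.AVERAGE: False,
    PasswordComplexity.COMPLEX: True,
    PasswordComplexity.VERY_COMPLEX: True,
}


def test_every_member_is_covered():
    # Fails if a member is added or removed without updating EXPECTED.
    assert set(EXPECTED) == set(PasswordComplexity)


def test_predicate():
    for complexity, expected in EXPECTED.items():
        assert is_acceptable(complexity) is expected, complexity.name


test_every_member_is_covered()
test_predicate()
```

The coverage check is what guards against the list of values silently changing: adding a fifth level to the enum makes the test fail until someone decides what the predicate should return for it.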

Interval method

Oddly enough, this method is usually considered on its own. I would say that it is better viewed as a particular form of equivalence classes for numerical parameters.

Let’s consider the length requirement from the password example separately. Let’s say that the password must be 10 characters long or more, but strictly less than 20. We can think of the password length as a discrete numerical parameter that can be plotted on an axis. This axis then has the following intervals:

| Length interval  | Result |
|------------------|--------|
| (0, 10)          | not OK |
| [10, 20)         | OK     |
| [20, MAX_LENGTH) | not OK |

We see that the single “wrong length” equivalence class has turned into two. But, as you might guess, that’s not all. The interval method encourages us to pay special attention to the boundary values, in order to make sure that the boundaries are actually drawn correctly. Roughly, it works like this:

  • If the boundary point X is inclusive, then we need test cases for the values X, X-1 and X+1. For example, since 10 is a “filled” point, we need test cases for the values 9, 10 and 11.

  • If the boundary point X is exclusive (“punched out”), then we need test cases for X and for either X-1 or X+1, whichever falls into the other interval. For point 20, which is excluded from the valid interval, these are 20 and 19. Canonically, some sources also call for testing 21 and 18. The motivation is that 19 and 20 together form the boundary being tested, while we also need to test a point inside each interval. It may be redundant, but I would not object to it.

I also want to draw special attention to the case when the values are not discrete, for example, when the data type is float or double. In this case, instead of points like X-1 or X+1, we take values that are close enough to X on either side.
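One possible way to pick such neighbouring values for floating-point parameters is to take the closest representable numbers on each side of the boundary. This is only one option, not something the method prescribes. In the sketch below, `in_range` is a hypothetical predicate for the interval [10.0, 20.0), and `math.nextafter` requires Python 3.9 or newer.

```python
import math


def in_range(x, low=10.0, high=20.0):
    """Hypothetical predicate for the half-open interval [low, high)."""
    return low <= x < high


def neighbours(x):
    """Closest representable floats just below and just above x."""
    return math.nextafter(x, -math.inf), math.nextafter(x, math.inf)


below_low, above_low = neighbours(10.0)
below_high, above_high = neighbours(20.0)

# Inclusive lower bound: test just below, exactly on, and just above it.
assert not in_range(below_low)
assert in_range(10.0)
assert in_range(above_low)

# Exclusive upper bound: test just below it and exactly on it.
assert in_range(below_high)
assert not in_range(20.0)
```

When the precision requirements are looser, a fixed small step such as 0.01 on either side serves the same purpose.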

Let’s go back to our original testing matrix for the password “verifier” and remove all parameters from it except the length; we don’t need them now. Previously, we had only two length tests. Now we have the following:

| No. | Correct length | Expected output |
|-----|----------------|-----------------|
| 1   | No (9)         | Not OK, cause   |
| 2   | Yes (10)       | OK              |
| 3   | Yes (11)       | OK              |
| 4   | Yes (19)       | OK              |
| 5   | No (20)        | Not OK, cause   |
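The five length cases can be derived mechanically from the interval. The following sketch assumes integer lengths and a half-open valid interval; `boundary_values` and `length_ok` are hypothetical helpers written for this illustration only.

```python
def boundary_values(low, high):
    """Boundary test points for the half-open integer interval [low, high).

    The inclusive bound `low` contributes low-1, low and low+1;
    the exclusive bound `high` contributes high-1 and high.
    Assumes high - low > 2, so that the points do not overlap.
    """
    return [low - 1, low, low + 1, high - 1, high]


def length_ok(password, low=10, high=20):
    """Hypothetical length check: 10 characters or more, strictly less than 20."""
    return low <= len(password) < high


# Expected results are written out by hand, not computed by the code under test.
EXPECTED = {9: False, 10: True, 11: True, 19: True, 20: False}

# The hand-written table covers exactly the generated boundary points.
assert sorted(EXPECTED) == boundary_values(10, 20)

for length, expected in EXPECTED.items():
    assert length_ok("x" * length) is expected, length
```

Keeping the expected results in a separate hand-written table matters here: if they were produced by `length_ok` itself, an off-by-one error in the boundary would go unnoticed.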

Conclusions

We have seen how the decomposition of test cases can be justified theoretically. Personally, this methodology has been enough for me to test most of the code I have written. The one caveat, as you may have noticed, is that it applies poorly to more complex cases, for example, when there are several independent parameters. I would like to cover those in a separate article.
