We need to stop using the terms “white-box testing” and “black-box testing”

In this article, I plan to touch on a somewhat sensitive topic and I want to apologize in advance if I say something wrong. Just know that I speak from the bottom of my heart and draw on my current knowledge.

In the book Biased: Uncovering the Hidden Prejudice That Shapes What We See, Think, and Do, Jennifer Eberhardt says that grouping similar things is not some abhorrent feature of the human brain, nor a process that some people engage in and others do not. Rather, it is a universal function of the brain that allows us to organize and manage the stimuli that constantly bombard us in order to prevent overload.

In the realm of testing, we also have categories that help us learn and understand types of tests, such as functional and non-functional, for example. However, the division between white-box and black-box testing is, in my opinion, imprecise and not entirely adequate. It is this type of categorization that is used quite often, especially by novice testers. My opinion is that we should stop using these terms, and in this article I will present my arguments in defense of the idea and an alternative.

Note: In order not to confuse readers, I will continue to use the above terms throughout the article.

Maintaining stereotypes

You might think that white-box testing is no better or worse than black-box testing, and it’s okay to compare the terms. However, the idea here is that white-box testing allows you to see the code, while black-box testing takes place without access to the code. With such a categorization in hand, we seem to support the stereotypical idea that “white” is something clear, associated with purity, while “black” is something opaque, cloudy, associated with immorality.

Why is it assumed that during white box testing you can “see the code”? If you think about it, you would not see the contents of a white box. Consider this: ceilings are usually painted white. Can you see the sky or your neighbors through them? How did the idea that a “white box” is somehow transparent become commonplace?

Here we have three gift boxes in three different colors. I can’t see what’s inside the white box, can you?

Unit Test Category – Chameleon

If I were to ask you what category unit tests fall into, I bet most would say white box, because the “code is visible” while testing. However, within TDD you write tests BEFORE the code, so you do not see the code, and by definition that cannot be called white box testing.

What is the real difference between unit tests, which are at the bottom of the testing pyramid, and other kinds of tests?

Is it that unit tests and code are written by the same person? But the same can be said about others.

The biggest difference is that unit testing is focused on the structure of the code rather than on the logic of the features of the product itself. For example, in this case, you are not concerned with the number of products that are displayed in the cart, but that a particular piece of code correctly handles exceptions and gives the correct output.
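To make this concrete, here is a minimal sketch in Python. The helper `parse_quantity` and its error messages are hypothetical, invented only for illustration. The tests say nothing about what the customer sees in the cart; they only check that this one piece of code handles bad input and returns the right value.

```python
import unittest


def parse_quantity(raw: str) -> int:
    """Hypothetical helper: convert user input into a positive item count."""
    try:
        value = int(raw)
    except ValueError:
        # Non-numeric input is reported with a clearer message.
        raise ValueError(f"quantity must be a whole number, got {raw!r}") from None
    if value <= 0:
        raise ValueError(f"quantity must be positive, got {value}")
    return value


class ParseQuantityTest(unittest.TestCase):
    def test_returns_int_for_valid_input(self):
        self.assertEqual(parse_quantity("3"), 3)

    def test_rejects_non_numeric_input(self):
        with self.assertRaises(ValueError):
            parse_quantity("three")

    def test_rejects_zero_and_negative_values(self):
        for raw in ("0", "-2"):
            with self.subTest(raw=raw):
                with self.assertRaises(ValueError):
                    parse_quantity(raw)
```

Saved to a file, these tests can be run with `python -m unittest`. Note that each test maps to one path through the function, not to a product feature.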

Transparent vs opaque

I think that simply replacing the words here will not be enough, although it would be a step forward. No doubt it is the quick fix: it keeps our systems and classifications intact, which is more comfortable for us. But I feel the stereotype would still persist, because people would think “white vs. black” first and only then, maybe, reach for the other names.

Also, I still have questions about this categorization, so I will keep thinking out loud before suggesting a better alternative.

Gray box testing

It can be said that a category does not make sense when it needs many exceptions and an additional group to collect everything that does not fit the others. Enter gray box testing: a type of testing in which you can see the code partially or occasionally.

Testing the gray box on the left… (image taken from meme-arsenal.com)

As in the example above, gray is also an (opaque) color, and if the box is painted that color, you won’t be able to see its contents. Even partially. And what should we use instead to make it fit the stated term? Box with holes? Partially transparent box? (for example, with plastic “windows”). Not quite clear. Which tests should be included in this category?

Testers rarely work with a white box

There are a lot of really interesting concepts behind white box testing, but they are usually included in automated testing tools and rarely used by testers on a daily basis.

Cyclomatic complexity is rarely calculated by hand, although it sometimes comes up in code reviews. Mostly it is part of the code optimization process.
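For readers who have never seen the metric, here is a rough sketch of how a tool might approximate it, using Python's standard `ast` module. This is a simplified count (one plus the number of decision points), not the exact algorithm of any particular tool; the function name `cyclomatic_complexity` and the `SAMPLE` snippet are my own assumptions for illustration.

```python
import ast

# Node types that each add one decision point in this simplified count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds one extra path per additional operand.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

print(cyclomatic_complexity(SAMPLE))  # prints 6
```

The sample has three `if`/`elif` checks, one loop, and one `and`, which gives 1 + 5 = 6. Nobody wants to do this by hand for a real code base, which is exactly why it lives in tools.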

Statement testing, boundary value testing, branch testing, loop testing, and data flow testing are techniques that are rarely planned or carried out explicitly and are often automated. Many of these terms are still unknown to, or unused by, many in our industry.

Most of the rest of the testing pyramid is directly related to black or gray box testing.

Structure vs logic

My opinion is that a division into structure and logic would be a clearer and more understandable classification. While preparing this article, I went through my old university notes and found that these were given as alternative names for these types of tests. They are rarely used at present. But why not use them, since they are more suitable and carry fewer risks?

The definition will depend on whether you are testing the structure of the code (number of branches, correct exception handling, etc.) or the business logic of the implemented features (for example, the logic “there should be 0 items in the cart after purchase”).
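The contrast can be sketched with a toy example. Everything here (the `Cart` class, `PaymentError`, the `pay` callback) is hypothetical and exists only to show the two kinds of test side by side; it is not taken from any real shop.

```python
class PaymentError(Exception):
    """Raised by the hypothetical payment step."""


class Cart:
    def __init__(self):
        self.items = []

    def add(self, name: str, price: float) -> None:
        if price < 0:
            raise ValueError("price cannot be negative")
        self.items.append((name, price))

    def total(self) -> float:
        return sum(price for _, price in self.items)

    def checkout(self, pay) -> float:
        """Charge the total via `pay` and empty the cart on success."""
        amount = self.total()
        pay(amount)  # may raise PaymentError; the cart is then left untouched
        self.items.clear()
        return amount


# Structural test: exercises one specific branch of the code (an error path).
def test_failed_payment_keeps_items():
    cart = Cart()
    cart.add("book", 12.5)

    def declined(amount):
        raise PaymentError("card declined")

    try:
        cart.checkout(declined)
    except PaymentError:
        pass
    else:
        raise AssertionError("expected PaymentError")
    assert len(cart.items) == 1


# Logic test: checks a business rule, regardless of how it is implemented.
def test_cart_is_empty_after_purchase():
    cart = Cart()
    cart.add("book", 12.5)
    cart.add("pen", 2.5)
    charged = []
    cart.checkout(charged.append)
    assert len(cart.items) == 0
    assert charged == [15.0]
```

The first test would have to change if the error handling were restructured. The second would survive any rewrite of `Cart`, as long as the rule “0 items in the cart after purchase” still holds.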

We have a tree before us: we can classify it according to the number of branches or the color of the trunk, but also according to its function: does it bear fruit or does it bear flowers? Are they correct?

What about backend testing, integration testing, or database testing? The same applies: if you are testing the structure of the database, the testing can be called structural, and if you are testing its logic, it is logic testing.
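As a rough illustration for the database case, here is a sketch using Python's built-in `sqlite3` module with an in-memory database. The `cart_items` table, its columns, and the “purchase deletes the rows” step are assumptions made up for this example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical cart_items table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cart_items ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " product TEXT NOT NULL,"
        " quantity INTEGER NOT NULL CHECK (quantity > 0))"
    )
    return conn


# Structural test: the schema has the columns we expect.
def test_structure_has_expected_columns():
    conn = make_db()
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(cart_items)")]
    assert columns == ["id", "user_id", "product", "quantity"]


# Structural test: the CHECK constraint rejects invalid rows.
def test_structure_rejects_zero_quantity():
    conn = make_db()
    try:
        conn.execute(
            "INSERT INTO cart_items (user_id, product, quantity) VALUES (1, 'book', 0)"
        )
    except sqlite3.IntegrityError:
        pass
    else:
        raise AssertionError("expected the CHECK constraint to fail")


# Logic test: after a purchase, the user's cart holds 0 items.
def test_logic_cart_is_empty_after_purchase():
    conn = make_db()
    conn.execute(
        "INSERT INTO cart_items (user_id, product, quantity) VALUES (1, 'book', 2)"
    )
    # Hypothetical purchase step: buying removes the user's cart rows.
    conn.execute("DELETE FROM cart_items WHERE user_id = ?", (1,))
    remaining = conn.execute(
        "SELECT COUNT(*) FROM cart_items WHERE user_id = ?", (1,)
    ).fetchone()[0]
    assert remaining == 0
```

The first two tests look at how the database is built; the third looks at what the product promises. The same tool and the same table serve both, and only the question being asked differs.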

This classification encourages you to think about using additional tools for code validation. Other than that, unit testing remains structural testing without any confusion or need for additional categories.


I hope you understand that my idea, which I tried to convey in this note, is not to get rid of the category, but to use more appropriate terms and definitions for it. If leaders like GitHub have started making major changes to remove outdated coding terms, why can’t we do the same, especially when we already have more suitable terms?

I hope that this idea will resonate with a lot of experts, and we will start to change our terminology and explanations, as many perceive them as the standard in the field of testing.

As Don Miguel Ruiz said in his book The Four Agreements: A Practical Guide to Personal Freedom, you must be “impeccable with your word.” Wouldn’t it be a good start to use the terms structural testing and logic testing instead of white-box and black-box testing?

The translation of the material was prepared as part of the basic software testing course.
