How to properly use the DRY principle in software development

Introduction

The Don't Repeat Yourself (DRY) principle, usually summarized as "avoid duplicating code", is often treated as a mandatory practice in programming. In real projects, however, you can often find conceptually different blocks merged into shared code only because they looked alike on the surface. This leads to code decay and to “crutches” (workarounds) that the code cannot function without. That is why blindly following the DRY principle is not always advisable. In this article I will talk about common mistakes in applying this rule and ways to avoid them.

DRY principle

DRY is a software development principle designed to minimize duplication of information in code. According to DRY, “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.” In practice, this means that repeated parts of code should be combined into common functions or modules.

The DRY approach is aimed at optimizing the development process: reducing the number of errors, simplifying the update process, and improving code readability. Developers use it to create applications with reusable components, making it easier to maintain and update the software.

When the DRY principle works as it should

DRY is often used to create generic functions or methods that perform a repetitive task in a system. Let's look at a simple example where DRY is applicable: finding the area of a square and a rectangle.

# Finding the area of a square
side_length = 5
area_square = side_length * side_length
print(f"Area of square: {area_square}")

# Finding the area of a rectangle
length = 5
width = 10
area_rectangle = length * width
print(f"Area of rectangle: {area_rectangle}")

Both blocks of code perform a similar operation—multiplying the lengths of the sides. In the case of a square, this is multiplying the length of the side by itself, and for a rectangle, multiplying the length by the width.

It's easy to apply DRY here by creating a common function that calculates the area of both a square and a rectangle. It is enough for the function to return the area of a square when only one parameter is given, and the area of a rectangle when two parameters (length and width) are given.

def calculate_area(length, width=None):
    # With a single argument, treat the shape as a square
    if width is None:
        width = length
    return length * width

# Using the function to calculate area
area_square = calculate_area(5)
print(f"Area of square: {area_square}")
area_rectangle = calculate_area(5, 10)
print(f"Area of rectangle: {area_rectangle}")

By combining two different cases of area calculations into one universal function, you can simplify the process of maintaining your code.

When the DRY principle leads to problems

However, applying the DRY principle does not always go as smoothly as in the previous example. Problems often arise in startups, where developers at an early stage hastily move blocks of identical code, or structures with the same set of fields, into a common module. As the project grows, the merged parts often start to evolve in different directions, which means the common module has to evolve in different directions too. The code becomes steadily more complex and, as a result, harder to maintain and test. Eventually the codebase reaches a point where it can no longer be maintained.

Let's imagine that we are building a startup. We model two entities: a “spaceship” and an “atom-sized nanorobot”. Both need a speed calculation function. Since our startup is at an early stage and does not require high accuracy, the formula for the speed of the spaceship and of the nanorobot is the same. It therefore seems logical to move the speed calculation function into a common module.
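A minimal sketch of what such an early shared module might look like. The names `average_speed`, `Spaceship`, and `Nanorobot` are hypothetical, and the classical distance-over-time formula is an assumption made only for illustration; the article does not specify the formula.

```python
from dataclasses import dataclass


def average_speed(distance_m: float, time_s: float) -> float:
    """Shared helper (hypothetical): classical speed, distance over time."""
    if time_s <= 0:
        raise ValueError("time_s must be positive")
    return distance_m / time_s


@dataclass
class Spaceship:
    distance_m: float
    time_s: float

    def speed(self) -> float:
        # Delegates to the shared helper
        return average_speed(self.distance_m, self.time_s)


@dataclass
class Nanorobot:
    distance_m: float
    time_s: float

    def speed(self) -> float:
        # Same helper today, although the underlying physics is unrelated
        return average_speed(self.distance_m, self.time_s)
```

At this point the merge looks harmless: both entities call one function, and nothing in the code hints that the two models describe different knowledge.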

However, as the project progresses, speed calculations will need to become more and more precise, and this is where the difficulties begin. For the spaceship, relativistic effects will have to be taken into account ever more accurately, using a formula for speeds comparable to the speed of light:

$$v = \frac{v_0}{\sqrt{1 - v_0^2 / c^2}}$$

where $v_0$ is the speed of the object in the inertial reference frame and $c$ is the speed of light.

But in the case of a nanorobot, this formula is useless, since its speed obeys the principles of quantum mechanics, is determined probabilistically and is calculated using fundamentally different formulas.

Since we have a common speed calculation module, we will have to combine relativistic and quantum calculations in it, iteration by iteration, and “crutches” will appear to reconcile two fundamentally different physical theories. The same “crutches” will also be needed wherever the module is called. As the project grows, their number will grow with it, making maintenance, updates, and debugging increasingly complex and resource-intensive. At some point the project will become so tangled that it is easier to throw it away and start from scratch than to fix poorly working code full of “crutches”. And relaunching a project is not cheap either!
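The sketch below illustrates, under loud assumptions, how such a shared function might degrade and what the separated alternative could look like. All names (`speed`, `spaceship_speed`, `nanorobot_expected_speed`) are hypothetical, and both physical models are placeholders: a Lorentz-factor scaling for the spaceship and a simple expectation over discrete velocities for the nanorobot, not real project physics.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def speed(kind, distance_m=None, time_s=None, *, velocities=None, probabilities=None):
    """Shared function after a few iterations: one name, two unrelated models."""
    if kind == "spaceship":
        v0 = distance_m / time_s
        if v0 >= C:
            raise ValueError("speed must stay below the speed of light")
        # Placeholder relativistic correction (assumption for illustration)
        return v0 / math.sqrt(1.0 - (v0 / C) ** 2)
    if kind == "nanorobot":
        # Placeholder probabilistic model: expected value over possible velocities
        return sum(p * v for p, v in zip(probabilities, velocities))
    raise ValueError(f"unknown kind: {kind!r}")


# The alternative: each domain keeps its own function and its own parameters.
def spaceship_speed(distance_m, time_s):
    v0 = distance_m / time_s
    if v0 >= C:
        raise ValueError("speed must stay below the speed of light")
    return v0 / math.sqrt(1.0 - (v0 / C) ** 2)


def nanorobot_expected_speed(velocities, probabilities):
    return sum(p * v for p, v in zip(probabilities, velocities))
```

In the shared version every caller has to pass a `kind` flag and know which of the optional arguments apply to it, which is exactly the kind of workaround at the call site described above. The separate functions can change independently without touching each other's callers.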

Accidental duplication and the difference between code and knowledge duplication

So we have seen that blind use of DRY can harm a project. That is why refactoring or merging code must be based on a deep understanding of the reasons for the duplication and its nature. Without that understanding, it is impossible to improve the software.

Code duplication can occur for various reasons. For example, a developer who does not know the codebase may unknowingly recreate a data model that already exists in the project. Duplication also arises when different teams independently build similar solutions for different parts of the same project.

However, duplicated code does not always need to be moved into common modules. Sometimes the duplication is accidental: modules that describe different knowledge (different subject areas or different processes) happen to look alike at some point in time. This does not mean that their code will always be similar. Different subject areas evolve in different directions, and soon their code will begin to diverge significantly. Merging such code into a common module can damage the codebase, especially if that module becomes one of the core elements of the application, and can degrade the quality of the entire codebase. In this case, the duplication should be left in place!

But what about the DRY principle? Doesn't it forbid duplication? In fact, the DRY principle does not recommend blindly merging identical blocks of code. It calls for every piece of knowledge to have a single, unambiguous, authoritative representation in the system. It is therefore important to distinguish between blocks of code that look similar but describe different knowledge, and blocks that genuinely duplicate the same knowledge. When applying the DRY principle, focus on eliminating duplication of meaning, not just of code.
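A small sketch of the distinction, using invented examples. The validators, the `VAT_RATE` constant, and its value are all hypothetical and serve only to contrast the two kinds of duplication.

```python
# Accidental duplication: the code is identical today, but the rules belong
# to different subject areas and may diverge, so the functions stay separate.
def is_valid_username(value: str) -> bool:
    return value.isalnum() and 3 <= len(value) <= 20


def is_valid_product_code(value: str) -> bool:
    return value.isalnum() and 3 <= len(value) <= 20


# Knowledge duplication: a single business rule used in several places,
# so it gets one authoritative definition.
VAT_RATE = 0.20  # hypothetical rate, assumed for illustration


def price_with_vat(net: float) -> float:
    return net * (1 + VAT_RATE)


def vat_amount(net: float) -> float:
    return net * VAT_RATE
```

If the rule for product codes later changes, for instance to allow hyphens, only `is_valid_product_code` changes. If the tax rate changes, there is exactly one place to update.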

What does the literature advise?

Now that we know what can go wrong during refactoring and why, I will describe several techniques that help avoid these mistakes.

Three Strikes And You Refactor

This rule of thumb says that if you find yourself making the same change to your code for the third time, or finding the same problem in three different places, then what is repeated is not just code in different places but a piece of knowledge. That means it is time to refactor and extract this knowledge explicitly into a new module that others can use. This approach helps improve the code structure continuously, eliminating duplication and preventing the accumulation of technical debt.
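As an illustration only, here is how the result of such a refactoring might look once the same rule has turned up in a third place. The loyalty discount rule (2% per year, capped at 10%) and all function names are hypothetical.

```python
def loyalty_discount(years_as_customer: int) -> float:
    """Single home for the hypothetical rule: 2% per year, capped at 10%."""
    return min(0.02 * years_as_customer, 0.10)


# Three call sites that previously each carried their own copy of the rule.
def invoice_total(amount: float, years_as_customer: int) -> float:
    return amount * (1 - loyalty_discount(years_as_customer))


def quote_total(amount: float, years_as_customer: int) -> float:
    return amount * (1 - loyalty_discount(years_as_customer))


def renewal_message(years_as_customer: int) -> str:
    percent = round(loyalty_discount(years_as_customer) * 100)
    return f"Renew now and keep your {percent}% loyalty discount"
```

Waiting for the third occurrence gives some evidence that the three places really share one rule, rather than merely resembling each other by accident.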

Domain Driven Design (DDD) and Event Storming

Domain Driven Design (DDD) is an approach to software development that focuses on creating software that reflects real-world business processes and problems.

One of the important DDD techniques is Event Storming, which is designed for collaboration between domain experts and software developers to identify important events, commands, and entities that play a key role in business processes.

Event Storming is typically held as a workshop in which participants use colored sticky notes to represent different elements of the domain on a long roll of paper or a wall. Each sticky-note color represents a specific type of element, such as an event, command, entity, or bounded context.

This technique helps project participants identify subject areas, determine accurately whether knowledge is duplicated between different modules, and decide whether to extract it into a common place.

Conclusion

The DRY principle is a useful development tool that is often misused, leading to problems in the development process. Therefore, to optimize the code base, it is extremely important to understand the reasons for duplication and its varieties. And modeling practices such as DDD and Event Storming will help you make informed decisions about the structure and organization of code for its long-term stability and scalability.
