Almost all systems developed involve the use of databases, often the database, its design and processing are key aspects of the system in terms of adding value to the business, security, performance, organizational policy and other factors that make this layer of our applications extremely important and worthy of special attention from side of us as developers.
I’ve always been of the opinion that it’s good for a developer to have some understanding of how databases work.
Given the vast number of different recommendations and tips for using databases, this simple list contained in this article represents only a part of what can be considered.
1. Using EXISTS
To retrieve records from a table that meet a condition based on references to another table in a query, it is recommended to use EXISTS instead of IN in the WHERE clause with a subquery. This will provide the best performance on most databases.
SELECT * FROM orders WHERE EXISTS(SELECT * FROM customers WHERE orders.customer_id = customers.customer_id AND customers.country = 'USA'); -- В этом примере мы выбираем все заказы, у которых есть связанный клиент из США.
2. Using boolean or integer flags
To create columns whose values determine the state of objects, it is recommended to use the boolean type, if your database supports it, instead of storing information in text fields. In tables with a large number of records, the use of such fields can greatly slow down the work. If your database does not support the boolean type, use NUMERIC(1) to store the values 1 or 0.
CREATE TABLE employees ( employe_id int, name varchar(255), is_active BIT ); -- В этом примере мы создаем таблицу "сотрудники" с полем "is_active", которое может быть типа boolean. Значения 1 или 0 будут означать, активен ли сотрудник или нет.
3. Conversions with UPPER, TO_CHAR, etc. in WHERE clauses
It is not recommended to convert the type and format of a column to filter data in a WHERE clause. This slows down the query and makes it impossible to use automatic indexes on those columns. It is best to store the data in the correct format, or in a format that is easy to represent in the application.
4. Don’t use HAVING to filter data
If you do not need to use aggregation operations, it is recommended to filter data in a grouping in a WHERE clause, and not in a HAVING clause, to improve query performance.
SELECT * FROM orders WHERE order_date > '2021-01-01' -- время выполнения запроса составило - 0,01 сек. SELECT customer_id, COUNT(order_id) as num_orders FROM orders GROUP BY customer_id HAVING COUNT(order_id) > 5 -- время выполнения запроса составило - 0,04 сек.
5. Be careful with the MAXVALUE setting for SEQUENCES
When creating an object of type SEQUENCE in an Oracle database, it is important to set the MAXVALUE property to an extremely high value so that the application does not subsequently stop due to reaching the maximum value. It is important to note that setting it to a high value will not result in unnecessary allocation of space, as this is only a configuration parameter of the SEQUENCE type object.
If you are not familiar with SEQUENCE or don’t know what it is, then it is a unique property of the Oracle database that does not have an auto-increment field to generate sequential numbers in tables. Previously, for older versions of Oracle, this additional object type had to be created until similar functionality (auto-increment) was implemented in version 12c.
6. The impact of the object-oriented approach on database design
Often, developers who are limited to application code do not have sufficient experience in database design, which causes them to tend to “think” in an object-oriented approach when creating a database. While using an ORM makes developers’ lives a lot easier, it’s important to note that relational databases are not object-oriented, even with similarities between tables and objects, columns and properties, and so on.
The database should be designed and built according to database best practices, not object-oriented best practices.
7. Advantages of using procedures and views (view)
If we do not use procedures and views (view), then every time we execute a SQL query, the DBMS analyzes the syntax of the query for correctness, checks for the existence of reference objects, and so on.
However, when our code is in a procedure or view, the database does not make these checks because they were already done when the procedures and views were created. As a result, the execution of SQL queries through the application is accelerated and performance is improved in critical systems.
Procedures can be used to run complex operations that require the execution of multiple queries.
Views can be used to make data easier to access, especially when you frequently select and display the same information. They can also provide an additional layer of security by restricting access to specific data fields.
8. The importance of choosing the right data types
Always make sure that the column types of the table in your system match the types of data being stored. For example, to store a date, you need to create a column of type DATE. If you need to store integers, then you need to create a column of type INTEGER, and so on. It seems obvious, but this flaw is quite common. Correctly setting data types will protect against the possibility of entering data with the wrong type and reduce the cost of future queries by avoiding data type conversion.
9. Select only the columns you need to select
This is the most commonly known advice: avoid using SELECT * FROM. We often use a large number of tables in JOIN queries. Specifying only those columns that are actually used is a good practice, almost mandatory for us developers. Another benefit that I also consider important is the ease of reading SQL when maintaining a database.
If your application uses queries on information that is not frequently updated, consider putting that data in a cache and relieving the database of having to deal with it. However, this option should always be analyzed taking into account the scenario of each project and its prerequisites.
11. Types of variables and parameters
Strive to use the exact same types in the variables and parameters of procedures and functions as in the columns of the table to avoid the need for useless conversions.
12. Data normalization
In order to efficiently store data in a database, you must adhere to certain guidelines and follow the principles of normalization. To do this, five normal forms are used to ensure data integrity and reduce data redundancy for applications and websites.
13. Language, local and culture setting
Ideally, the database used in the application is set to a language/culture that is compatible with business rules or the system context so that explicit transformations in database queries do not have to be performed, which slows down performance. These settings are related to aspects of globalization. In the case of Oracle, this is National Language Support.
14. Using “values in multiple lines”
If you need to sequentially add multiple records to the same table, it is preferable to use the following syntax as an example, which will increase performance in critical systems and save a few lines of code.
For example, let’s say we want to add multiple records to the Product Categories table with both Name and Description fields. You can do this using the multi-line value syntax, something like this:
INSERT INTO Categories (Name, Description) VALUES ('Electronics', 'Electronic devices and accessories'), ('Clothing', 'Clothes and apparel'), ('Home goods', 'Furniture and household items')
This will add three new records to the “Categories” table with the specified values for the “Name” and “Description” fields.
15. Request monitoring
Even after the system is in production or testing, you as a member of the development team should be actively involved in monitoring and analyzing database operations related to projects in which you are involved. This will identify potential improvements and problems in advance. Encouraging constant contact between teams is becoming an increasingly common practice and is an essential prerequisite for basic DevOps.
16. Don’t Postpone Foreign Keys
This advice is almost as obvious and trivial as the possibility of using SELECT * FROM. But often we see systems where tables are created without corresponding references in the database. Therefore, never delay creating the appropriate primary and foreign key references. Create them immediately when creating the table itself.
17. Optimal use of log and history tables
Sometimes our systems have log tables with millions of entries that are very little or never used. Explore the possibility of storing most of the history in other tables (archive), leaving only the most recent entries in the history and log tables. However, I would like to emphasize that each case should be analyzed separately, as business requirements often require that all history and logs be readily available in the original spreadsheets and not in an “archive”.
18. Add comments at will
When creating a table or column in a database, don’t skimp on commenting on its meaning, especially if the system is out of date. I think database comments are even more important than application comments. These comments greatly simplify interpretation and maintenance for developers who work with databases.
19. Tables without primary keys
Yes, unfortunately this happens quite often. If your table doesn’t have a primary key, it’s a good idea to rethink its modeling because, in theory, a table shouldn’t remain “isolated” in a relational model.
20. Dedicate time to database modeling
As mentioned earlier, databases are often the “soul” of a system. It is worth investing time in properly planning and modeling the database, reflecting the structure of each table, column, relationship, and many other aspects.
Investing in this stage has a high return on investment.
Knowing and adhering to the database code of practice is an important step towards increasing professional value and advancing a career in the database field. Compliance with these rules guarantees more efficient storage, processing and management of data, which in turn increases the efficiency of the company as a whole. Thanks to well-organized and optimized databases, the company can quickly and accurately analyze information, make decisions and ensure customer satisfaction at a high level.