Two approaches to historical data modeling

Translation of an article from Medium

Winston Churchill once said:

“The further you can see into the past, the further you can see into the future.”

He clearly emphasized the importance of studying and understanding history in order to make better decisions in the present and choose the right path for the future.

Historical data is needed in different cases. For example:

Comply with regulatory requirements;
For data analysis and reporting;
Recover from accidental data loss or data corruption.

In this article, the author will talk about two approaches to modeling historical data:

Usage temporary tables
Usage historical tables

Temporary tables

The past and future are real, but the present is a fleeting moment. Learning all three is essential to understanding the world.

The temporary table approach works by adding two date columns to each database table. “Validity start date” and “Validity end date”.

Consider a hospital that keeps records of patient records over time. The database may contain a Medical History table with attributes such as patient ID, admission date, diagnosis, and two time columns: “Effective From Date” and “Effective To Date.” . These columns will define the date range of the patient's records. Each time a patient's medical record is updated, a new record is added to the Medical Records table with the updated information and the current date in the Effective Date column.

The Expiration Date column is left blank for the new entry, indicating that it is the current entry.

Let's assume that the patient's medical record is updated again. In this case, another new record is added to the table with updated information, the current date in the “Effective From Date” column, and the “Effective To Date” column of the previous record is updated with the current date .

The main disadvantage of using temporary tables is their complexity in history storage, requiring deep knowledge of SQL to retrieve accurate information. In this approach, the developer needs to set special conditions to retrieve the actual ID and link it to other tables.

Historical tables

Another approach is to use historical tables, where only the latest record in the transactional table is stored and all old records are moved to a separate historical table. In this case, the database will contain two tables: “MedicalRecords” and “MedicalRecordsHistory”.

Initially, MedicalRecords will look like this:

The MedicalRecordsHistory table will be empty. Once a patient's medical record is updated, the old row will be added to “MedicalRecordsHistory” and the new entry will be saved to “MedicalRecords”.

Main advantage Using historical tables: Query performance for evaluating historical records is much better because it is separate from the main table. This also provides a clearer change log.

However, there is also negative side: using historical tables requires more storage space and therefore more maintenance costs.