Sizing a multi-layer data warehouse (DWH). Part 2: How to size

This post continues the analysis of how to size a multi-layer data warehouse (DWH).
Part 1: “Part 1: What is sizing”

STEP 0. Before sizing begins, IT specialists analyze the current systems (databases) and estimate the initial sizing criteria: annual data volume, growth, and volume of historical data.

STEP 1. Total volume of historical (initial) data

At this step, the volume of historical data is specified.

STEP 2. Volume of raw data loaded per working day
Important! The size of the data warehouse depends critically on the volume of the source data.

Estimated daily data growth (based on 247 working days per year)
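The per-working-day figure can be derived from an annual volume, as in the minimal sketch below. Only the 247 working days come from the text; the function name and the annual volume of 19,760 GB are illustrative assumptions, chosen so that the result equals the 80 GB per day used in the example in Step 3.

```python
WORKING_DAYS_PER_YEAR = 247  # from the text


def daily_load_gb(annual_volume_gb, working_days=WORKING_DAYS_PER_YEAR):
    """Average raw data load per working day, in GB."""
    if working_days <= 0:
        raise ValueError("working_days must be positive")
    return annual_volume_gb / working_days


# Hypothetical annual volume of 19,760 GB
print(daily_load_gb(19_760))  # 80.0
```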

STEP 3. Data warehouse attributes

Attributes that define the characteristics of the data warehouse

This step is the most difficult and requires a separate explanation.
Assumptions:

Column “Layer” – a switch (1 or 0) that specifies whether the layer is included in the calculation of the total data warehouse volume.
Column “Storing transformation history and data” – a switch (1 or 0) that determines whether the entire history of data flows between algorithm steps and storage layers must be stored in materialized form.
Column “Compression ratio” – a percentage, set in the assumptions block, by which the data volume shrinks when data moves from one layer of the data warehouse to the next. The percentage is applied to the result of the previous step, taking into account the percentage specified for the layer “8. Layer for storing data transformation history”.

From the example given, it follows that with an initial data inflow of 80 GB per day, using all eight layers of the data warehouse and including 10% for the logs of transformations and data flows, 242 GB of space will be required at the level of materialized storage of results.
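The layer-by-layer calculation can be sketched roughly as follows. This is only one possible reading of the assumptions, not the author's actual formulas: each enabled layer is assumed to hold the previous layer's volume reduced by its compression percentage, and the transformation history is assumed to add a flat percentage on top of the sum of all layers. The layer names and percentages are hypothetical, so the result is not expected to match the 242 GB from the example.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    enabled: bool         # the "Layer" switch: 1 or 0
    reduction_pct: float  # "Compression ratio": % reduction vs. the previous layer


def size_layers(daily_input_gb, layers, history_pct, store_history=True):
    """Return (per-layer volumes, total volume) in GB for one day of input."""
    volumes = {}
    current = daily_input_gb
    for layer in layers:
        if not layer.enabled:
            # Assumption: a disabled layer stores nothing and passes
            # the incoming volume on to the next layer unchanged.
            continue
        current = current * (1 - layer.reduction_pct / 100)
        volumes[layer.name] = current
    data_total = sum(volumes.values())
    # Assumption: the history layer is a flat percentage of the stored data.
    history = data_total * history_pct / 100 if store_history else 0.0
    return volumes, data_total + history


# Hypothetical three-layer configuration
layers = [
    Layer("staging", True, 0),
    Layer("core", True, 50),
    Layer("marts", True, 50),
]
volumes, total = size_layers(80, layers, history_pct=10)
print(volumes)  # {'staging': 80, 'core': 40.0, 'marts': 20.0}
print(total)    # 154.0
```

Switching a layer off or changing a percentage only requires editing the list, which mirrors toggling the 1/0 columns in the table.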

STEP 4. Data warehouse coefficients
The data warehouse growth forecast is based on several additional coefficients, presented below.

Characteristics of the data warehouse growth forecast

Number of months of detailed data – the number of months for which detailed information is stored for all layers; older periods are stored only in the data part of the layer “Reporting data mart layer”.
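The effect of this coefficient can be illustrated with a small sketch. It assumes that the most recent months keep the volume of all layers, while older months keep only the reporting layer's volume. The function name, parameter names, and all figures are hypothetical.

```python
def retained_volume_gb(horizon_months, detail_months,
                       all_layers_gb_per_month, reporting_gb_per_month):
    """Volume kept after `horizon_months` when only the most recent
    `detail_months` are stored in full detail for all layers."""
    if min(horizon_months, detail_months) < 0:
        raise ValueError("month counts must be non-negative")
    detailed = min(horizon_months, detail_months)
    summarized = horizon_months - detailed  # months kept as reporting data only
    return (detailed * all_layers_gb_per_month
            + summarized * reporting_gb_per_month)


# Hypothetical monthly volumes: 1,000 GB for all layers, 100 GB for reporting
full = retained_volume_gb(36, 36, 1000, 100)
trimmed = retained_volume_gb(36, 6, 1000, 100)
print(full, trimmed)  # 36000 9000
```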

STEP 5. Total (excluding resources for the backup system)

Calculation of the final data warehouse growth values
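A minimal sketch of how a full-retention total could be put together, assuming it is simply the historical volume plus the daily stored volume accumulated over the working days of the forecast period. The 242 GB per day and the 247 working days come from the text; the historical volume of 20,000 GB, the function name, and the decimal conversion (1 TB = 1,000 GB) are assumptions. Because the coefficients from Step 4 and the actual historical volume are not given, this is not expected to reproduce the article's final figure.

```python
WORKING_DAYS_PER_YEAR = 247  # from Step 2


def full_retention_gb(historical_gb, daily_stored_gb, years,
                      working_days=WORKING_DAYS_PER_YEAR):
    """Total stored volume in GB if nothing is ever purged."""
    return historical_gb + daily_stored_gb * working_days * years


# 242 GB per day is taken from Step 3; 20,000 GB of history is hypothetical
total_gb = full_retention_gb(historical_gb=20_000, daily_stored_gb=242, years=3)
print(total_gb)                     # 199322
print(f"{total_gb / 1000:.1f} TB")  # 199.3 TB
```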

In total, the option of storing all data for 3 years without purging unused information from the data warehouse:

315 TB. However, there is an alternative.

Option of storing only the specified number of months in detail (metric 14), with older history kept only as reporting layer data:

Cutting off the change history and the high level of detail reduces the data warehouse volume almost sixfold.

SUMMARY:
The material presented is not dogma; it is an approach.
Clearly, a data warehouse can be very complex. Moreover, some layers may simply be absent from certain data warehouse blocks, in which case a single common measure cannot be applied. BUT!
Nothing prevents you from sizing different data warehouse blocks separately: create your own set of compression percentages for each block, switch layers on or off, change the retention parameters for historical data, or adjust the level of detail and history needed to trace how aggregated report data was derived, down to the level of primary documents.

P.S. The material does not present all the formulas used in the calculation; I believe you can easily reproduce them from the logical meaning of the operations performed.
