Data processing for beginners

FineBI communityconduct training, collect digests and annually organize large-scale conferences.

My name is Alexander, I am the head of the support and training center for BI solutions at GlowByte, and today I want to share with you a small guide on data processing in FineBI. This material will be useful for beginning BI developers.

This article consists of four lessons that will help you become more familiar with the data editing features in FineBI. If after reading you want to understand all the features of the platform, consolidate basic knowledge and improve your skills in creating complex visualizations, I invite you to our courses.

Lesson 1: Processing and Simplifying Data Structures

1.1. Data Structure Processing

Before you begin to analyze data, it is often necessary to carry out certain transformations in the data structure. These transformations involve changing the order of rows and columns to process the data and generate the necessary fields for subsequent analysis.

FineBI has convenient tools for this, available in the data editor interface, which will help you achieve quick results.

Data processing functions

Data processing functions

Let’s take a closer look at the example of the “Split Rows and Columns” and “Convert Rows and Columns” functions and the data that contains Chinese students’ exam results. To do this, let’s launch FineBI and create a new “Analysis Subject”.

Initial data structure:

As you can see, this is raw data that cannot be analyzed yet

As you can see, this is raw data that cannot be analyzed yet

Using the “Split Field” and “Row to column” functions, we will divide the “Exam Results” column into columns with grades for the corresponding exams.

Step 1: Break down the “Results” column by words:

Result of 1st processing step

Result of 1st processing step

Step 2: We split the resulting column into 2 characters:

Result of 2nd processing step

Result of 2nd processing step

Data after processing:

The final result: after dividing the data into different columns and rows, the structure became clearer and simpler

The final result: after dividing the data into different columns and rows, the structure became clearer and simpler

1.2 Removing duplicates

When analyzing data, quality is always the most pressing issue, and duplicate rows are the most common problem. When deleting these lines, we may encounter two scenarios for the development of the calculation logic.

In the first case, it does not matter at all which of the rows we save, since it will not affect the analysis in any way. For example, in the row “A, A, A” we can leave any A. In this case, to optimize the process, we can use the “Delete Duplicate Row” function.

In the second case, you need to leave a specific line. For example, in a situation where students retook an exam and there will be two sets of data for each student, and during analysis we will need to leave the one containing the latest records. In a situation where it is A that needs to be left out of “A, B, C,” we must use the “Delete Duplicate Row” function so as to keep the first row. To do this, we first sort the data with the “Sort” function, and then delete the rows with the “Delete Duplicate Row” function.

Select the “remove repeat” field to quickly remove duplicates

Select the “remove repeat” field to quickly remove duplicates

To check for duplicates in the data, you can also use the filtering functionality in the headers.

Header drop-down menu analyzes existing fields and quickly detects duplicates

Header drop-down menu analyzes existing fields and quickly detects duplicates

1.3. Working with null value

One of the most problematic aspects when analyzing data is the presence of null values ​​in the table.

When we have large amounts of data and “null” is rare, it may not have much impact on our sum or average calculations. In such cases, we can simply ignore these values.

But if, during data processing, some “null” values ​​need to be excluded as unwanted data entirely, we can use a convenient filter in the table header to quickly exclude null values.

Filtering by title quickly found an empty field and then prompted you to choose whether to filter it or not

Filtering by title quickly found an empty field and then prompted you to choose whether to filter it or not

The above shows a simple processing scenario, but in real life you may have to encounter situations where the value “null” is important to preserve the calculation logic.

Let’s consider this situation using our dataset with data on student performance as an example. Probably one of the students missed class due to illness, so the system does not have a record of his grade on one of the exams – we cannot ignore him, so we cannot simply delete his data from the database.

We should be very careful about such situations in order to selectively set filters in further calculations. To do this, in FineBI you can use the “Formula Column” function (add a column with a calculated value using a formula) or a more convenient one: “Condition Tag Column” (add a column with a value based on a condition).

We mark students without grades as having missed the exam.

We mark students without grades as having missed the exam.

Lesson 2: Joining and Analyzing Multiple Tables

Analysts often encounter situations where data is presented in multiple tables. Before starting the analysis, the question arises: how to join these tables? For those just starting to learn business analytics, let’s look at two examples of using the join function: “Union All” and “Join”.

Let’s imagine that we have several identical tables containing exam results in different study groups. If we want to join these tables while preserving all the data, we use the “Union All” operation. It’s as if we simply glued these two tables together vertically, adding rows from the second table after the rows from the first table.

The table has more rows, but the number of columns remains the same

The table has more rows, but the number of columns remains the same

Now let’s imagine that one of the tables has more columns, for example, a column with an overall performance assessment has been added. Before we look at the “Join” function in detail, let’s study another similar function “Column from Other Tables” (add a column from another table).

You are probably already familiar with “Vlookup” and “Sumif” from Excel. As you might have guessed, this function allows you to aggregate (Sumif) or look up relevant dimensions from other tables and join them with the current table (Vlookup).

After we have summed up the points for the given parameters, we have a new column

After we have summed up the points for the given parameters, we have a new column

SQL lovers are probably very familiar with the left join and right join functions, so when using the “join left and right” function when editing data in BI, you can safely be guided by SQL logic.

For those who are used to working with SQL and are familiar with operators like left join, right join, etc., the “Join” function in the FineBI Data Editor may be more intuitive. This function behaves similarly to SQL statements and can be easily learned by those already familiar with basic SQL principles.

Lesson 3: Creating Calculations and Analytics

Once you’ve prepared your data structure and joined several tables, it’s important to stop and think about the problem we’re exploring. Check to see if your tables have the necessary metrics to solve the problem, or if you need to add new calculations or analytical metrics.

For example, when analyzing sales, you often have to independently calculate profit, growth rates and other indicators. Therefore, before conducting analysis, we can insert calculation formulas directly into the data table. In FineBI, this can be done using the “Formula Column” function that you are already familiar with. It works just like a formula in Excel, you just need to enter the appropriate formula to create the desired field.

Formula editor interface

Formula editor interface

Then using the “Summary Column” function we can easily calculate the sum.

Select the appropriate group and calculation method to perform calculations for the specified indicators.

Select the appropriate group and calculation method to perform calculations for the specified indicators.

The Condition Tag Column feature helps solve the most problematic issues for analysts related to the IF condition. So, we do not need to write 7-8 levels of IF conditions – we only need to configure some conditions manually to assign different values ​​to the data.

By adding a condition, we can filter the data and assign appropriate values ​​to it

By adding a condition, we can filter the data and assign appropriate values ​​to it

Lesson 4: Verification of processed data

For beginning BI developers, the main difficulties lie not only in understanding the computational logic of a large number of functions of a BI solution, but also in verifying the processed data. Surely you often have similar thoughts: “I did everything right, but did it turn out right?”

To solve this problem, FineBI provides a large number of useful functions.

4.1. Verification by title

To quickly check the sum and average value, just select the required column and pay attention to the information in the lower left corner. In this way, you can check the data obtained and determine its reliability.

When checking the data, we received an average value of 85.92, which corresponds to our data

When checking the data, we received an average value of 85.92, which corresponds to our data

4.2. Adding and removing data processing steps

FineBI allows you to seamlessly add or remove some data processing steps. Thanks to this feature, we can filter out important data and then remove some processing steps that are questionable. Thus, we can use trial and error to see how the data will change. It’s similar to how we used to check many times in school to see if an equation was solved correctly.

Let's filter some of the data by title to conduct a “selective check”

Let’s filter some of the data by title to conduct a “selective check”

Let's use the prompts in the action area to quickly check the data

Let’s use the prompts in the action area to quickly check the data

Our introductory lesson on data editing has come to an end.

The training dataset can be downloaded link.

You can safely ask your questions in the largest FineBI community in Russia: https://t.me/FineBIChat.

For all questions about the product, license prices, migration, write to: bi@glowbyteconsulting.com.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *