Analytics engineer or data engineer: which specialist should you choose?

Madison is an analytics engineer with a passion for data, entrepreneurship, writing, and education. Her goal is to teach in a way that is useful to everyone, whether they are just starting a career or have been in engineering for 20+ years.

So you've realized that you could use some more help working with data. But you don't know exactly who to look for, because there are data analysts, data engineers, and now… analytics engineers?!

At first glance, you might think that data engineers and analytics engineers are the same thing. They sound similar, right? But in reality, the two roles have different responsibilities. Choosing the right professional for your organization therefore depends on your needs, where the role sits in relation to the business, and the skills required for the job.

TL;DR: Analytics engineers vs. data engineers

Essentially, analytics engineers are closer to the business and focus on the data itself, while data engineers are closer to engineering and focus on the processes and infrastructure needed to deliver data correctly.

To put this into perspective, let's say you work in the product development department and are responsible for launching a new product. In this case, you will work with a data engineer to ensure that the correct information is displayed on the product page on the company website. The data engineer will also be responsible for ensuring that traffic to this page is properly tracked.

Once all of this is set up, you and your analytics engineer will work to ensure that your data warehouse contains the data and metrics you need to track the success of your product. If the data is not usable in its current form, it also falls within the analytics engineer's area of expertise to create a dataset that gives you the metrics you need.

Clear so far? Now let's take a closer look at the organizational differences between analytics engineers and data engineers, as well as their direct responsibilities.

Where do they sit in relation to the business?

Analytics engineers are typically found in the analytics or data science department, which falls somewhere between the business department and the engineering team. They act as a liaison between the two departments, as they work with both technical and business concepts. This typically involves communicating closely with stakeholders to understand their needs and then building data models around them. For this reason, analytics engineers must also be familiar with transactional database models, including how that data is ingested into the warehouse.

The distinguishing feature of analytics engineers is business context. Because the data models they create are built with business interests in mind, it is important for them not only to know the various metrics, but also to understand how those metrics will be used.

In contrast, data engineers are usually part of an engineering team. They rarely interact with the business side and instead communicate with the analytics engineers. Their tasks are usually assigned by a facilitator, such as a Scrum Master or project manager, who decides what is most important to the business from an engineering perspective. At their core, data engineers are responsible for getting data into a transactional database and for various integrations.

How do their skills differ?

It's important to note that skills often vary depending on your organization and its size. Below are the general skill differences between the two specialties.

Analytics engineers

The role of an analytics engineer blurs the line between technology and business. Although responsibilities may vary greatly from company to company, every analytics engineer should have at least the following skills.

Data Modeling

Analytics engineers have a deep understanding of data modeling and transformation. This means they know how to piece together complex logic to create automated, reusable data sets that serve as the basis for your dashboards and reports.
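
As a rough illustration of what a reusable data set can look like, here is a minimal sketch that uses Python's built-in sqlite3 module in place of a real warehouse. The table and view names (raw_orders, stg_orders, daily_revenue) and the sample rows are hypothetical and chosen only for this example.

```python
import sqlite3

# In-memory SQLite stands in for a real warehouse (an assumption for illustration).
conn = sqlite3.connect(":memory:")

# Hypothetical raw table, roughly as it might arrive from an ingestion tool.
conn.execute(
    "CREATE TABLE raw_orders (id INTEGER, ordered_at TEXT, amount_cents INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
    [
        (1, "2024-01-01", 1000, "completed"),
        (2, "2024-01-01", 2500, "completed"),
        (3, "2024-01-02", 4000, "cancelled"),
        (4, "2024-01-02", 1500, "completed"),
    ],
)

# Staging model: rename columns, convert units, drop rows that should not count.
conn.execute("""
    CREATE VIEW stg_orders AS
    SELECT id, ordered_at AS order_date, amount_cents / 100.0 AS amount
    FROM raw_orders
    WHERE status = 'completed'
""")

# Downstream model built on top of the staging model, ready for a dashboard.
conn.execute("""
    CREATE VIEW daily_revenue AS
    SELECT order_date, SUM(amount) AS revenue, COUNT(*) AS orders
    FROM stg_orders
    GROUP BY order_date
""")

daily_revenue = conn.execute(
    "SELECT order_date, revenue, orders FROM daily_revenue ORDER BY order_date"
).fetchall()
print(daily_revenue)
```

Because the cleaning logic lives in the staging view, any report built on daily_revenue picks up the same definition of a completed order without repeating it.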

SQL

While knowledge of SQL is important for every data professional, analytics engineers speak SQL as their first language: it is the primary language used for data modeling and sits at the core of many popular tools. SQL is what allows them to query the databases in your warehouse and calculate KPIs.

Data warehouse

Your data warehouse (the place where all your data is stored) is the analytics engineer's domain. They configure the warehouse architecture so that it is properly optimized for use by analysts and your data visualization platform.

An important part of managing a warehouse is also a clear understanding of the roles and permissions that keep data secure. Popular data warehouses such as Snowflake, Databricks, and BigQuery serve the same purpose, but each has its own unique features. Still, if you understand one of them, you can usually pick up the others without much trouble.

Modern Data Stack Tools

The modern data stack can be broken down into several different parts, but two of them are fundamental: ingestion and orchestration. Analytics engineers understand how to use these tools to move and manipulate data within the stack.

To manage data ingestion, they need to be familiar with several of the most popular ingestion tools, such as Fivetran, Stitch, and Airbyte. Fivetran and Stitch are easier to deploy and have a fairly simple setup, while Airbyte is an open source tool that requires more technical knowledge.

Analytics engineers should also be familiar with orchestration tools (or tools that help deploy your data models to production). However, depending on your team, data engineers may also have this skill.

Interestingly, Airflow is more popular among data engineers, while analytics engineers tend to prefer other tools such as Prefect and Dagster. In reality, though, these preferences mostly come down to experience and coding skills.

dbt

dbt is a tool used for data transformation. It builds on an analytics engineer's knowledge of data modeling and SQL, and offers unique capabilities that simplify modeling.

It helps the analytics engineer create modular, efficient, and easy-to-read data models by reducing the amount of repetitive code through advanced features like macros and through the many packages that can be easily installed.

Fun fact: dbt actually coined the analytics engineer role!
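
To make the idea of cutting repetitive code more concrete, here is a minimal sketch of the concept behind a macro. dbt itself writes macros as Jinja templates inside SQL files; the plain Python function below only mimics that idea and is not dbt syntax. The helper name cents_to_dollars, the payments table, and its columns are hypothetical.

```python
import sqlite3


def cents_to_dollars(column: str) -> str:
    """Return a SQL snippet that converts a cents column to dollars."""
    return f"{column} / 100.0"


# The conversion logic is defined once and reused for every money column.
query = (
    "SELECT id, "
    f"{cents_to_dollars('amount_cents')} AS amount, "
    f"{cents_to_dollars('fee_cents')} AS fee "
    "FROM payments"
)

# In-memory SQLite stands in for a warehouse (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER, amount_cents INTEGER, fee_cents INTEGER)")
conn.execute("INSERT INTO payments VALUES (1, 1999, 59)")

payments = conn.execute(query).fetchall()
print(query)
print(payments)
```

If the conversion rule ever changes, only the helper needs to be edited, which is the main benefit a macro is meant to provide.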

Analytical dashboards

While some may argue that creating information dashboards is the primary responsibility of data analysts, analytics engineers are also adept at creating visualizations using popular tools like Tableau, ThoughtSpot, and Looker. After all, it is their data models that serve as the basis for visualization. Therefore, it is important for analytics engineers to understand how to take these data sets and then use them to display the data exactly as stakeholders need it.

Data engineers

Like analytics engineers, data engineers have responsibilities that typically vary depending on the type of company they work for and the specific industry, but the roles can generally be divided into three main categories: generalists, pipeline specialists, and database specialists. However, no matter which category they fall into, every data engineer should have the following skills.

Python

If you work as a data engineer, you need to know Python. This language is commonly used in orchestration tools such as Airflow, Dagster, and Prefect, but is also widely used for API development, interactive testing, and scripting. Luckily, it is one of the easiest languages to understand and learn, which is part of what makes it so popular in the data world.

DevOps

Data engineering is about developing applications and ensuring they are deployed correctly in production. Depending on the responsibilities of your team and its size, a data engineer may also be responsible for making the necessary changes to the code.

If not, they should at least be familiar with cloud services such as AWS, Google Cloud, and Azure. Knowledge of at least one of these platforms is needed to host almost any service that supports your applications. It is also worth noting that knowledge of Kubernetes, an open source container orchestration system, is typical for data engineers who do a lot of DevOps.

Bash

Bash is a command-line shell and scripting language that makes directory navigation and file editing easy. It is often used in DevOps deployment scripts, allowing you to automate time-consuming tasks.

Git

Git is a version control system that keeps track of code changes so engineers can easily save them and collaborate on code across their teams. Using it is also a best practice in case bad code is sent to production and the deployment has to be rolled back.

Orchestration Tools

As already mentioned, orchestration tools such as Airflow, Dagster, and Prefect are also important for data engineers. Depending on the composition of your team and the qualifications of each member, orchestration may be the responsibility of either the data engineer or the analytics engineer. Honestly, it really depends on your organization. However, since data engineers are often proficient in Python, the language that powers these platforms, this task will most likely fall to them.
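
At its core, orchestration means running tasks in an order that respects their dependencies. The sketch below shows only that core idea with the standard library's graphlib module (Python 3.9+); real orchestrators such as Airflow, Dagster, and Prefect add scheduling, retries, and monitoring on top. The task names and the run_pipeline helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract_orders": set(),
    "extract_customers": set(),
    "build_staging": {"extract_orders", "extract_customers"},
    "build_marts": {"build_staging"},
    "refresh_dashboard": {"build_marts"},
}


def run_pipeline(dependencies, tasks):
    """Run each task only after all of its upstream tasks have finished."""
    executed = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        executed.append(name)
    return executed


# Placeholder task bodies; a real pipeline would call extraction or SQL jobs here.
tasks = {name: (lambda name=name: print(f"running {name}")) for name in dependencies}

run_order = run_pipeline(dependencies, tasks)
```

The two extract tasks have no dependency on each other, so their relative order is not guaranteed; everything downstream always runs after its inputs.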

So who should you hire?

Now comes the fun part. Which specialist is right for your team? Overall, here's a helpful visual diagram to help you understand the difference.

[Diagram: differences between analytics engineers and data engineers]

While the lines between these specializations may be blurred, you can consider these differences when you think about what you're trying to achieve and the specific pain points you hope to eliminate.

Is your data a mess and difficult to use? → Analytics engineer

Let's say you're faced with data quality issues, from bad and missing data to no data at all, that make it difficult to use your data. In this case, you need to hire an analytics engineer.

Analytics engineers are responsible for the data pipeline from ingestion to visualization, so they will be the ones who will closely monitor the data to ensure it meets company standards. In other words, if data is missing or incorrect, they will be the first to know. An analytics engineer can run tests using tools like dbt and re_data, set up alerts, and take proactive measures to ensure the data always looks the way it should.
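
As a minimal sketch of what such tests check, the plain Python below loosely mirrors the kind of not-null and uniqueness checks that tools like dbt provide out of the box. The function names, the sample rows, and the column names are hypothetical, and a real setup would run these checks against warehouse tables rather than in-memory lists.

```python
def check_not_null(rows, column):
    """Return the rows where `column` is missing."""
    return [row for row in rows if row.get(column) is None]


def check_unique(rows, column):
    """Return the values of `column` that appear more than once."""
    seen, duplicates = set(), []
    for row in rows:
        value = row.get(column)
        if value in seen and value not in duplicates:
            duplicates.append(value)
        seen.add(value)
    return duplicates


# Hypothetical sample data with one missing customer and one duplicated order id.
orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": None},
    {"order_id": 2, "customer_id": 11},
]

failures = {
    "customer_id_not_null": check_not_null(orders, "customer_id"),
    "order_id_unique": check_unique(orders, "order_id"),
}

for test_name, problems in failures.items():
    status = "FAIL" if problems else "PASS"
    print(f"{test_name}: {status} {problems}")
```

In practice, a failing check like these would trigger an alert so that the problem is caught before it reaches a dashboard.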

Problems with data collection on your site? → Data engineer

Data engineers usually work on internal website processes that help collect all the important customer data. If you are not collecting this data or are having trouble collecting it, you will need to hire a data engineer.

Analytics engineers typically work with data after it has already been collected and are in the business of moving it from point A to point B. Data engineers, on the other hand, can help develop systems and processes to ensure that this data gets sent to a place where it can be used by analysts.

Is your data scattered across several different platforms? → Analytics engineer

If you're having trouble creating a single source of truth, you might want to hire an analytics engineer. Analytics engineers are responsible for the data warehouse, which acts as a single source of truth for all company data. They bring data into this warehouse from various sources, clean the raw data, and then form the underlying data models. Among other things, they will help you properly document and consolidate data sources so that metrics and KPIs are consistent across all areas of the business.

Do you need a custom data pipeline? → Data engineer

While this may be a controversial point of view, if you want to build your own data pipeline, it's better to hire a data engineer because they usually have more experience with tools like Airflow, which require deep knowledge of Python, DAGs (directed acyclic graphs), and cloud infrastructure.

Building a custom data pipeline can be quite technical for someone with little data science experience. While some analytics engineers can certainly take on this work, they are usually more comfortable using easier-to-maintain data pipeline tools such as Prefect and Dagster.

Choosing the right engineer

Understanding the pain points you're trying to solve will help you hire the right data expert for your organization—and make your hiring process more efficient.

Remember: analytics engineers are focused on the data itself, paying attention to issues such as quality, freshness, and timely delivery. They manage data at all stages of its processing. Data engineers, in turn, work on the data infrastructure of the website and the company's systems, as well as the tools that support data processing pipelines.

While this article provides an overview of the differences between these two roles, keep in mind that they vary from company to company. So discuss this with your data and technology teams to better understand what gaps exist and who is best suited to fill them. Better yet, send them this article and ask them to identify the scenarios and specific skills they find most relevant. Happy hiring!

