general concepts. SA for the little ones

Lyosik | Lead Systems Analyst (SA Lead)

Welcome to the block of articles for beginner systems/business analysts. Here we are preparing to receive the coveted offer together

Perhaps, let's start with the most basic and primitive definition.

Database (DB) is a set of data stored in a structured form.

The second key concept is the DBMS.

Database management system (DBMS) – these are systems (or programs) that allow you to create databases and manipulate information from them.

The diagram below represents a simplified process for interacting with DB.

Explanation of the diagram:

When we, as users, want to retrieve data stored in a database (DB), we form a query in SQL language. The database management system (DBMS) accepts this request and accesses the database to find the required information. Once the data is found, the DBMS returns it to us and we can view it.

Database types

  1. Relational;

  2. Non-relational (NoSQL);

  3. Object-oriented;

  4. Hierarchical;

  5. Network;

  6. NewSQL.

We will mainly work with relational databases, which organize data in the form of tables. However, there are other types of databases that are also worth knowing about.

Second in priority are non-relational (NoSQL) databases.

Let's try to compare them according to the main criteria.

Features of a Relational Database

  • Rigid structure (data is recorded in separate cells and thus structured);

  • Vertical scaling;

  • SQL queries;

  • Simple and convenient;

  • Popular.

Features of a Non-Relational Database

  • No structure requirements;

  • Vertical and horizontal scaling;

  • Various query algorithms;

  • High protection;

  • High entry threshold.

Most likely you have a question regarding scaling. To put it in a nutshell, without going into too much detail:

Vertical scaling – this is essentially an increase in the power of one specific server where the database is deployed. In the case of horizontal scaling, we try to divide our database into several servers, connecting balancers to distribute the load evenly.

A short excursion into a comparison of the two types of databases was carried out. Let me remind you further (within the scope of this article) we will talk specifically about relational databases.

Relational data model

Relational model is a model that is focused on organizing data in the form of two-dimensional tables

Example of a relational model

Example of a relational model

Relational table properties

  • Each table element is one data element (that is, only one value is stored in each table cell);

  • All elements in a column have the same type;

  • Each column has a unique name;

  • Each entry has its own identifier (key);

  • The order of rows and columns is determined by the data sorting rules;

  • The relationships are presented in the form of tables, where the relationship and additional parameters are indicated.

Here we touched on such a concept as key. Understanding it is important, so let’s take a closer look.

Keys

Key is a column or set of columns (composite key). The key identifies each row in the table. With it we can access a specific row of data in a table

Types of keys

Try to answer yourself What is the key difference between these two types of keys?

Clue

There is already an analysis of this issue in my tg channel, if anyone is interested – welcome 🙂

So, let's dive into the definitions.

Primary key

Primary key is a unique identifier for a record in a database table.

There can be two types:

Consists of a single field that uniquely identifies a record in a table.

For example, the “Employee number” field in the “Employees” table.

Example of a simple key

Example of a simple key

Consists of two or more fields that together uniquely identify a record.

For example, in the Departments table, a composite key could be a combination of the Department Code and Department Number fields, which together determine the uniqueness of the department record.

Example of a composite key

Example of a composite key

Foreign key

Foreign key is an attribute or set of attributes that references the primary key or unique key of another table. In other words, it is something like a pointer to a row in another table.

Example of a foreign key

Example of a foreign key

So, we have more or less figured out the keys in the database. Let's move on to the next part of our topic.

Relationships between tables

The basic information is shown in the diagram below, so read and remember carefully:

Relationships between tables

Relationships between tables

Important!

In the case of constructing a database, the connection many – to – many implemented through an intermediate table that contains the relationships between them.

When building a many-to-many relationship, we essentially have two tables that are data sources. For example, the “Order” and “Product” tables. The intermediate table will contain order and product codes.

Example of a many-to-many relationship

Example of a many-to-many relationship

Now I propose to consolidate this information and complete a small task for training.

Defining Relationships

  1. Determine the type of relationship between the Products and Prices tables if the Products table has fields for Product Unique ID, Product Name, Product Weight, and Product Price, and the Prices table has fields for Product Unique ID and Price.

  2. Determine the type of relationship between the Movies and Actors tables if the Movies table contains information about each movie, including its title, year of release, and a unique movie ID, and the Actors table contains information about the actors, including their names , dates of birth and actor's unique identifier, as well as a field for the unique identifier of the film in which the actor took part.

  3. Determine the type of relationship between the Students, Courses, and Registration tables if you know the following: Each record in the Registration table contains a unique student ID and a unique course ID.

  4. Specify the type of relationship between the Employees and Departments tables if the Employees table has fields for Employee Unique ID, Employee Name, Employee Title, and Department Unique ID, and the Departments table has fields for Department Unique ID and Title department.

The correct answers are already in my tg channeland also interesting posts about it are regularly published there: the everyday life of an IT specialist, analyzes of real vacancies, practical interactives and much more – follow the link below and subscribe 🙂

Let's see what ER diagrams might look like in practice:

ER diagram (Entity-Relationship Diagram) is a graphical representation of entities (objects) and their relationships in a database.

Example ER – diagrams

Example ER - diagrams

Example ER – diagrams

An ER diagram consists of the following main components:

  1. Entities: objects that can be represented in the database (for example, Author, Book).

  2. Attributes: characteristics of entities that describe their properties (for example, author’s full name, book title).

  3. Relationships: relationships between entities.

  4. Cardinality: indicates how many instances of one entity can be related to instances of another entity (e.g., one-to-one, one-to-many).

Next on our agenda is one of the most complex topics in the field of databases, so let’s stock up on some tea and let’s go.

Database normalization

It is a way of organizing data in a database to reduce redundancy and improve storage efficiency.

Redundancy is when the same data is stored in several places in the database.

The basic idea is to split tables into smaller ones to get rid of attributes:

  • with multiple meanings;

  • repetitive;

  • unclassifiable;

  • with redundant information;

  • created from other features.

This is precisely why the so-called normal forms are used, which we will talk about now.

Zero normal form

A situation where all data is on the heap.

Look at the table below and try to say, what's wrong with it:

I hope you have answered this question for yourself. Let's see if our opinions match

What's wrong?

  • duplicate lines must be removed;

  • you need to store one phone number in cells, not a list;

  • put the phone type in a separate column.

Now let's see what the correct option should look like:

First normal form (1 NF)

A relation is in 1NF if the values ​​of all its attributes are atomic (indivisible).

Again, a little task to think about. What's wrong in the table below?

Let's check:

  • A violation of the normalization of 1 NF occurs in the BMW brand, because one cell contains a list of 3 elements: M1, M5, X7, i.e. it is not atomic.

The correct option is presented in the table below:

Second normal form (2 NF)

A relation is in 2 NF if

  1. It is already in 1 SF;

  2. A table has a primary key and all records depend on it.

    Data that does not depend on this key should be placed in a separate table.

Think about how you can decompose the following table?

Did you really think so? Then look at an example option:

Third normal form (3 NF)

A relation is in 3NF if

  1. It is already in SF 2;

  2. Every non-key attribute depends on the primary key

Simply put, the rule requires that all non-key fields, the contents of which may relate to several table records, be placed in separate tables.

So, the question is already known to you. How can you decompose the following table?

As an option, for example, like this:

This decomposition option is not entirely obvious, so let’s look at it in a little more detail.

As you can see, we now have a “Store ID” column. This is not a natural PK that we introduced ourselves. What is this for? – you ask. In fact, this approach will make it easier to work with the database in the future if you suddenly have to change some values ​​(and in real practice you definitely will, believe me). For example, when changing the phone number of the “Real-auto” store, we will need to enter the new one only once in the “Store” table.

So what is third normal form for?

Basically, to simplify our lives 🙂

Name change without 3 NF:

Name change With 3 NF:

There are also 4, 5, and 6 NF, but they are rarely encountered in practice, and at the initial level their detailed study is not required, just knowing about their existence is enough.


In general, this is all I wanted to discuss in this article. Of course, along with the concepts of databases come various applications for managing them (like DBeaver and others), as well as approaches to building ER diagrams. But I decided to put this in a separate article, so as not to overload the current one and your brain 🙂

Subscribe to my TG channel about IT and see you soon!


Client-server architecture:

Client-server architecture. SA for the little ones

Lyosik | Leading Systems Analyst (SA Lead) Welcome to the block of articles for beginners in systems…

habr.com

General principles of system integration:

General principles of system integration. SA for the little ones

Lyosik | Leading systems analyst (SA Lead) Welcome to the block of articles for system beginners…

habr.com

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *