Systems Analyst. A Brief Guide to the Profession. Part 1

This text is aimed primarily at newcomers to the IT industry who want to get to know the profession: what it involves, its basic principles, and the practices and tools it uses. It is also an attempt to structure and bring together the knowledge and experience accumulated while working in this fascinating profession.

Who is a systems analyst?

A systems analyst is an intermediary between business and developers. They study business needs, formalize them, pass them on to developers in the form of requirements, and participate in the acceptance of the final result. This job requires good communication skills and an analytical mind.

The main goal of an analyst is to understand a business problem and propose a solution using information technology.

A systems analyst must be immersed in the technical aspects of information systems. They need a broad set of practical technical skills and should try to “speak the same language” as developers, describing in detail the technical nuances associated with development.

In today's reality, a systems analyst is often also required to immerse themselves deeply in the subject area (business domain) and the specifics of internal processes. Analysts should follow trends in technology and be able to obtain the necessary knowledge and apply it in practice.

To become familiar with the profession, it is necessary to study the basic concepts on which computer systems and networks are based.

Internet and interaction on it

The IT sphere owes its rapid growth and development to the Internet. Interaction on the Internet is usually described with the OSI reference model or with the more practical TCP/IP model, on which the Internet is actually built.
The OSI (Open Systems Interconnection) model, like the TCP/IP model, describes how devices on local and global networks exchange data and what happens to that data.

OSI model and TCP/IP model


Link layer – describes how devices are connected within a single network. Link layer protocols work with the physical addresses (MAC addresses) of devices.
Network layer – describes the rules for transmitting data from one network to another (routing).
Transport layer – is responsible for transmitting data between devices with guaranteed (TCP) or non-guaranteed (UDP) delivery.
Application layer – provides interaction between applications on the network.

TCP/IP Operation Process

  • before sending, the information is divided into packets, each of which carries the IP address of the final delivery point;

  • at the transport layer (TCP), it is checked that all packets arrive without errors and in the correct order. In addition, the flow of information is controlled to prevent network congestion;

  • at the network layer (IP), the route (the sequence of nodes) that each packet takes to reach its end point is determined;

  • at the link layer (Ethernet), each packet is placed in a frame addressed to the physical address (MAC address) of a specific device within the network;

  • upon arrival at the final delivery point, the packets are reassembled in the correct order and the original information is restored.
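
The splitting and reassembly steps above can be sketched in a few lines of Python. This is only a toy model, not how a real TCP/IP stack is implemented: the packet structure, the function names and the address 203.0.113.10 (from a range reserved for documentation) are assumptions made for illustration.

```python
import random

def split_into_packets(data: bytes, size: int, dst_ip: str) -> list[dict]:
    """Split data into numbered packets that all carry the destination address."""
    return [
        {"dst_ip": dst_ip, "seq": i, "payload": data[offset:offset + size]}
        for i, offset in enumerate(range(0, len(data), size))
    ]

def reassemble(packets: list[dict]) -> bytes:
    """Restore the original data by ordering packets by sequence number."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["payload"] for p in ordered)

message = b"Hello, systems analyst!"
packets = split_into_packets(message, size=5, dst_ip="203.0.113.10")
random.shuffle(packets)          # packets may arrive in any order
restored = reassemble(packets)   # the receiver puts them back in sequence
print(restored)
```

The sequence number plays the role that TCP sequence numbers play in a real connection: it lets the receiver restore the order regardless of the route each packet took.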

Networking. HTTP protocol

HTTP (Hypertext Transfer Protocol) is the application-layer protocol that defines the rules for interaction on the web. It establishes how requests from a client (for example, a web browser) to a web server are formed and how the server responds.
The secure version of HTTP is HTTPS (HTTP Secure), which protects data on the network by encrypting it with the TLS protocol, the successor to SSL (Secure Sockets Layer).

The basis of HTTP is “client-server” interaction:

Client-server interaction


  • the client initiates a connection and sends a request to the server (Request);

  • the server processes the request and sends a response to the client (Response).

HTTP request


An HTTP request from a client to a web server consists of:

  • URL (Uniform Resource Locator) – a text string (address) that indicates the location of a resource on the network;

  • method – defines the action the server should perform on the resource;

  • headers – the part of an HTTP request that contains technical information needed to process the request, such as the data format or authorization details;

  • body – an optional part in which the client transmits data to the server.

HTTP request example
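
As a rough illustration of how the four parts fit together, the sketch below renders the text of an HTTP/1.1 request without sending it anywhere. The helper `render_http_request` and the POST endpoint `/api/v1/books` are hypothetical; only the host qwerty.com is taken from the examples in this article.

```python
import json
from urllib.parse import urlsplit

def render_http_request(method: str, url: str, headers: dict[str, str], body: str = "") -> str:
    """Render an HTTP/1.1 request as text: start line, headers, blank line, body."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [f"{method} {path} HTTP/1.1", f"Host: {parts.netloc}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n" + body

body = json.dumps({"title": "1984", "author": "Orwell"})
raw_request = render_http_request(
    "POST",
    "https://qwerty.com/api/v1/books",
    {"Content-Type": "application/json", "Content-Length": str(len(body.encode("utf-8")))},
    body,
)
print(raw_request)
```

The method and the path from the URL form the first line, the headers follow one per line, and an empty line separates them from the body.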

The main methods that HTTP provides are:

  • POST – resource creation. The client transmits information to the server, which the server must store;

  • GET – reading a resource. The client requests certain information from the server and receives it in response;

  • PUT – full update of information. The client can replace the information existing on the server;

  • PATCH – partial update of information. The client can replace part of the information existing on the server;

  • DELETE – deleting information. The client can delete information from the server.

Less commonly used methods:

  • CONNECT – establishes a tunnel (two-way communication) to the server, usually through a proxy;

  • HEAD – requests only the headers. It can be performed before loading a large resource to obtain data about it;

  • OPTIONS – requests the methods supported by a resource;

  • TRACE – checks the connection. The server must return the received message to the client in the response body.

This is how the HTTP specification defines the methods, but nothing technically prevents an API from creating a resource in response to a GET request or returning a resource in response to a POST request; doing so simply breaks the convention that clients rely on.


Most often, in the GET method, parameters are passed in the URL path (1) or in the query string (2):

1. GET https://qwerty.com/books/245 – in this case, the request contains the identifier (245) of a certain book (books) in the store on the website (qwerty.com).

2. GET https://qwerty.com/api/v1/books?author=Orwell&title=1984 – in this case, the request contains “key-value” parameters, which are attached to the resource URL after the “?” symbol. If the request implies a search by several parameters, they are joined with “&”.

But if the request is more complex and you need to pass more parameters in it, you may run into the URL length limit (many browsers and servers restrict URLs to about 2048 characters), and the parameters will be lost or the request rejected.
Therefore, in such cases, the POST method is used: the parameters are put into the request body in the form of a JSON object (we will talk about it later) and transmitted to the server without this restriction.
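
The two ways of passing search parameters can be compared with Python's standard library. This is a minimal sketch: the URL comes from the example above, while the extra fields `year_from` and `year_to` in the POST body are made up for illustration.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

base_url = "https://qwerty.com/api/v1/books"

# Variant 1: parameters travel in the query string of a GET request.
get_url = base_url + "?" + urlencode({"author": "Orwell", "title": "1984"})
parsed = parse_qs(urlsplit(get_url).query)  # what the server would read back

# Variant 2: the same search, plus extra filters, travels in a POST body as JSON.
post_body = json.dumps(
    {"author": "Orwell", "title": "1984", "year_from": 1940, "year_to": 1950}
)

print(get_url)
print(parsed)
print(post_body)
```

The query string grows with every parameter and counts toward the URL length, whereas the JSON body does not.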

HTTP response

The server also responds to the client in a specific way.


The HTTP response consists of a status code, headers and a body:

  • status codes are grouped by their first digit and give brief information about the result of the request: 1xx – informational, 2xx – success, 3xx – redirection, 4xx – client error, 5xx – server error;

  • in the response headers, the server tells the client how to interpret and correctly process the received response;

  • the response body contains the data that results from executing the request. Not all responses have one: for example, the response to a DELETE request may have no body.

HTTP response example
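
To make the three parts of a response visible, the sketch below parses a hand-written response text. The response itself and the helpers `parse_http_response` and `status_group` are illustrative assumptions; real clients rely on an HTTP library rather than on string splitting.

```python
import json

body_text = '{"id": 245, "title": "1984"}'
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json\r\n"
    f"Content-Length: {len(body_text)}\r\n"
    "\r\n"
    f"{body_text}"
)

def parse_http_response(raw: str) -> tuple[int, dict[str, str], str]:
    """Split a response into status code, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")      # blank line separates head and body
    status_line, *header_lines = head.split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), headers, body

def status_group(code: int) -> str:
    """Name the group of a status code by its first digit."""
    groups = {1: "informational", 2: "success", 3: "redirection",
              4: "client error", 5: "server error"}
    return groups[code // 100]

status, headers, body = parse_http_response(raw_response)
print(status, status_group(status), headers["Content-Type"], json.loads(body))
```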

Data exchange format. JSON

Both for transmitting the request body and for transmitting the response body, the most popular data format is JSON (JavaScript Object Notation), which defines the rules for describing objects.

JSON


An object describes a concept of the subject area as a set of “key-value” pairs:
{ "key": "value" }. The beginning and end of an object are indicated by curly brackets.

The properties of an object are identified by their names, called keys. Keys are strings and are always enclosed in double quotation marks.

If a key holds a set of values (an array), the beginning and end of the set are indicated by square brackets, and the values are separated by commas. Multiple key/value pairs are also separated by commas.

The value of a key can itself be an object. Such a nested object is enclosed in its own curly brackets, and the keys inside it are also enclosed in quotation marks and separated by commas.

The following data types can be transmitted in JSON: string, number (integer or fractional), Boolean (true or false), null, array and object.
JSON has no separate date type. A date is usually transmitted as a string in ISO 8601 format, for example: "2024-07-26T14:30:00-03:00", or as a number in Unix time format (seconds since January 1, 1970).
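
The rules above can be tried out with Python's built-in `json` module. The `book` object and its field names are invented for illustration; the date string is the one from the example above.

```python
import json
from datetime import datetime

book = {
    "id": 245,                                    # number (integer)
    "title": "1984",                              # string
    "price": 9.99,                                # number (fractional)
    "in_stock": True,                             # Boolean
    "genres": ["dystopia", "political fiction"],  # array
    "author": {"first_name": "George", "last_name": "Orwell"},  # nested object
    "updated_at": "2024-07-26T14:30:00-03:00",    # date as an ISO 8601 string
}

text = json.dumps(book, indent=2)   # Python dict -> JSON text
restored = json.loads(text)         # JSON text -> Python dict

updated_at = datetime.fromisoformat(restored["updated_at"])
offset_hours = updated_at.utcoffset().total_seconds() / 3600
print(text)
print(updated_at, offset_hours)
```

Because the date is only a string as far as JSON is concerned, the receiving side has to parse it explicitly, as the last lines do.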

Frontend and backend in API applications

Frontend and backend are two main parts of the system that are used in software architecture.

Frontend represents the presentation part of the system, its user interface and related components.
Backend is an internal implementation of the system.

This separation allows front-end developers to focus on interface solutions without knowing the details of the internal implementation, and back-end developers to focus on creating programming interfaces (application programming interface).

API (Application Programming Interface) – a programming interface that describes how one computer program interacts with others. The word “contract” is often used to describe an API: an agreement between two parties that defines how they exchange information.

In simple terms, an API describes the available operations, the input data of each operation, its output data, and the error-handling logic.

Some of the popular approaches to development at present are:

  • Code first — first we write the code, then we generate a contract based on it;

  • API first (Contract first) — first we create a contract, then we write or generate code based on it.

APIs are divided into those that provide backend-to-backend interaction (backend API) and those that provide frontend-to-backend interaction (web API).

Types of API


Interactions between systems can be:

  • synchronous (for example: REST API, SOAP, gRPC);

  • asynchronous (for example: Kafka, RabbitMQ, WebSocket).

The difference between them is that applications with asynchronous interaction do not wait for a response after sending a request, but can continue to execute their main flow of tasks.
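
The difference can be demonstrated with a small `asyncio` sketch. It does not use any of the technologies listed above; `remote_service` merely simulates a slow remote call, and all names are hypothetical.

```python
import asyncio

events: list[str] = []

async def remote_service(request: str) -> str:
    await asyncio.sleep(0.01)  # simulated network and processing delay
    return f"response to {request}"

async def synchronous_style() -> None:
    events.append("sync: request sent")
    response = await remote_service("A")  # the caller waits here for the response
    events.append(f"sync: {response}")
    events.append("sync: main work")

async def asynchronous_style() -> None:
    events.append("async: request sent")
    pending = asyncio.create_task(remote_service("B"))  # send and do not wait yet
    events.append("async: main work")                   # the main flow continues
    response = await pending                            # pick up the response later
    events.append(f"async: {response}")

async def main() -> None:
    await synchronous_style()
    await asynchronous_style()

asyncio.run(main())
print("\n".join(events))
```

In the synchronous variant the main work starts only after the response arrives; in the asynchronous variant it is done while the request is still in flight.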

Synchronous and asynchronous communication


API integration is used when it is necessary to:

  • exchange data in real time;

  • have flexible access to the functionality of the application (system);

  • have authentication and authorization methods to protect access to the system or its data.

Databases. SQL, noSQL

To store and process information, databases are used; they differ in the way they store data.
If the data in the database is presented in the form of related tables, such a database is called relational (from the mathematical term “relation”, which corresponds to a table).

Basic concepts.

  • Entity – an object of the subject area that is represented in the database. Entities usually correspond to tables in relational databases.

  • Attribute – a characteristic of an entity. Attributes correspond to columns in relational databases.

  • Primary key – an attribute added to each record to ensure its uniqueness within a table. A primary key can be composite (consist of the values of several attributes).

Tables in a database do not exist in isolation from each other; connections are established between them.

  • Foreign key – an attribute that refers to the primary key of another entity. It provides an unambiguous logical connection between records of different tables within one database.

Relationships between tables are of the following types:

  • One-to-one – one record in table A corresponds to exactly one record in table B;

  • One-to-many – one record in table A may correspond to several records in table B, but each record in table B corresponds to only one record in table A;

  • Many-to-many – one record in table A may correspond to several records in table B, and vice versa.

In a relational database, every many-to-many relationship is implemented as two one-to-many relationships by introducing an additional (junction) table.
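
Primary keys, foreign keys and the additional table for a many-to-many relationship can be seen together in a small example built on SQLite, which ships with Python. The tables `author`, `book` and `book_author` are an assumed schema chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE book   (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Additional table: one row per (author, book) pair, with a composite primary key
CREATE TABLE book_author (
    author_id INTEGER NOT NULL REFERENCES author(id),
    book_id   INTEGER NOT NULL REFERENCES book(id),
    PRIMARY KEY (author_id, book_id)
);
INSERT INTO author VALUES (1, 'Terry Pratchett'), (2, 'Neil Gaiman');
INSERT INTO book VALUES (10, 'Good Omens'), (11, 'Coraline');
INSERT INTO book_author VALUES (1, 10), (2, 10), (2, 11);
""")

# A book with several authors, found through the additional table.
rows = conn.execute("""
    SELECT a.name FROM author a
    JOIN book_author ba ON ba.author_id = a.id
    JOIN book b ON b.id = ba.book_id
    WHERE b.title = 'Good Omens'
    ORDER BY a.name
""").fetchall()

# A foreign key that points to a non-existent author is rejected.
try:
    conn.execute("INSERT INTO book_author VALUES (99, 10)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(rows, fk_enforced)
```

Each row of `book_author` relates to exactly one author and one book, so the many-to-many relationship turns into two one-to-many relationships.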

Normalization is the organization of data in tables so as to eliminate duplication and redundancy and thus avoid violations of data integrity when the data is changed (anomalies). In practice, a database is usually considered normalized once it reaches the third normal form.

Types of normal forms (let us consider only the three forms most frequently encountered in practice):

  • 1NF: Each cell holds a single indivisible (atomic) value, and there are no duplicate rows in the table.

  • 2NF: The table is in 1NF and has a primary key, and all other fields depend on the whole key rather than on part of it (if the primary key is composite).

  • 3NF: The table is in 2NF, and all non-key attributes depend only on the primary key and not on each other.

Databases are managed by database management systems – DBMS (for example, PostgreSQL, MySQL, etc.).
There is a special query language for working with relational databases – SQL (Structured Query Language).
Queries in SQL are built from operators (keywords).

  • SELECT – operator for reading records from a table.

  • INSERT – operator for inserting a record into a table.

  • UPDATE – operator for updating records in a table.

  • DELETE – operator for deleting records from a table.

  • FROM – specifies the table on which the operation is performed.

  • WHERE – operator for filtering records.

  • ORDER BY – operator for sorting records.

  • GROUP BY – operator for grouping records.

  • HAVING – operator for filtering the results of grouping.

  • LIMIT – operator for limiting the number of records to be read.

  • OFFSET – operator for skipping a given number of records when reading.

  • AND, OR, NOT – logical operators.

  • IN, LIKE, BETWEEN – conditional operators.

  • JOIN – operator for joining tables.

  • INNER JOIN – the result includes only the records that have a match in both tables.

  • LEFT OUTER JOIN – the result includes all records from the left table and the matching records from the right table.

  • RIGHT OUTER JOIN – the result includes all records from the right table and the matching records from the left table.

  • FULL OUTER JOIN – the result includes all records from both tables, matched where possible.
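
Several of these operators can be combined in one query. The sketch below runs against an in-memory SQLite database; the tables `customer` and `orders` and their contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
INSERT INTO orders VALUES (1, 1, 50), (2, 1, 70), (3, 2, 20);
""")

# INNER JOIN + WHERE + GROUP BY + HAVING + ORDER BY + LIMIT:
# total order amount per customer, largest first.
top = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customer c
    INNER JOIN orders o ON o.customer_id = c.id
    WHERE o.amount > 10
    GROUP BY c.id, c.name
    HAVING SUM(o.amount) >= 20
    ORDER BY total DESC
    LIMIT 5
""").fetchall()

# LEFT OUTER JOIN keeps customers that have no orders at all.
order_counts = conn.execute("""
    SELECT c.name, COUNT(o.id)
    FROM customer c
    LEFT OUTER JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

print(top)
print(order_counts)
```

Carol has no orders, so she is absent from the INNER JOIN result but appears with a count of zero in the LEFT OUTER JOIN result.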

If the data in the database is not presented in the form of related tables, such a database belongs to the NoSQL (not only SQL) type. There are four main types of NoSQL DBMS:

  • key-value DBMS – data is stored as key-value records (Redis, DynamoDB);

  • columnar DBMS – data is stored as a sparse matrix, the rows and columns of which are used as keys (HBase, Cassandra);

  • document-oriented DBMS – stores hierarchical data structures (MongoDB);

  • graph DBMS – stores data in the form of a graph and its generalizations (Neo4j).

How modern applications and systems work

In conclusion of the first part of the article, let us consider an example of an abstract system.

  1. Let's consider a variant of client-server architecture in which we have a server (Application Server) with a monolithic application and two clients communicating with the application via REST API.

The simplest version of client-server architecture


  2. In order to store data, the application needs a connection to a database. Relational databases are most often used to store structured data.

Adding a DB for storing data


  3. As the number of requests to the application increases, it becomes necessary to add a cache to reduce the response time when the same rarely changing data is requested frequently.
    Since the database stores data on disk, it makes sense to keep such data in RAM, either on the application server itself (in-memory cache) or on a separate server running an in-memory database, to speed up access to it.
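
A local in-memory cache of this kind can be sketched as follows. The class `TTLCache`, the function `load_from_db` (a stand-in for a real database query) and the 60-second lifetime are assumptions for illustration only.

```python
import time

class TTLCache:
    """A tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:  # entry is too old, drop it
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: object) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)

db_calls = {"count": 0}  # counts how often the "database" is hit

def load_from_db(book_id: int) -> dict:
    db_calls["count"] += 1  # stand-in for a slow query to the database
    return {"id": book_id, "title": "1984"}

cache = TTLCache(ttl_seconds=60)

def get_book(book_id: int) -> dict:
    key = f"book:{book_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached                # served from RAM
    book = load_from_db(book_id)     # cache miss: go to the database
    cache.set(key, book)
    return book

first = get_book(245)
second = get_book(245)
print(db_calls["count"])
```

The second call never reaches the database. The expiry time limits how long stale data can be served after the record changes.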

Adding a cache to store frequently requested and rarely changed data


  4. To ensure application resiliency and reliable storage of and access to the most critical data, approaches to scaling the database are needed, one of which is replication.
    For this purpose, the databases are combined into a cluster (Database Cluster). One server is declared the Master; if it fails, a Slave server takes over the main load until the Master is restored. Data is replicated from the Master node to all Slave nodes.
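
The idea of replication with failover can be shown with a toy model. It is not how any particular DBMS implements replication: the classes `Node` and `Cluster` and the node names are hypothetical, and the copies are made synchronously for simplicity.

```python
from typing import Optional

class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.data: dict[str, str] = {}
        self.alive = True

class Cluster:
    """Toy cluster: writes go to the master and are copied to every replica."""

    def __init__(self, master: Node, replicas: list[Node]) -> None:
        self.master = master
        self.replicas = replicas

    def write(self, key: str, value: str) -> None:
        self.master.data[key] = value
        for replica in self.replicas:
            if replica.alive:
                replica.data[key] = value  # replication from the master

    def failover(self) -> None:
        """Promote the first live replica to master."""
        new_master = next(r for r in self.replicas if r.alive)
        self.replicas.remove(new_master)
        self.master = new_master

    def read(self, key: str) -> Optional[str]:
        if not self.master.alive:
            self.failover()
        return self.master.data.get(key)

cluster = Cluster(Node("db-1"), [Node("db-2"), Node("db-3")])
cluster.write("order:1", "paid")
cluster.master.alive = False                 # the master fails
value_after_failure = cluster.read("order:1")
print(cluster.master.name, value_after_failure)
```

Because every write was copied to the replicas, the data is still available after the original master goes down.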

Scaling the DB


  5. As the load increases, there is a need to move on to a microservice architecture. The monolithic application is divided into services, each of which executes a specific piece of logic within a business context.
    A separate layer (application), the API Gateway, is allocated for routing requests from client devices. In addition to routing requests between services, it performs a set of additional functions, such as the load balancing discussed below.
    If several microservices access the same database, it is recommended to allocate a separate service (handler) to work with this database.

Moving to a microservice architecture


  6. Let's consider scaling the application.
    To handle the increased number of requests, you can add replicas (copies) of each service. The number of servers, which are combined into clusters (Server Cluster), will increase accordingly.
    To balance traffic between server clusters and service instances, the load balancing functionality of the API Gateway is used.
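
One of the simplest balancing strategies is round robin, where requests are handed to the instances in turn. The sketch below assumes this strategy; the class `RoundRobinBalancer` and the instance names are hypothetical, and real gateways offer other strategies as well.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hands each incoming request to the next service instance in turn."""

    def __init__(self, instances: list[str]) -> None:
        self._instances = cycle(instances)

    def route(self, request: str) -> str:
        # The request content is ignored here; only the turn order matters.
        return next(self._instances)

balancer = RoundRobinBalancer(["orders-1", "orders-2", "orders-3"])
assigned = [balancer.route(f"request-{i}") for i in range(6)]
print(assigned)
```

With three replicas, each one receives every third request, so the load is spread evenly as long as the requests are of similar cost.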

Scaling the application


  7. As the load increases, the synchronous approach (REST API) to messaging will begin to introduce growing latency, because service resources (ports) stay occupied for the duration of each connection.
    In addition, if the connection is broken or the service fails, some messages may be lost.
    To avoid these situations, services move on to asynchronous interaction. One of the most common approaches is the use of message brokers (Kafka, RabbitMQ).
    In the case of Kafka, the service sending a message writes it to the corresponding topic (message queue), from which the message is read by another service.
    This provides loose coupling between services and delivery guarantees for messages.
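
The topic mechanism can be imitated in memory to show why the sender and the receivers do not depend on each other. This is a toy stand-in, not the Kafka API: the class `InMemoryBroker`, its methods and the service names are assumptions for illustration.

```python
from collections import defaultdict

class InMemoryBroker:
    """Toy broker: each topic is an append-only log, each consumer has its own position."""

    def __init__(self) -> None:
        self._topics: dict[str, list[str]] = defaultdict(list)
        self._offsets: dict[tuple[str, str], int] = defaultdict(int)

    def publish(self, topic: str, message: str) -> None:
        self._topics[topic].append(message)  # the sender does not wait for readers

    def poll(self, consumer: str, topic: str) -> list[str]:
        log = self._topics[topic]
        start = self._offsets[(consumer, topic)]
        self._offsets[(consumer, topic)] = len(log)  # remember what was already read
        return log[start:]

broker = InMemoryBroker()
broker.publish("orders", "order 1 created")
broker.publish("orders", "order 2 created")

billing_first = broker.poll("billing", "orders")   # reads both messages
broker.publish("orders", "order 3 created")
billing_second = broker.poll("billing", "orders")  # reads only the new message
shipping_all = broker.poll("shipping", "orders")   # a late consumer still gets everything
print(billing_first, billing_second, shipping_all)
```

Messages stay in the topic after they are read, so a service that was temporarily down can catch up later from its own position.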

Adding asynchronous interaction between services


  8. For storing unstructured data (time series, JSON documents or files), NoSQL databases and S3 object storage will be a more performant solution than relational databases.

Storing unstructured data


  9. To ensure the observability of services, systems for monitoring and for collecting logs and metrics are needed.
    The most popular solutions are the combinations Elasticsearch (DB) + Kibana (GUI) and/or Prometheus (DB) + Grafana (GUI).

Ensuring observability


In this article, you learned about the basics of network interaction, the main protocol and format for data exchange in networks, what parts the simplest applications consist of, and saw an example of a complex application in the form of a distributed system.

In order to understand how to design and create such systems, the following parts will consider the types of requirements, the process of collecting them, as well as methods and tools for documenting, types of software architecture and interactions, and much more that will help to get a more complete understanding of the profession of a systems analyst.

