The experience of our DTG team in creating digital products and services lets us state that these tasks are hindered by the problem of three monoliths: the application monolith, the integration monolith, and the data monolith. They result from inherited paradigms: traditional architecture and culture, reliance on existing data, and work in a “layered” system, where the isolation of the IT department from the business leads to the loss of data and of knowledge about it. As a solution to this problem, we see a transition from traditional development and management approaches to distributed ones, which implies serious technical and cultural changes in the organization.
But first things first. Let us briefly describe what the notorious monoliths are, and then we will move on to the solutions we propose to overcome the difficulties generated by monoliths.
One of the three architectural challenges in creating enterprise solutions is the application monolith, which emerges as more and more features are added to an existing application. Over the years, the application turns into a “monster” of interwoven functionality and co-dependent components, which entails the following drawbacks:
- a single point of failure: if one application module fails, the entire application goes down and every employee working with it is blocked;
- difficulty in ensuring the required product quality, and the need for extensive regression testing;
- a single monolithic team that makes no sense to expand, since adding people will neither speed up nor simplify development;
- rare releases, with many internal customers and their competing priorities queuing to get into each release; dissatisfaction on the customer side and stress on the development side both grow;
- the impossibility of mixing technology stacks (increasingly important in hybrid IT environments): the whole application must be built and run with the same programming languages, tools, and platforms simply because “it’s already done”, and even updating a library or migrating to a new one becomes a non-trivial, high-risk task;
- difficulty in scaling.
Microservices help overcome these problems. The idea of the approach is that a monolithic application is split into several small applications, each consisting of a group of services.
This provides far greater scalability than the monolithic approach, since highly loaded services can be scaled individually as needed, rather than scaling the entire application. Microservices also let several teams in an organization work independently and release new features at their own pace.
Although the idea of modularity has existed for many years, microservice architecture provides much greater flexibility, allowing organizations to respond more quickly to changing market conditions.
But do not naively assume that microservices will rid your IT environment of complexity. Microservices are a trade-off: development flexibility increases, but so does the complexity of management, development, and support, because of decentralization. Moreover, not every application in a corporate environment is suited to a microservice architecture.
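The single-point-of-failure contrast above can be illustrated with a minimal sketch. All service names here are hypothetical, and the "registry" is just an in-process dictionary standing in for real deployment units:

```python
# A monolith fails as a whole; with microservices, only the failing
# service's functionality is lost. (Hypothetical service names.)

def call(service, registry):
    """Return the service's result, or None if that service is down."""
    handler = registry.get(service)
    if handler is None:
        return None
    try:
        return handler()
    except RuntimeError:
        return None

# Monolith: billing, reports and orders live in one process, so a crash
# in any module stops the whole application.
def monolith():
    raise RuntimeError("billing module crashed")  # everything fails together

monolith_registry = {name: monolith for name in ("billing", "reports", "orders")}

# Microservices: each business function is an independently deployed unit.
def billing():
    raise RuntimeError("billing service crashed")

micro_registry = {
    "billing": billing,
    "reports": lambda: "monthly report",
    "orders":  lambda: "order accepted",
}

print([call(s, monolith_registry) for s in ("billing", "reports", "orders")])
# every call fails together
print([call(s, micro_registry) for s in ("billing", "reports", "orders")])
# only billing fails; reports and orders keep working
```

The same separation is what lets each service be scaled and released independently.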
The second architectural problem is the integration monolith, associated with the use of an enterprise service bus (ESB). This is an architectural pattern with a single enterprise-wide interaction layer that provides centralized, unified, event-oriented messaging.
In this traditional approach, integration is treated as an intermediate layer between the data sources and their consumers. The ESB provides services used by many systems across different projects; it is managed by a single integration team, which must be highly qualified, and it is difficult to scale. Because the ESB team becomes the project’s bottleneck, delivering changes is hard and the queue of improvements keeps growing:
- integration is possible only through the bus and only as part of a scheduled release; given the large flow of requests, an application is best submitted months in advance;
- any change must be coordinated with the other consumers, since not everything is decomposed and isolated; technical debt accumulates and only grows over time.
In monolithic architectures, data is at rest. But modern business is built on streams of events and demands rapid change, and where everything changes quickly, an ESB is a poor fit.
The Agile Integration approach helps solve these problems: it assumes neither a single centralized integration solution for the whole company nor a single integration team. Instead, several cross-functional development teams appear, each of which knows what data it needs and of what quality. Yes, with this approach some work may be duplicated, but it reduces the dependencies between teams and allows different services to be developed largely in parallel.
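The decentralized style of integration can be sketched with a trivial in-process broker (a stand-in for a real message broker such as Kafka; topic and team names are hypothetical). Each team subscribes to the streams it needs and owns its own transformation, instead of queuing changes behind a central ESB team:

```python
# Decentralized, event-driven integration: consumers do not block each
# other, and adding a new consumer requires no change to the producer.
from collections import defaultdict

class EventBroker:
    """A trivial in-process stand-in for a message broker."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self.subscribers[topic]:
            handler(event)

broker = EventBroker()

# The payments team and the analytics team consume the same stream
# independently; some work is duplicated, but neither waits for the other.
payments_view, analytics_view = [], []
broker.subscribe("orders", lambda e: payments_view.append(e["amount"]))
broker.subscribe("orders", lambda e: analytics_view.append(e["customer"]))

broker.publish("orders", {"customer": "acme", "amount": 100})
```

The key property is that a new team plugs in with one `subscribe` call and no coordination with a central integration team.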
The third, no less important, architectural problem is the data monolith, associated with the use of a centralized enterprise data warehouse (EDW). EDW solutions are expensive; they store data in a canonical format that, because of the specific knowledge involved, is maintained and understood by a single team of specialists serving the whole organization. Data arrives in the EDW from various sources; the EDW team validates it and converts it into a canonical format meant to satisfy the needs of different consumer groups, and the team is overloaded. Besides, data converted to one canonical format cannot be convenient for everyone all the time. The bottom line: working with the data takes too long, so a new digital product cannot be brought to market quickly.
This orientation toward a central component, and its dependence on changes in the surrounding systems, is a real obstacle to developing new digital processes and planning their evolution. Changes can conflict, and coordinating them with other teams slows the work down even further.
To solve the data monolith problem, the data lake was invented: a repository for unstructured data. Its key difference is that “raw” data is loaded into the data lake, and there is no single team responsible for it. If the business needs certain data to solve a problem, a team is formed that extracts the data required for that particular task; nearby, another team can do the same for a different task. The data lake was thus introduced so that several teams could work on their products simultaneously. The approach implies that data may be duplicated across domains, because each team converts it into a form suitable for its own product. A difficulty arises here: the team must have the skills to work with various data formats. Still, although this approach carries the risk of extra costs, it gives the business a new level of quality and has a positive effect on the speed of creating new digital products.
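The data-lake idea above can be sketched in a few lines (all records and field names are hypothetical): raw events land unmodified, and each product team extracts and reshapes only the fields it needs, even if that duplicates work done by another team.

```python
# "Raw" events loaded as-is: no canonical format, no gatekeeping team.
data_lake = [
    {"user": "alice", "amount": 120, "city": "Berlin"},
    {"user": "bob",   "amount": 80,  "city": "Berlin"},
    {"user": "alice", "amount": 50,  "city": "Munich"},
]

# Team A builds a per-user spending view for its product.
spend_by_user = {}
for row in data_lake:
    spend_by_user[row["user"]] = spend_by_user.get(row["user"], 0) + row["amount"]

# Team B, working in parallel, derives a per-city view for a different task.
orders_by_city = {}
for row in data_lake:
    orders_by_city[row["city"]] = orders_by_city.get(row["city"], 0) + 1

print(spend_by_user)   # {'alice': 170, 'bob': 80}
print(orders_by_city)  # {'Berlin': 2, 'Munich': 1}
```

Each team pays the cost of its own transformation, but neither waits in a central team's queue.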
Only a few of the most advanced organizations use an even more “mature” approach to working with data, Data Mesh, which inherits the principles of the two previous approaches while eliminating their shortcomings. Its benefits are real-time data analysis and lower costs for managing big-data infrastructure. The approach favors stream processing and implies that the external system provides a data stream that becomes part of the source solution’s API. Responsibility for the data lies with the team that owns the system generating it. To get the most out of this approach, stricter control is needed over how the data is processed and applied, so as not to drown people in meaningless information. And that requires a change in how management and the team think about the interaction between IT and the business. This approach works well in a product-oriented model rather than a project-oriented one.
Such a data infrastructure opens up a completely different perspective and supports the transition from “storing data” to “responding to data”. Stream processing lets a digital business react to events the moment the data is generated, providing intuitive tools for obtaining analytics and tuning products or services in real time, which helps the organization stay a step ahead of its competitors.
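The difference between batch-loading data at rest and responding to a stream can be shown with a minimal sketch (event payloads and the threshold are hypothetical): each event is processed the moment it arrives, not after a nightly load.

```python
# React to events as they arrive, instead of querying data "at rest".
def alert_on_large_payments(events, threshold=1000):
    """Yield an alert immediately for every event over the threshold."""
    for event in events:
        if event["amount"] > threshold:
            yield f"alert: {event['id']} amount {event['amount']}"

stream = iter([
    {"id": "p1", "amount": 250},
    {"id": "p2", "amount": 4000},
    {"id": "p3", "amount": 90},
])

for alert in alert_on_large_payments(stream):
    print(alert)  # fires as soon as the matching event is generated
```

In production this role is played by stream processors such as Kafka and Flink, mentioned below; the generator here only illustrates the reactive shape of the computation.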
To summarize, the solution to the problems of all the monoliths listed above is:
- dividing the system into separate blocks focused on business functions;
- allocating independent teams, each of which can build and operate a business function;
- parallelizing the work between these teams to increase scalability and speed.
There are no simple solutions in building the IT infrastructure of a modern organization. The transition from a traditional to a distributed architecture is not only a technical transformation but also a cultural one: it requires a change in thinking about how the business and its information systems interact. Where the organization previously ran monolithic applications, there are now thousands of services that must be managed, maintained, and kept consistent in terms of interfaces and data. This raises costs and the demands on people’s skills and on project management. The IT department and the business must take on additional responsibilities, but if they learn to manage this complexity, the infrastructure will let the business respond to market challenges with a new, higher quality.
And now, what exactly do we at DTG use as a solution to the “problem of monoliths” when optimizing our customers’ digital processes and integrating them into a partner ecosystem? Our answer is a platform of the Digital Business Technology Platform class (in Gartner’s classification). We named it GRANUM and, true to tradition, built it on a combination of open-source technologies, which lets us quickly and easily create complex distributed systems in a corporate environment; we cover the technologies in more detail below. What has become easier and faster? Using the platform, we significantly accelerated the integration of customers’ existing IT platforms, customer-interaction systems, data management, IoT, and analytics, and were able to quickly connect customer systems with ecosystem partners to handle business events and make joint decisions that create shared value. The use of open-source technologies also helped us meet customer requests to move away from licensed software.
From a technical point of view, by digitalizing processes with a distributed architecture (microservices and the Data Mesh approach) we reduced the interdependence of components and solved the problem of complex, lengthy development. In addition, we were able to process streaming events in real time while preserving data quality, and to create a trusted environment for interacting with partners.
The platform can be divided into three logical layers.
- The bottom layer is infrastructure. It provides basic services: security, monitoring and log analysis, container management, network routing (load balancing), and DevOps tooling.
- The integration layer supports the distributed architecture (the Data Mesh approach, microservices, and streaming data processing).
- The framework layer contains functionality useful to the business. The platform already includes frameworks for product tracking (track & trace), corporate communications, labeling, and other solutions; we plan to extend this layer with further frameworks.
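As an illustration of the infrastructure layer, here is a minimal sketch of a Kubernetes Deployment for one platform service. All names, the image reference, and the replica count are assumptions for illustration, not the actual GRANUM configuration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: track-and-trace            # hypothetical framework service
spec:
  replicas: 3                      # scale this one service, not the whole app
  selector:
    matchLabels:
      app: track-and-trace
  template:
    metadata:
      labels:
        app: track-and-trace
    spec:
      containers:
        - name: service
          image: registry.example.com/track-and-trace:1.0  # placeholder image
          ports:
            - containerPort: 8080
```

Container management of this kind is what makes per-service scaling and independent releases practical at the infrastructure layer.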
Let us be more specific about the open-source technologies we have chosen, the same ones that leading Internet companies such as Netflix, LinkedIn, and Spotify use in their best practices. Kubernetes, Jenkins, Keycloak, Spring Boot, Fluentd, Grafana, and Prometheus were chosen to fight the application monolith, to build and operate the microservice architecture, and to pursue flexibility and speed of change. To move away from a monolithic integration architecture, the Agile Integration approach typically relies on Apache Camel, NiFi, and WSO2 API Manager. And, finally, Kafka, Flink, and Solace Event Portal are useful for solving the data monolith problem, partitioning the data, and moving to real-time analysis with the Data Mesh approach.
The illustration below shows the set of technologies that we at DTG, after experimentation, found optimal for solving the problem of the three monoliths.
We began applying the described platform in practice about a year ago, and today we can already conclude that, regardless of industry, such a solution attracts organizations thinking about reducing the cost of running their business processes, increasing the efficiency of working with partners, and creating new value chains. These companies aim for fast digital experiments (testing a hypothesis, integrating, launching quickly, and, in case of local success, rolling out globally), as well as opening new channels of communication with customers and building more intensive digital communication with them.
Our group of companies always has interesting open vacancies. We look forward to meeting you!