Automation and optimization of service provider signaling processes using an API gateway
In this article, I share a flexible architectural approach to automating service provider-level networks and my personal experience in debugging signaling exchange.
The article is not intended for beginners. It is aimed at readers who are familiar with the basic architectures and technologies of operator-level networks and have hands-on experience with them.
Imagine the multivendor network of a large mobile operator or service provider. It contains many different architectural elements, each with its own interfaces and M2M interaction formats. Managing and maintaining it and responding to network events requires constant attention. Responsibility for different network elements often lies with different departments, which can make day-to-day cooperation between them difficult. Projects are constantly under way in such a network: platform replacements (swaps), introduction of new systems, upgrades of outdated equipment and other changes.
The available budget is usually limited, so solutions with the minimum required functionality are often chosen, and these may later prove insufficient for implementing certain services. Sometimes the required functionality is missing from the selected solution, and the vendor either does not offer customization or offers it at a high price, especially in the case of large global brands. This is not about business functionality defined by the standards, but about special features that a particular company needs. In such cases the choice of alternative enterprise-grade carrier platforms may be limited.
Often the obstacle to using a platform to the full is an incompatible API format, a missing procedure, or some other peculiarity that makes operation inconvenient. Even the slightest change to the platform's operating algorithm can be difficult or impossible to implement, which creates problems in the long term.
To create a flexible, monolithic, automated and streamlined operator network that is easy to use, it is important to have a system that can effectively manage the interaction between all service elements and systems, such as billing/CRM systems, self-service systems, service platforms and automation elements.
Moreover, the network of a large operator is a system in which thousands of events can occur simultaneously, so the solution must be able to expand and scale flexibly. As a rule, the provider's network is also a business-critical service for its clients, so the solution must be fault-tolerant and, if possible, geo-distributed.
What can be done in this case? There are three possible approaches to network management, each with its own advantages and disadvantages:
The first option: each platform is on its own and is operated manually. Hire people for each platform (or group of platforms) and entrust them with its operation. This is not really a solution to the stated problem, but it is an approach that some operators use. It is expensive and, with a large number of routine operations, not always effective. The human factor also matters here: an engineer short on sleep after a third night shift enters the wrong command, or enters it in the wrong console, and reboots something that should not have been rebooted, leaving an entire region without service.
The second option: partial automation without monolithicity or centralization. Write many scripts to automate business and technological processes, deploy one or more WMS, spend money on ongoing technical support for all of this, and rewrite the automation scripts with every change in the network (hardware or software platform swaps, software updates that change API formats). The network loses its monolithic nature, and the redundancy of whatever runs the automation scripts leaves much to be desired. There is also the question of continuity when the employee who wrote a particular script leaves the company.
The third option: move the launching of control commands and the interface for interacting with the network into separate network elements. The interface is a centralized, scalable system, a kind of signaling router that carries control commands between network nodes. Sending a control command is then reduced to interacting with this single interface, the API gateway.
The third option, a centralized, scalable API gateway for network management, looks the most promising and effective. It can greatly simplify network management while providing high resiliency, scalability and flexibility. I have tested this approach in practice, and it showed excellent results. Most importantly, it solved all the problems described above.
Before describing the system itself, I would like to show the general trend that makes this architecture relevant: how the protocol stack used in operator networks has evolved, and how the operator core network has changed from one technology generation to the next.
Let's start with mobile communication technologies, namely 2G/3G networks. The architecture of their core is similar, and in this context, considering 2G separately from 3G does not make much sense.
2G/3G networks use many different signaling protocols, including the SS7 stack and Radius/Diameter. The voice core and the packet core are explicitly separate. Base stations are connected to radio controllers, which route voice and SMS in one direction and packet data in the other. Many of the specific protocols date back to the days of SDH/PDH.
Third-party management systems can interact with the network using either the protocols shown in the diagram or standard management interfaces such as SSH, SNMP and APIs (where the network elements support them). There is no unification to speak of here: there is a large set of different message formats that could be improved and simplified, which is what happened later. SS7 messages are typically routed centrally via an STP.
In 4G networks the SS7 stack is no longer used; it has been replaced by Diameter and SIP. For legacy support, SMS can still be delivered the old way over the SGs interface (the interface between the MME and the MSC) when VoLTE is not implemented in the operator's network and SMS over SIP via the IMS platform is therefore unavailable.
Note that the protocols change only on the signaling side. For user data, GTP-U is still in use (including in 5G): nothing better has been devised, and only the version changes. In the same way, nothing better than BGP has been devised for exchanging routing information on the global network.
There is no longer a clearly separated voice core here. Voice can be carried in two ways: by CS fallback, where the device falls back to a 2G/3G network for the duration of the call, or over the packet network using SIP and the IMS platform. The latter is already IP telephony.
So simplification and unification have begun here, but because 4G networks must remain compatible with previous generations, the process is not complete.
But the essence remains the same: when control commands are transmitted, a set of attributes is sent to the server and in response we receive a set of attributes.
A centralized routing element for Diameter messages has also appeared: the DRA (analogous to the STP in 2G/3G networks).
5G networks are already more interesting in terms of transmitting control commands.
With the development of 5G technology, a new network architecture concept has been proposed – Stateless Network. In this architecture, the network becomes more flexible and scalable, and also gains the ability to quickly respond to changes in load.
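Signaling between the control-plane functions of the 5G core uses a single mechanism: REST APIs over HTTP/2. (Some interfaces outside the service-based part keep dedicated protocols, for example NGAP towards the radio network and PFCP towards the UPF.) The essence remains the same: we pass a set of parameters and receive a set of parameters.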
Architecturally, the network is fully split into CP (control plane) and UP (user plane) functions: the base station passes user traffic directly to the UPF gateway, unlike in 4G, where traffic goes first to the SGW and then to the PGW, both of which also carry some control functionality.
As in 4G, voice is carried over SIP through the IMS platform, with the packet core acting as transport to IMS. For legacy support there is an alternative: fallback to 4G.
Currently, there is a gradual unification of protocols, which is especially noticeable in the field of management, where there is a transition to a single protocol – REST API and Service Based Architecture. REST API is becoming increasingly popular as a protocol for transmitting control commands.
Thanks to HTTP, the REST API is easy to use and has a simple, clearly structured format. It allows you to transfer data in the form of JSON objects containing all the necessary parameters. At the same time, it is possible to add additional options to request headers.
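As a minimal sketch of such a request, the snippet below builds (but does not send) an HTTP POST that carries its parameters as a JSON object and one extra option in the headers. The URL, the field names and the X-Request-Id header are hypothetical and chosen only for illustration.

```python
import json
import urllib.request

# Hypothetical gateway endpoint, used only for illustration.
GATEWAY_URL = "http://gateway.example/api/v1/subscriber/activate"


def build_request(params: dict, request_id: str) -> urllib.request.Request:
    """Build a POST request with a JSON body and an extra header option."""
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "X-Request-Id": request_id,  # assumed custom header
        },
        method="POST",
    )


# The request object is only constructed here; nothing is sent on the wire.
req = build_request({"msisdn": "15550100", "service": "voicemail"}, "req-0001")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Adding a new parameter means adding one more key to the dictionary; the transport and the framing stay untouched.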
The basic principle of protocols such as Diameter, Radius and SS7/MAP is similar to that of a REST API: in each case you transmit a set of attributes (parameters) and receive another set in response. All the server logic that implements the functionality is what happens on the server between receiving the request and sending the response. For developers, who are often unfamiliar with telecom processes, this approach is more intuitive than working with a variety of exotic protocols.
It is likely that this shift in the principles of signaling exchange will only intensify. Legacy AAA messaging protocols will probably be replaced by REST APIs. This will create a need for an element that provides connectivity between operators, and an API gateway can become that element.
Using REST APIs in telecommunications has a number of benefits. It simplifies and speeds up protocol evolution: it is enough to change the list of fields in the transmitted JSON, without redesigning the protocol. Although operator networks and inter-operator interconnects rely on well-established data exchange systems, the move toward REST APIs looks clear. To speed up adoption, systems that convert one protocol into another can be useful, for example translating Diameter or Radius into an HTTP REST API. This underlines the importance of an architectural element such as the API gateway.
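The conversion idea can be sketched as follows. This is only an illustration under stated assumptions: the AVPs are taken as already decoded into (name, value) pairs (parsing the binary Diameter or Radius wire format is out of scope), and the JSON field names in the mapping table are invented for the example.

```python
import json

# Assumed mapping between decoded AVP names and JSON field names.
AVP_TO_JSON = {
    "Session-Id": "sessionId",
    "Origin-Host": "originHost",
    "User-Name": "userName",
}
JSON_TO_AVP = {v: k for k, v in AVP_TO_JSON.items()}
JSON_TO_AVP["resultCode"] = "Result-Code"


def avps_to_json(avps):
    """Convert decoded (name, value) AVP pairs into a JSON request body.

    AVPs without an entry in the mapping table are silently dropped.
    """
    fields = {AVP_TO_JSON[name]: value for name, value in avps if name in AVP_TO_JSON}
    return json.dumps(fields, sort_keys=True)


def json_to_avps(body):
    """Convert a JSON response body back into (name, value) AVP pairs."""
    fields = json.loads(body)
    return [
        (JSON_TO_AVP[key], value)
        for key, value in sorted(fields.items())
        if key in JSON_TO_AVP
    ]


request_avps = [
    ("Session-Id", "gw.example;1;42"),
    ("Origin-Host", "mme1.example"),
    ("User-Name", "001010123456789"),
    ("Vendor-Specific", "ignored"),  # not in the mapping table
]
print(avps_to_json(request_avps))
print(json_to_avps('{"resultCode": 2001, "sessionId": "gw.example;1;42"}'))
```

In both directions the payload is just a set of attributes, which is why a table-driven mapping like this is often enough for simple procedures.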
An API gateway is a centralized system that provides a single unified interface for interaction between various external and internal applications. It acts as a single point of access for client applications and systems, making interactions with the outside world easier and more secure. The concept of an API gateway has long been known in IT, but here we are talking about a broader role: a separate platform with additional functionality built around it.
Back in 2016, I developed and implemented a similar system, which subsequently showed its effectiveness.
The result is a platform that processes incoming API requests, interacts with external systems where necessary, and returns a response. Beyond this main function, the platform monitors itself: it tracks KPIs and node status, provides internal data exchange between system components, collects telemetry, and keeps its structural elements redundant.
The method of integrating this platform into the network is shown in the diagram below.
Everything that can be automated should be automated. Modern programmable network architectures are built on this principle, especially for service activation, configuration of technical parameters and implementation of operating scenarios. In practice, however, it is not always possible to connect Billing/CRM to the service elements directly. In addition, as artificial intelligence systems are increasingly introduced into company processes, an interface is needed through which such a system can interact with the network. Giving such systems direct, uncontrolled access to network elements is too risky. It is much safer to put an API gateway layer in between that exposes only a limited set of permissions and procedures.
An API gateway can also act as a central point that unifies API formats across software from different vendors. This is convenient when platforms are replaced, and it lets different systems and applications be integrated and work together without modifications on either side.
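A minimal sketch of such format unification is shown below. Both message formats are hypothetical: the client is assumed to send a flat object with "phone" and "action", while the backend is assumed to expect a nested object with "subscriber" and "operation". Neither side is modified; the gateway translates in both directions.

```python
def to_backend_format(client_request: dict) -> dict:
    """Translate the (assumed) client request format into the backend's."""
    return {
        "subscriber": {"msisdn": client_request["phone"]},
        "operation": client_request["action"].upper(),
    }


def to_client_format(backend_response: dict) -> dict:
    """Translate the (assumed) backend response format into the client's."""
    return {
        "ok": backend_response["status"] == "SUCCESS",
        "phone": backend_response["subscriber"]["msisdn"],
    }


client_request = {"phone": "15550100", "action": "block"}
backend_request = to_backend_format(client_request)
print(backend_request)

# A response as the hypothetical backend might return it.
backend_response = {"status": "SUCCESS", "subscriber": {"msisdn": "15550100"}}
print(to_client_format(backend_response))
```

When the backend platform is swapped, only these two functions change; the client keeps its format.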
Sometimes additional functionality is required, such as limiting the number of requests for a specific procedure or setting a quota for the number of requests. Based on these requirements, a request processing architecture for an API gateway is presented.
Routing API requests, for example by source IP address, header or other parameters, is also useful. In this case the API gateway works as an L7 router, similar to an STP in 2G/3G networks or a DRA in 4G networks.
For flexibility, the request processing script in my system can be written in any programming language or assembled in a builder from ready-made blocks. The processing core is thus a subroutine that implements all the necessary logic.
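Routing of this kind can be sketched as an ordered rule table in which the first matching rule wins. The backend names, networks, header name and paths below are assumptions made for the example, not part of any real deployment.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Route:
    backend: str
    source_net: Optional[str] = None          # e.g. "10.1.0.0/16"
    header: Optional[Tuple[str, str]] = None  # (name, value), matched exactly
    path_prefix: Optional[str] = None

    def matches(self, src_ip: str, headers: dict, path: str) -> bool:
        """A rule matches only if every criterion it defines is satisfied."""
        if self.source_net and ipaddress.ip_address(src_ip) not in ipaddress.ip_network(self.source_net):
            return False
        if self.header and headers.get(self.header[0]) != self.header[1]:
            return False
        if self.path_prefix and not path.startswith(self.path_prefix):
            return False
        return True


# Hypothetical routing table; order matters.
ROUTES = [
    Route(backend="billing-backend", source_net="10.1.0.0/16", path_prefix="/billing"),
    Route(backend="partner-backend", header=("X-Partner", "acme")),
    Route(backend="provisioning-backend", path_prefix="/provisioning"),
]


def select_backend(src_ip, headers, path, default="default-backend"):
    """Return the backend of the first matching route, or the default."""
    for route in ROUTES:
        if route.matches(src_ip, headers, path):
            return route.backend
    return default


print(select_backend("10.1.2.3", {}, "/billing/invoice"))
print(select_backend("192.0.2.7", {"X-Partner": "acme"}, "/anything"))
print(select_backend("192.0.2.7", {}, "/other"))
```

Because clients see only the gateway address, the rule table can be changed without exposing or touching the internal topology.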
The set and order of preprocessing and postprocessing blocks may differ from one procedure to another. Some modules can be excluded, for example by completely disabling validation or rate control.
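A per-procedure chain of blocks can be sketched like this. It is not the code of my system, only an illustration: the block names, the procedure names and the request fields are all hypothetical. Each procedure lists its own preprocessing blocks, its processing core and its postprocessing blocks, and the second procedure simply leaves validation out.

```python
def validate(request):
    """Preprocessing block: reject requests without the required field."""
    if "msisdn" not in request:
        raise ValueError("msisdn is required")
    return request


def normalize(request):
    """Preprocessing block: strip the leading '+' from the number."""
    return {**request, "msisdn": request["msisdn"].lstrip("+")}


def add_status(response):
    """Postprocessing block: add a status field to the response."""
    return {**response, "status": "ok"}


def core_lookup(request):
    """Processing core: the procedure's own logic would live here."""
    return {"msisdn": request["msisdn"], "active": True}


# Hypothetical per-procedure configuration: set and order of blocks.
PROCEDURES = {
    "lookup": {"pre": [validate, normalize], "core": core_lookup, "post": [add_status]},
    "lookup_fast": {"pre": [normalize], "core": core_lookup, "post": []},
}


def handle(procedure, request):
    """Run the configured chain: pre blocks, then the core, then post blocks."""
    config = PROCEDURES[procedure]
    for block in config["pre"]:
        request = block(request)
    response = config["core"](request)
    for block in config["post"]:
        response = block(response)
    return response


print(handle("lookup", {"msisdn": "+15550100"}))
print(handle("lookup_fast", {"msisdn": "+15550100"}))
```

Keeping the chain in configuration rather than in code is what makes it possible to reorder or disable blocks per procedure.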
Rate control is applied at two levels: to incoming requests and to interaction with the backend. If the configured threshold is exceeded, a response with the appropriate error code is returned. Interfaces to various backends over a variety of protocols are also available.
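The two-level control described above can be sketched with a simple fixed-window counter. The algorithm, the limits and the returned codes (429 for the incoming side, 503 for the backend side) are my assumptions for the example; a production system might prefer a token bucket or a sliding window.

```python
import time


class WindowLimiter:
    """Fixed-window counter: at most `limit` calls per `window` seconds."""

    def __init__(self, limit, window=1.0, clock=time.monotonic):
        self.limit, self.window, self.clock = limit, window, clock
        self.window_start = clock()
        self.count = 0

    def allow(self):
        now = self.clock()
        if now - self.window_start >= self.window:
            # A new window has started: reset the counter.
            self.window_start, self.count = now, 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


def process(inbound, backend, call_backend):
    """Check the incoming-request limit first, then the backend limit.

    Note: a request rejected at the backend level has already been
    counted against the incoming limit.
    """
    if not inbound.allow():
        return 429, "inbound rate exceeded"
    if not backend.allow():
        return 503, "backend rate exceeded"
    return 200, call_backend()


# A fake clock keeps the example deterministic.
fake_now = [0.0]
inbound = WindowLimiter(limit=3, clock=lambda: fake_now[0])
backend = WindowLimiter(limit=2, clock=lambda: fake_now[0])

results = [process(inbound, backend, lambda: "done")[0] for _ in range(4)]
print(results)  # [200, 200, 503, 429]

fake_now[0] = 1.0  # the next window begins, both counters reset
after_reset = process(inbound, backend, lambda: "done")[0]
print(after_reset)  # 200
```

Separating the two limiters lets the gateway protect a slow backend without lowering the limit that clients see on other procedures.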
Here are a number of scenarios for which this system has been used in practice:
Implementing a composite scenario as a single call. One system could call a third-party SOAP API but could not execute multi-step queries, where you first call one procedure, extract a parameter from its response, and then call another procedure with it. The API gateway was customized so that the whole algorithm was reduced to a single procedure.
Adapting an API format to the one a system supports. The system had a specific API format hard-coded: the configuration tools allowed only the passed parameters to be changed, not the format itself. The source code was not available, so the format could not be changed. The other system had its own, slightly different request format, so the gateway performed the conversion between the two.
Providing an API method that implements a fairly complex algorithm involving several different systems, and handing this method to web portal developers so that they can build the required functionality on top of it. The same principle applies to desktop and mobile applications.
Interaction between the systems of different departments within the company, again where a complex algorithm has to be reduced to a simple procedure call.
API request routing and hiding the internal network architecture.
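The first scenario in this list, reducing a two-step exchange to a single procedure, can be sketched as follows. The two backend functions are stubs standing in for the real remote calls, and their names, fields and values are invented for the example.

```python
def get_subscriber_profile(msisdn):
    """Stub for the first backend procedure (a remote call in a real setup)."""
    return {"msisdn": msisdn, "account_id": "ACC-" + msisdn[-4:]}


def get_account_balance(account_id):
    """Stub for the second backend procedure."""
    return {"account_id": account_id, "balance": 12.5}


def balance_by_msisdn(msisdn, first=get_subscriber_profile, second=get_account_balance):
    """Single gateway procedure that hides the two-step exchange."""
    profile = first(msisdn)
    account_id = profile["account_id"]  # parameter extracted from the first response
    balance = second(account_id)
    return {"msisdn": msisdn, "balance": balance["balance"]}


print(balance_by_msisdn("15550100"))
```

The calling system issues one request and never learns that two procedures, or an intermediate identifier, were involved.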
The possible uses of an API gateway are very diverse, and the scenarios above only underline its importance. A single API gateway for all interactions, both within the organization and with external partners, can significantly simplify traffic management and provide security and control over data exchange and the network perimeter, which is critical for modern information security.
Integrating microservices, DevOps tools and other open source systems through an API gateway can significantly speed up the deployment of new services and make their operation more reliable. A server API with a predefined response format, set up for lab purposes, can also be very useful for testing and debugging various scenarios.
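A lab server with predefined responses can be sketched with the Python standard library alone. The paths and response bodies below are made up for the example, and the server binds to localhost on an assumed port.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Canned responses keyed by (method, path); contents are invented for the lab.
CANNED = {
    ("GET", "/health"): (200, {"status": "up"}),
    ("POST", "/subscriber/activate"): (200, {"result": "activated"}),
}


def canned_response(method, path):
    """Return (status, JSON body as bytes) for a request, or a 404 answer."""
    status, body = CANNED.get((method, path), (404, {"error": "unknown procedure"}))
    return status, json.dumps(body).encode("utf-8")


class MockHandler(BaseHTTPRequestHandler):
    def _reply(self):
        # Read and discard any request body before answering.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        status, payload = canned_response(self.command, self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    # The same handler serves both methods.
    do_GET = do_POST = _reply


def serve(port=8080):
    """Start the mock server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), MockHandler).serve_forever()


print(canned_response("GET", "/health"))
```

Calling serve() starts the server; the system under test is then pointed at it instead of the real backend, and new scenarios are added by extending the CANNED table.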
The API gateway can be used to implement various architectural elements of complex platforms, including in the context of 5G networks.
In conclusion, I would like to touch on the commercial side: how the introduction of such a network element can be monetized.
Selling packages with a fixed number of requests, or charging per request. In this case the operator gives the M2M end client an interface for obtaining information or performing actions, for example checking which operator a given subscriber number belongs to (implemented as an HLR request). Offering different pricing plans depending on request volume can attract customers.
Automating routine tasks by introducing new network elements, saving man-hours, can also be a commercially successful strategy. Customers will be interested in optimizing their workflows and will be willing to pay for automation.
Increasing the competitive advantage and attractiveness of a company's other services by processing requests faster through automation is a great way to attract new customers and retain current ones.
Offering an API gateway as an integral part of other service platforms or systems can also be profitable. Companies working with similar platforms will appreciate the opportunity to use your integrated technology.
Reducing the cost of implementing other platforms by making it easier for them to interact with the network can attract companies and organizations that want to optimize their business processes.
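The request-based charging models from this list can be sketched in a few lines. All numbers here are invented: the tier boundaries and the prices (expressed in minor currency units to avoid floating-point rounding) are assumptions for illustration only.

```python
class RequestPackage:
    """Prepaid package: a fixed number of requests bought in advance."""

    def __init__(self, client_id, purchased):
        self.client_id = client_id
        self.remaining = purchased

    def consume(self):
        """Use one request from the package; False means it is exhausted."""
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True


# Hypothetical graduated tariff: (upper bound of tier, price per request).
# None marks the last, unbounded tier.
TIERS = [(1000, 10), (10000, 7), (None, 5)]


def monthly_charge(request_count, tiers=TIERS):
    """Charge each request at the price of the tier it falls into."""
    total, lower = 0, 0
    for upper, price in tiers:
        if upper is None or request_count <= upper:
            total += (request_count - lower) * price
            break
        total += (upper - lower) * price
        lower = upper
    return total


package = RequestPackage("client-1", purchased=2)
print([package.consume() for _ in range(3)])  # [True, True, False]
print(monthly_charge(1500))  # 1000 * 10 + 500 * 7 = 13500
```

Since every request already passes through the gateway, counting them per client for billing requires no changes on the service platforms themselves.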
By combining these commercial capabilities, you can create an attractive proposition for customers and achieve successful monetization of the implementation of this product.
I would be glad to receive feedback and hear your vision of the described architectural approach.