Cellular architecture

The main goal of dividing a system into cells is to minimize the blast radius when failures occur and to simplify scaling.

You may be wondering how this differs from a microservice architecture.

Cellular architecture and microservices architecture do have a lot in common, as both focus on breaking up large, complex systems into smaller, manageable components. However, there are key differences:

  1. Isolation level:

    • Microservices divide the application into independent services, each of which performs a specific function or piece of business logic. These services communicate with each other through well-defined APIs.

    • Cellular architecture goes further, offering not only separation by functionality, but also isolation of runtime environments. Each cell can contain one or more microservices, but they are isolated from other cells in terms of resources, dependencies, and network calls.

  2. Failure handling:

    • In a microservice architecture, each service can be scaled and updated independently, which increases the flexibility of the system, but a failure in one service can still propagate to other services through network calls.

    • Cellular architecture seeks to minimize this effect by creating even tighter isolation, so that failures in one cell do not affect the functioning of other cells. This is achieved by using separate runtimes and data for each cell.

  3. Scalability:

    • Microservices offer scaling by adding service instances depending on the load.

    • Cellular architectures use a similar approach to scaling, but scaling can be more granular, since each cell can include all the necessary services and data to serve a specific segment of users or tasks.

Basic components of cellular architecture

Cell router: The central element that routes requests to the appropriate cell based on a partitioning key. This key is usually derived from an attribute of the request, such as a user ID or a resource identifier. Because the cell router handles every incoming request, it must be highly optimized and add minimal latency.
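As a minimal sketch of the routing step, assuming a fixed list of cells and a string partition key, a router can hash the key and map it to a cell. The cell names and the `route_to_cell` function are hypothetical and only illustrate the idea; they are not taken from any of the systems described in this article.

```python
import hashlib

# Hypothetical cell names; a real router would get this list from the control plane.
CELLS = ["cell-0", "cell-1", "cell-2"]


def route_to_cell(partition_key: str, cells: list[str] = CELLS) -> str:
    """Deterministically map a partition key (e.g. a user ID) to a cell."""
    if not cells:
        raise ValueError("no cells available")
    # Use a stable hash: Python's built-in hash() is randomized per process,
    # so it would send the same key to different cells after a restart.
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(cells)
    return cells[index]


if __name__ == "__main__":
    for user_id in ("user-1", "user-2", "user-3"):
        print(user_id, "->", route_to_cell(user_id))
```

Plain modulo hashing remaps most keys whenever the number of cells changes, so a production router would more likely use consistent hashing or an explicit key-to-cell mapping table.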

Cell: Each cell is a self-contained module that contains all the necessary resources and components to handle a portion of the overall system load. Cells are designed to be self-sufficient and independent, allowing failures to be isolated within one cell without affecting the functioning of other cells.

Control plane: Responsible for administrative tasks such as deploying new cells, removing them, and migrating clients between cells.
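The three control plane tasks can be sketched as operations on an in-memory registry. This is an illustrative model only: the `ControlPlane` class and its method names are assumptions, and a real control plane would also move the client's data and update the cell router.

```python
from dataclasses import dataclass, field


@dataclass
class ControlPlane:
    """Hypothetical in-memory registry of cells and client assignments."""

    cells: set[str] = field(default_factory=set)
    assignments: dict[str, str] = field(default_factory=dict)  # client -> cell

    def deploy_cell(self, cell: str) -> None:
        self.cells.add(cell)

    def remove_cell(self, cell: str) -> None:
        # Refuse to remove a cell that still serves clients.
        if cell in self.assignments.values():
            raise ValueError(f"{cell} still has clients; migrate them first")
        self.cells.discard(cell)

    def assign(self, client: str, cell: str) -> None:
        if cell not in self.cells:
            raise KeyError(f"unknown cell: {cell}")
        self.assignments[client] = cell

    def migrate(self, client: str, target_cell: str) -> None:
        if client not in self.assignments:
            raise KeyError(f"unknown client: {client}")
        # Only the mapping changes here; data migration is out of scope.
        self.assign(client, target_cell)


if __name__ == "__main__":
    cp = ControlPlane()
    cp.deploy_cell("cell-a")
    cp.deploy_cell("cell-b")
    cp.assign("acme", "cell-a")
    cp.migrate("acme", "cell-b")
    cp.remove_cell("cell-a")
    print(cp.cells, cp.assignments)
```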

Where has cellular architecture been implemented?

Slack

Slack introduced cellular architecture to solve problems with gray failures, in which different system components have different perceptions of each other's availability. The main concern was that a failure in one availability zone (AZ) could negatively impact the entire system.

The main element of Slack's implementation is a drain mechanism that quickly redirects traffic away from one availability zone to the others. This is achieved through a model in which each availability zone is treated as a separate cell that can be independently cut off from traffic if necessary.
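One way to picture draining is as weighted traffic distribution, where draining a zone sets its weight to zero and the remaining zones absorb its share. The sketch below is a simplified model under that assumption; the zone names and the `drain` and `traffic_shares` functions are hypothetical and do not describe Slack's actual tooling.

```python
def drain(weights: dict[str, int], zone: str) -> dict[str, int]:
    """Return a copy of the weights with the given zone drained (weight 0)."""
    if zone not in weights:
        raise KeyError(f"unknown zone: {zone}")
    updated = dict(weights)
    updated[zone] = 0
    return updated


def traffic_shares(weights: dict[str, int]) -> dict[str, float]:
    """Convert per-zone weights into the fraction of traffic each zone receives."""
    total = sum(weights.values())
    if total == 0:
        raise RuntimeError("all zones are drained; no zone can take traffic")
    return {zone: weight / total for zone, weight in weights.items()}


if __name__ == "__main__":
    weights = {"az-a": 1, "az-b": 1, "az-c": 1}
    print(traffic_shares(weights))                 # equal shares
    print(traffic_shares(drain(weights, "az-b")))  # az-b receives nothing
```

Because draining only changes weights, it can be reversed just as quickly by restoring the original weight once the zone is healthy again.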

The Slack system is built so that each service within an AZ operates in isolation, processing only traffic within its own zone. This is achieved with service discovery and traffic management tools such as Envoy and Consul, which support dynamic traffic control and configuration.

Slack also integrated cellular architecture with the AWS cloud infrastructure, which made it possible to use the capabilities of automatic scaling and resource management at the cloud level, thus optimizing the performance and availability of services. This highlights the importance of tightly integrating cellular architectures with cloud platforms to achieve the best results in managing distributed systems.

Okta

Cellular architecture at Okta is implemented as a core element of its scalable infrastructure. The architecture consists of individual cells, each of which is an isolated, autonomous replica of the infrastructure, which minimizes the impact of a failure in one cell on the entire system. Okta builds its cells so that they can be deployed in different geographic regions, which improves the availability and reliability of its services globally.

The main feature of Okta's cellular architecture is its ability to scale horizontally using tools such as Elasticsearch, Kinesis, ProxySQL, Redis, and Storm, allowing the number of cells to be increased linearly depending on customer needs. Each Okta cell is designed to handle a certain amount of traffic, and when it becomes saturated, the system automatically scales by adding new cells.
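The idea of adding a cell once the existing ones are saturated can be sketched as a simple placement rule. This is a toy model, not Okta's implementation: capacity is counted in tenants rather than traffic, and the `Cell` class, `place_tenant` function, and capacity value are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    capacity: int  # hypothetical limit, counted in tenants for simplicity
    tenants: list[str] = field(default_factory=list)

    @property
    def is_saturated(self) -> bool:
        return len(self.tenants) >= self.capacity


def place_tenant(cells: list[Cell], tenant: str, capacity_per_cell: int = 2) -> Cell:
    """Place a tenant in the first cell with headroom, adding a cell if all are full."""
    for cell in cells:
        if not cell.is_saturated:
            cell.tenants.append(tenant)
            return cell
    # Every existing cell is saturated: scale out by adding a new cell.
    new_cell = Cell(name=f"cell-{len(cells)}", capacity=capacity_per_cell)
    new_cell.tenants.append(tenant)
    cells.append(new_cell)
    return new_cell


if __name__ == "__main__":
    cells: list[Cell] = []
    for tenant in ("t1", "t2", "t3"):
        print(tenant, "->", place_tenant(cells, tenant).name)
```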

To manage the release of updates and changes, Okta uses tooling modeled on release trains, which allows cells to be updated and scaled without manual intervention. This ensures uniform deployment of artifacts across all environments and allows updates to be released in parallel, with the ability to roll back in case of failures.

DoorDash

DoorDash's primary goal in moving to a cellular architecture was to reduce the cost of data transfer between availability zones when using microservices. Traditional round-robin load balancing resulted in significant overhead because traffic routinely crossed AZ boundaries.

DoorDash implemented zone-aware routing using a service mesh powered by Envoy, which keeps traffic within a single availability zone where possible and minimizes the more expensive inter-zone traffic. To achieve this, the company's own service mesh was modified to provide Envoy with zone information for each node.

Example configuration for Envoy:

resources:
 - "@type": type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment
   cluster_name: payment-service.service.prod.ddsd
   endpoints:
     - locality:
         zone: us-west-2a
       lb_endpoints:
         - endpoint:
             address:
               socket_address:
                 address: 1.1.1.1
                 port_value: 80
     - locality:
         zone: us-west-2b
       lb_endpoints:
         - endpoint:
             address:
               socket_address:
                 address: 2.2.2.2
                 port_value: 80
     - locality:
         zone: us-west-2c
       lb_endpoints:
         - endpoint:
             address:
               socket_address:
                 address: 3.3.3.3
                 port_value: 80
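
The effect of the locality information in the configuration above can be illustrated with a simplified selection rule: prefer endpoints in the caller's own zone and cross zones only when none are available locally. This sketch is an assumption-laden simplification and not Envoy's actual zone-aware load balancing algorithm; the `pick_endpoint` function is hypothetical, and the endpoint data mirrors the example addresses above.

```python
import random

# Endpoints grouped by zone, mirroring the example configuration above.
ENDPOINTS_BY_ZONE = {
    "us-west-2a": ["1.1.1.1:80"],
    "us-west-2b": ["2.2.2.2:80"],
    "us-west-2c": ["3.3.3.3:80"],
}


def pick_endpoint(local_zone: str, endpoints_by_zone: dict[str, list[str]]) -> str:
    """Prefer an endpoint in the caller's zone; fall back to other zones."""
    local = endpoints_by_zone.get(local_zone, [])
    if local:
        return random.choice(local)
    # No local endpoint: accept cross-zone traffic rather than failing.
    remote = [
        endpoint
        for zone, endpoints in endpoints_by_zone.items()
        if zone != local_zone
        for endpoint in endpoints
    ]
    if not remote:
        raise LookupError("no endpoints available in any zone")
    return random.choice(remote)


if __name__ == "__main__":
    print(pick_endpoint("us-west-2a", ENDPOINTS_BY_ZONE))
```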

Every cell in a cellular architecture functions as an independent module with its own set of services and storage. This increases the complexity of routing and coordination between cells. You need to ensure that the system can be scaled and managed without significantly increasing communication latency or operational complexity.

This problem is often addressed with centralized dashboards and monitoring, for example with Amazon CloudWatch, which lets you monitor the status of all cells in real time and respond quickly to any problems.

In addition, each cell must have clearly defined security and data management policies. To keep these consistent, you can automate cell deployment and updates with services such as AWS Step Functions and AWS CodeDeploy.

