Cloud architecture: Serverless and Cloud Native

In a new video, IT blogger Dmitry Rozhkov talks about cloud architecture: what approaches exist in modern practice, and what their strengths and weaknesses are. Using AWS as an example, he also shows how to build a Serverless architecture.

Here's what the author said in the video:

  1. Cloud application architecture includes the core concepts of Serverless and Cloud Native, which offer different approaches to development.
  2. Cloud Native is an approach to developing applications with scalability, resilience, and agility on cloud platforms, including the use of microservices, containers, and DevOps.
  3. Serverless (serverless architecture) hands the management of computing resources to the platform, letting you focus on independent functions executed in response to specific events.
  4. Implementing a Cloud Native approach requires a change in engineering culture and may face challenges such as managing distributed systems and controlling cloud costs.
  5. Serverless is ideal for applications with variable workloads as the platform automatically adapts to changing requirements.
  6. The benefits of Cloud Native are achieved by leveraging microservice architecture, containerization, and orchestration rather than simply moving existing applications to the cloud.
  7. Cloud Native does not require the use of a public cloud platform and can run on private clouds or data centers.
  8. Serverless can result in significant savings for applications with infrequent function calls, but can be expensive when the load increases dramatically.
  9. Using serverless components allows you to assemble a complete application architecture, including computing, system integration, notifications and data storage.
  10. The multi-cloud approach demonstrates that the core components and principles of Serverless are available across different cloud providers, albeit under different names.

Below is a transcript of the video.

Cloud architecture

Serverless and Cloud Native are two key concepts in modern cloud application development, but they serve different purposes and offer different approaches to architecture and deployment.

To talk about cloud architecture, you first need to understand what a Cloud Native application is. If I just use the cloud to host a website, is that already Cloud Native, or do I have to use lambdas? Can a Cloud Native app run outside of the cloud? In short, the answers to these questions are no, no, and yes. As with DevOps, a precise definition is hard to pin down.

What is Cloud Native?

Cloud Native, as defined by cloud providers themselves and the Cloud Native Computing Foundation (or CNCF), is an approach to building, deploying and managing modern applications on cloud platforms.

Cloud Native is an approach in which at the design stage you build in the ability to scale an application elastically on a specific platform.

What does elastic mean? It means the installation can be quickly expanded in response to increased load and just as easily contracted when the load is gone. Ideally, this should happen automatically. A distributed platform means that you definitely have more than one server, and most likely they are geographically distributed as well.

There are also several pillars of Cloud Native:

  • microservices;
  • containers;
  • immutable infrastructure;
  • declarative API;
  • DevOps;
  • CI/CD.

That is, literally, if you have microservices on Kubernetes, then your application is almost Cloud Native.

These applications are designed with scalability, resilience, and flexibility in mind, allowing them to adapt easily to changing requirements and conditions. It's clear that business conditions are constantly changing and the load rises and falls; accordingly, your architecture must be flexible and elastic enough to respond to all these changes painlessly.

Problems and difficulties of the Cloud Native approach

Despite all the advantages, implementing such a system in practice is not so easy. Most likely, if you haven't worked with such applications before, your team will have to change its engineering culture, not just swap frameworks.

Here are just some of the difficulties you may encounter.

The first is distributed systems. They are difficult to keep in your head, difficult to design, and difficult even to understand what is happening inside. If something breaks somewhere, it is not always easy to trace why. That's why such installations usually rely on advanced logging, monitoring, and so-called distributed tracing, where a request's entire path through all the services is recorded, so that you can understand at least something. Given the number of variables, only the right choice of tools, as I already said, and solid development, testing, and deployment processes can save you.

You will also have to test in a distributed setup. For example, if you have Kubernetes, you usually have at least three clusters: a test cluster, a staging cluster, and a production cluster. Then each team may get its own cluster; as soon as you have Kubernetes, you immediately begin to accumulate clusters. But that's just an example; you can do Cloud Native without Kubernetes.

Second, you can't set up processes once and expect them to keep working. This has to become part of everyday work. Tools will become outdated and require updating, and processes will fail. Be prepared to constantly monitor performance, discuss solutions with the team, and implement them. This, in fact, is the same cultural shift.

Third, the cost of the cloud can easily get out of control. Every company I've worked for has had an initiative to cut infrastructure costs. Even if it wasn't there initially, it appeared at some point.

All sorts of "janitor" services appeared to clean up infrastructure that someone forgot to turn off. And emails kept arriving: "This is how much we spend, this is how much we need to save." Just look at how many startups have been created in this niche. I even have a friend, a good former colleague, who also built a startup to optimize infrastructure costs. These startups make money by optimizing your expenses: infrastructure accountants, if you like.

Fourth, you simply may not have the necessary experience and skills. As I said earlier, programming distributed systems is significantly different from monoliths and the ordinary programming we all learn. And it's not only the programming that is different: the whole environment is different, the deployment is different, the whole ideology is different. So I always try to put my team through courses or books when we need to move into a new ecosystem.

For example, right now for us it is Kubernetes, and I've recommended that our people take courses on it.

The most important difficulty you may encounter, I think, is resistance; if you overcome it, the rest is a matter of technique. I could talk about this for hours, but unfortunately I don't have the right to make such things public, and it wouldn't really be ethical.

Resistance to change is the main reason why companies cannot switch to Cloud Native, and not only to Cloud Native: it blocks any change in principle.

Everything seems fine to people, salaries are being paid, and suddenly they have to learn something new, adopt some new processes.

Here you really need to work as with any Change Management: you start with an initiative group and a prototype that launches and works successfully, and then you simply involve more and more people.

But despite all the difficulties, at a certain scale Cloud Native is considered the only approach, notwithstanding the exceptions where teams spent a lot of effort and kept their monoliths. It works for them, and they are happy.

But strange as it may seem, there is a monolith at both ends of the same spectrum: at the beginning you use a monolith, in the middle you toil away at Cloud Native, and at the very end of the spectrum you come back to a monolith again.

The thing is, the knowledge you have at the far end of the spectrum, where you are again using a monolith, is significantly different from the knowledge you had at the beginning. I believe that scaling a monolith requires far more knowledge and engineering culture than even using Cloud Native.

Therefore, simply using the services of a cloud provider does not make the system Cloud Native. Cloud Native involves deep integration with cloud infrastructure and the use of cloud-specific capabilities and practices.

The benefits of Cloud Native, such as scalability, resilience, and flexibility, are achieved through microservice architecture, containerization, and orchestration, not by simply moving existing applications to the cloud.

Thus, transforming a traditional application to Cloud Native requires rethinking and redesigning the architecture and development processes, as I said earlier, to realize the full potential of cloud technologies.

Now, I hope, it's clear why Cloud Native doesn't actually require a public cloud platform. And why, if you just dumped a monolith into AWS, it's Cloud but not Native. Or why Serverless is Cloud Native, or rather can be part of Cloud Native, while Cloud Native is not necessarily Serverless.

If you have a private cloud, or private data centers in different geographic locations, where you run, say, a distributed Kubernetes cluster with your microservices on top of it, able to expand and contract, then by and large you have Cloud Native without using the public cloud. You don't have to use the public cloud to be Cloud Native.

Interesting fact: the clouds are almost 20 years old. The first AWS services, S3 and EC2, launched in 2006, and even AWS Lambda is already 10 years old. So this is no longer something new and untested; it's quite a mature technology.

What is Serverless?

Serverless, or serverless architecture, focuses on managing computing resources automatically, eliminating the need to manage servers manually.

In the Serverless context, applications are broken down into independent functions that execute in response to specific events. This allows developers to focus on writing code while the cloud platform automatically takes care of resource provisioning and scaling.
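
To make this concrete, here's a minimal sketch of what such a function might look like as an AWS Lambda handler in Python (the event fields are illustrative, not from the video): the platform calls the handler once per event and decides on its own how many parallel instances to run.

```python
# General shape of a serverless function: you write only the body; the
# platform provisions resources and invokes it once per incoming event.
def handler(event, context):
    # "event" is whatever triggered the call: an HTTP request, a queue
    # message, a file-upload notification, a timer tick, and so on.
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}
```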

Serverless is ideal for applications with variable load, because the platform can instantly adapt to changing requirements and computing resources, optimizing cost and efficiency. I keep talking about functions, but this also applies to other serverless components.

For example, to use a serverless database, you don't have to worry about having enough disk space. You simply create the necessary tables, write or read data, and the platform does the rest for you.
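
As a minimal sketch of that idea, here's what it might look like with DynamoDB (covered below) and boto3; the table name and attributes are hypothetical:

```python
import boto3

dynamodb = boto3.resource("dynamodb")

# Create a table; PAY_PER_REQUEST means no capacity planning at all.
table = dynamodb.create_table(
    TableName="orders",  # hypothetical table name
    KeySchema=[{"AttributeName": "order_id", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "order_id", "AttributeType": "S"}],
    BillingMode="PAY_PER_REQUEST",
)
table.wait_until_exists()

# Write and read; where the data physically lives is the platform's problem.
table.put_item(Item={"order_id": "42", "status": "paid"})
print(table.get_item(Key={"order_id": "42"})["Item"])
```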

I said functions, but that doesn't mean a single function in your program. Of course, you can deliver your code in such minimal blocks, but in AWS Lambda you can even deploy a small microservice and everything will work quite normally.

It is also important to understand that Serverless is mostly about unpredictable resource usage. On the one hand, if a function is rarely called, you save significantly compared to other infrastructure options. On the other hand, if your workload suddenly spikes, you won't have to worry about scaling: Lambda will do everything for you, but you will have to pay a lot.
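
A back-of-the-envelope sketch of that trade-off; the prices are assumptions roughly matching AWS's published Lambda rates (about $0.20 per million requests plus about $0.0000167 per GB-second), so check current pricing before relying on them:

```python
# Rough Lambda cost model. Both prices are assumptions; verify against
# the current AWS price list for your region.
PRICE_PER_MILLION_REQUESTS = 0.20   # USD, assumed
PRICE_PER_GB_SECOND = 0.0000166667  # USD, assumed

def monthly_lambda_cost(requests: int, avg_ms: int, memory_gb: float) -> float:
    request_cost = requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = requests * (avg_ms / 1000) * memory_gb * PRICE_PER_GB_SECOND
    return request_cost + compute_cost

# Rarely called: 100k requests/month at 200 ms and 0.5 GB costs pennies.
print(monthly_lambda_cost(100_000, 200, 0.5))      # ~ $0.19
# A sudden spike to 500M requests/month scales the bill along with it.
print(monthly_lambda_cost(500_000_000, 200, 0.5))  # ~ $933
```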

I'm sure you've already read those stories online where a person suddenly receives a bill for 100 thousand dollars: he has a lambda, someone flooded his site with traffic, and now he owes a lot of money. This happens, yes. If he had just one server, it would simply have crashed under the load. The lambda coped with the load, but now the bill has to be paid.

But again, if you know the load will be uniform and constant, it's better not to consider lambdas. Recently a lot of negative things were written about Lambda, about how expensive and slow it is, as if everyone's eyes had suddenly been opened. People use a technology for the wrong purpose, and then the technology gets the blame. Nothing new there. Even AWS itself used Lambda to detect broken video, but the thing is that their video streams ran constantly, so those lambdas churned endlessly.

It's clear that in such a situation the correct architectural solution is micro-monoliths that keep the data local, don't transport it anywhere or save it to S3, but quickly process everything in place; their costs for that infrastructure would have dropped by 90%. Nothing new: even AWS engineers make mistakes, so think before you use serverless architecture for your applications.

Serverless Components

In general, an entire architecture can be assembled from serverless components. Let's take a look at what those components are, specifically in the AWS cloud.

It's clear that any self-respecting cloud will have a similar set of serverless components one way or another. Note that the components are divided into different modules.

Compute services

The smallest compute module is AWS Lambda, an event-driven, pay-as-you-go model. A lambda responds not specifically to an HTTP request but to an abstract event, and this event can be anything.

You uploaded a file to an S3 bucket, an event fired, and the event triggered a Lambda that did something with the file, or, for example, sent you an email. By and large, the entire AWS infrastructure generates events, and you can hook into all of them: a server crashed, an event fired, a lambda did something. Since a lambda is the smallest atom of computation, it has limits: on the size of the deployed image and on how many events it can handle.
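
Here's a minimal sketch of such a handler: a Python lambda that reacts to an S3 upload and "sends you a letter" via SES. The email addresses are hypothetical.

```python
import boto3

ses = boto3.client("ses")

def handler(event, context):
    # One S3 event can carry several records.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        size = record["s3"]["object"]["size"]
        # "Do something with the file": here, just report it by email.
        ses.send_email(
            Source="noreply@example.com",                        # hypothetical
            Destination={"ToAddresses": ["admin@example.com"]},  # hypothetical
            Message={
                "Subject": {"Data": f"New upload: {key}"},
                "Body": {"Text": {"Data": f"{size} bytes landed in {bucket}."}},
            },
        )
```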

And if that is no longer enough, you can switch to AWS Fargate. It's a compute engine that works directly with Amazon's container service (ECS) and Kubernetes service (EKS). By and large, these are containers that spin up on demand.

How does this all work? If no one accesses your infrastructure for a long time, then by and large Amazon shuts it all down: not a single lambda is running, not a single Fargate container is running. And this can be a small problem, because these services have a so-called cold start.

A cold start happens when the service hasn't been accessed for a while and is now hit for the first time: Amazon has to spin up a container for the Lambda, or a container on Fargate, and that time, usually around 100 milliseconds, is added to the response. If a second request arrives immediately, the cold start has already been paid for and the response is much faster.

So, on Fargate you can run larger containers, and, as far as I know, they have no timeout. Lambda does have a timeout (15 minutes at most): if the function hasn't finished by then, the invocation is killed and fails with an error. Fargate is just a container, and, accordingly, it costs more.

Application integration

Next we have components for system integration. EventBridge is, by and large, an event router: it routes events from applications, including external ones outside AWS, to your event handlers.

AWS Step Functions is a service for building finite state machines. If your lambda function is like a function in your code, then Step Functions is the if-else in your code: just as you write branching logic directly in your programs, you can write branching logic across your AWS services, or any other services.

For example, an email arrives. You check: if it came from such-and-such an address, one lambda function should process it; otherwise, another one should. You can build very complex so-called workflows this way, and you can connect not just lambda functions but different services. Debugging all of this is actually very convenient: the component draws a graph of your workflow's execution and shows where it failed. I recommend reading about how it's implemented; it's very interesting.
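
As a minimal sketch of that email-routing example, here's a state machine definition in Amazon States Language, wrapped in boto3; the ARNs, names, and role are hypothetical:

```python
import json
import boto3

definition = {
    "StartAt": "RouteBySender",
    "States": {
        # The Choice state is the "if-else": branch on the sender field.
        "RouteBySender": {
            "Type": "Choice",
            "Choices": [{
                "Variable": "$.sender",
                "StringEquals": "partner@example.com",  # hypothetical address
                "Next": "PartnerHandler",
            }],
            "Default": "GenericHandler",
        },
        "PartnerHandler": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:partner",
            "End": True,
        },
        "GenericHandler": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:generic",
            "End": True,
        },
    },
}

boto3.client("stepfunctions").create_state_machine(
    name="email-router",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/stepfunctions-role",  # hypothetical
)
```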

Then, of course, we have a queue, Amazon SQS, and a notification service, Amazon SNS, for application-to-application or application-to-person messages: you send notifications that something has happened. API Gateway is basically the entry point for APIs into your system: you write a table of your routes, the URLs, and which services those URLs should call. And there's also AWS AppSync, which lets you create GraphQL APIs.
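
A minimal sketch of the two messaging primitives with boto3; the queue and topic names are hypothetical. SQS is pull-based (a consumer reads when it's ready), while SNS pushes a copy of each message to every subscriber.

```python
import boto3

# SQS: application-to-application; messages wait in the queue for a consumer.
sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="orders")["QueueUrl"]  # hypothetical
sqs.send_message(QueueUrl=queue_url, MessageBody='{"order_id": "42"}')

# SNS: application-to-application or application-to-person fan-out; every
# subscriber (email, SMS, lambda, queue) gets a copy.
sns = boto3.client("sns")
topic_arn = sns.create_topic(Name="order-events")["TopicArn"]  # hypothetical
sns.publish(TopicArn=topic_arn, Message="Order 42 has been paid")
```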

Data store

There is Amazon S3 object storage, and a fully elastic serverless file system, Amazon EFS. Again, you don't care where this disk lives. Do 100-terabyte disks even exist in nature? You simply mount the elastic file system and pay only for what you use.

Then we have NoSQL: DynamoDB, a key-value database, which also scales automatically; you pay literally pennies per row stored in this database, plus per read and write operation. There is a SQL database if you need one, and a bunch of other engines besides.

And from all this, as you can see, we get a kind of three-layer architecture: presentation can be built from S3, CloudFront, and API Gateway; logic runs on Lambda and Fargate; and the data lives on everything else.

Demonstration

It's time to sketch something on these lambdas of yours. Let's move on to the demo.

For the demonstration I'll be using a service called Brainboard. This is not an advertisement; I just found it on the Internet. It has a 25-day free trial, lets you build diagrams like this for all the popular clouds, and also generates Terraform code, that is, it can back everything you draw here with Terraform code.

Here they offer us one region. In principle, that's probably not so scary, but it all depends on your service; you may not be able to get by with a single region. Here we have our users, who somehow come knocking on our application. And our application, by and large, can be seen as a three-layer architecture: some kind of UI, some kind of logic, some kind of data. In principle, everything I said in my Lego video can be seen here.

Let's see. Here we clearly need to resolve our DNS names, and for that we have the Route 53 service. Next is a CloudFront distribution, a CDN or content delivery network, which by and large is just a set of cache servers spread much more widely, all over the world. They bring your static content closer to the user, which reduces the site's load and response times. This is a very popular solution: you upload your site's static assets to the CDN, and the CDN picks them all up from the S3 bucket.

S3 is so-called object storage. What does that mean? There is object storage and there is block storage. Block storage you can picture as the hard drive sitting right in your computer: it lets you address blocks on the disk directly. In object storage, those blocks on disk are hidden from you behind the abstraction of, you might say, a file. That is, the smallest addressable unit in object storage is a file, while the smallest addressable unit in block storage is a block on disk.

Therefore, if you want to run a classic database such as Postgres, it will need block storage.

Well, in general, all your site's static content is dumped there, into S3. For example, you may have a website generated with Next.js: during the build it generates all the pages, you put those pages in the S3 bucket, CloudFront picks them up from there and serves them, and people can see your site. They reach it, of course, via Route 53.
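
A minimal sketch of that publishing step with boto3; the bucket name and build directory are hypothetical, and the bucket is assumed to already exist behind CloudFront:

```python
import mimetypes
from pathlib import Path

import boto3

s3 = boto3.client("s3")
build_dir = Path("out")  # hypothetical static-export directory

# Walk the build output and upload every file with a sensible Content-Type.
for path in build_dir.rglob("*"):
    if path.is_file():
        key = str(path.relative_to(build_dir))
        content_type, _ = mimetypes.guess_type(path.name)
        s3.upload_file(
            str(path), "my-site-static", key,  # hypothetical bucket name
            ExtraArgs={"ContentType": content_type or "application/octet-stream"},
        )
```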

Next, we may have a personal account for our customer; say this is a store built on Next.js. The Cognito service is used for that: some content is available without a login, some only after logging in. Accordingly, we have some kind of API that lets us, for example, add products to the cart or buy something. What is API Gateway? By and large, API Gateway is a service that routes, or redirects, your requests.

Roughly speaking, in API Gateway you declare which request URLs are handled by what. For example, you write that requests to the /shopping-cart API endpoint should be handled by this lambda function, as in the sketch below.
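
A minimal sketch of that wiring with boto3 and an API Gateway HTTP API; the API id, lambda ARN, and route are hypothetical:

```python
import boto3

apigw = boto3.client("apigatewayv2")

# Point an integration at the lambda that implements the cart.
integration = apigw.create_integration(
    ApiId="a1b2c3d4",  # hypothetical API id
    IntegrationType="AWS_PROXY",
    IntegrationUri="arn:aws:lambda:eu-west-1:123456789012:function:cart",
    PayloadFormatVersion="2.0",
)

# Map the URL to that integration; swapping backends later means
# re-pointing this route, with no change visible to the client.
apigw.create_route(
    ApiId="a1b2c3d4",
    RouteKey="POST /shopping-cart",
    Target=f"integrations/{integration['IntegrationId']}",
)
```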

It's clear that it scales, that it sustains an absolutely amazing number of requests, and that it lets you switch backends transparently for the user. You may, for example, later realize that the lambda function can't cope, or decide to change the architecture: you change nothing on the client side, you just reconfigure the traffic so that it goes not to the lambda function but to a Kubernetes cluster or somewhere else. And of course, large files, such as documentation, or, if you sell hardware, firmware binaries, can be served to people directly from the S3 bucket.

And here, as you can see, the S3 bucket has a handler in a lambda function; in the same way, a user can upload something to S3 through a lambda function. If you sell T-shirts and say "send us your photos," you can wire it up so that your user uploads to S3 through a lambda function. Or that lambda function can process anything based on events that happen in S3.

As I said, lambda functions are, in general, event handlers. Here, in this particular case, the event is an HTTP request; in a Cognito user pool, for example, an event is a login attempt, and in SES an event can be a sent email. The email goes out, and we dump it all into a CloudWatch dashboard.

You need to understand that in place of each of these lambda blocks there could simply be some service, or some pod in Kubernetes; a lambda is just a unit of logic and should be understood that way. A lambda merely pushes you toward making your unit of logic as small as possible. With a monolithic approach you have all the logic in one monolith; with a microservice approach you carve out small microservices. For example, you'd have a shopping-cart microservice that both renders the cart and processes it.

Ideologically, lambda functions divide microservices even further: you'd have one function that renders the list of items in the cart, and another that works with the cart, for example, placing an order. But you absolutely don't have to follow this rule; you are free to split your code as you see fit. It's just that lambda functions let you scale and fine-tune your architecture separately even at this level, as in the sketch below.
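
A minimal sketch of that split, with two handlers sharing one hypothetical DynamoDB table; in AWS each would be deployed as its own Lambda with its own memory, timeout, and scaling settings:

```python
import json

import boto3

table = boto3.resource("dynamodb").Table("carts")  # hypothetical table

def render_cart(event, context):
    """Read-only function: fetch and render the cart contents."""
    user_id = event["pathParameters"]["user_id"]
    cart = table.get_item(Key={"user_id": user_id}).get("Item", {"items": []})
    return {"statusCode": 200, "body": json.dumps(cart["items"])}

def place_order(event, context):
    """Write function: turn the cart into an order. Deployed and scaled
    independently of render_cart."""
    user_id = json.loads(event["body"])["user_id"]
    cart = table.get_item(Key={"user_id": user_id}).get("Item")
    if not cart or not cart.get("items"):
        return {"statusCode": 400, "body": "cart is empty"}
    # ... create the order, charge the customer, clear the cart ...
    return {"statusCode": 201, "body": json.dumps({"ordered": cart["items"]})}
```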

Looking ahead: you can, in a sense, use lambda functions without a cloud provider at all; on your own Kubernetes cluster you can run something similar to lambda functions. And under the hood, as far as I know, lambda functions run on Firecracker, a virtual machine.

Here we see that for each of our blocks the corresponding Terraform code is highlighted; each block is described by some code, and this code lets us bring up the entire infrastructure. That is, we describe the infrastructure not as something clicked together and manually configured in the cloud, but literally as code: we have a resource, this resource is an AWS Lambda function, this is its identifier, these are its tags.

What's the beauty of this approach? Along with your application code, you also store the infrastructure. If someone wants to copy all of this and bring it up in another data center, they can simply click Plan and Deploy, and it will all fly out to your production, or to that person's production. And there's another interesting point here.

If you change something in Terraform, it doesn't mean that this particular resource will be destroyed and created anew; it is modified in place. Terraform can compute the difference between the deployed infrastructure and what you want to change, build a patch, and apply only that patch. So you don't have to assume that every deployment destroys and recreates your entire infrastructure. No, Terraform is a very smart thing and can make such patches.

Anyway, let's look further. Here I don't see, for example, either a queue or a notification service, that is, an SNS service. In fact, you could add them here, but that's not the point.

So here we have a service that sends mail, and here we have a DynamoDB Global Table: a serverless database in which, moreover, all the data is synchronized between regions, as far as I know. And that, by and large, is all. Well, the last thing to look at here is CloudWatch.

CloudWatch is the monitoring system for the entire store, though I think here it's shown as business monitoring rather than monitoring of the installation itself. You will, of course, also have to monitor the installation itself. This is a very simple example, but usually things don't end so simply, so you won't get away without monitoring.

Let's take a closer look. This is already an example of a multi-cloud approach. Our clients are here; by and large, they come to some HTTP proxy, and the proxy, you see, scatters them straight across the clouds.

In the AWS cloud: the same Route 53 service, the same API Gateway, the same lambda functions; as I said, here is the cart, orders, inventory. Right there we have some queue and some notification service. And if we look at the other half of the picture, it turns out we have Azure, or whichever cloud it is, and everything is the same there.

We have API Management here, some cloud functions here, some Event Hubs, Notification Hubs, and, in fact, Storage. Storage, I think, is how they emulated the S3 bucket here.

That is, everything is the same, with some exceptions: clearly they don't replicate every feature one-to-one, they steal from each other, but any self-respecting cloud provider has all the basic pieces, so to speak.

So I think you'll notice that the components are still the same, just named a little differently, and the main difference here is the granularity of the approach and who actually manages it all.

If before you managed your servers yourself, with lambda functions and serverless architecture the platform manages them for you, and the server is abstracted away. You don't know how many servers there are, how much RAM they have, how they scale, or how they detect that the load has increased. All of that work is abstracted from you. You just write code and divide it into logical blocks.

This is the advantage of Serverless. The server is there, of course; you just don't worry about it. For you, there is no server.
