What a system analyst needs to know about gRPC

gRPC is an RPC framework from Google. gRPC and REST are two ways of building an API: a mechanism that lets two software components communicate using a set of definitions and protocols. The client sends requests to the server, and the server returns responses. The main difference between gRPC and REST:

  • In gRPC, one component (the client) calls specific functions in another component (the server). How the client and server are implemented is not particularly important, because the gRPC protocol is cross-platform.

  • In REST, instead of calling functions, the client queries or updates data on the server.

The interaction is independent of the language chosen to implement each component: it makes no difference what languages the client and server are written in, the interaction will be the same

There are four call types in RPC in general and gRPC in particular.

Unary – Unary RPC, 1-1. A synchronous client request that blocks until a response is received from the server. The client can do nothing until a response is received or the request times out.

Unary RPC


Client streaming – Client streaming RPC, N-1. After connecting to the server, the client starts streaming messages to it. The client sends the server a request in the form of a sequence of N messages and receives a single response message from the server.

Client stream


Server streaming – Server streaming RPC, 1-N. When the client connects, the server opens a stream and starts sending messages. The client sends a single request message and receives a response in the form of a sequence of N messages from the server.

Server stream


Bidirectional streaming, N-N. The client initiates the connection and two streams are created. In the general case, the client sends the server a sequence of N messages and receives a sequence of N messages in response. The server can send initial data upon connection or respond to each client message in a ping-pong manner. The two streams operate independently, so the client and server can read and write in any order: for example, the server may wait for all client messages before writing its responses, or it may read a message and immediately write a response, or use some other combination of reading and writing. The order of messages within each stream is preserved.

Bidirectional Stream


Let’s see how to tell these call types apart in a proto file:

service Greeter{
	rpc SayHello (HelloRequest) returns (HelloReply) {} // Unary
	rpc GladToSeeMe(HelloRequest) returns (stream HelloReply){} // Server streaming
	rpc GladToSeeYou(stream HelloRequest) returns (HelloReply){} // Client streaming
	rpc BothGladToSee(stream HelloRequest) returns (stream HelloReply){} // Bidirectional streaming
  }
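For illustration, here is a minimal Python client sketch (using the grpcio package) showing what each of the four call types looks like from the client side. It is only a sketch: it assumes the contract above was compiled into modules named greeter_pb2 and greeter_pb2_grpc, that HelloRequest has a string field name, and that a server is listening on localhost:50051.

import grpc

import greeter_pb2
import greeter_pb2_grpc


def requests():
    # A plain generator is enough to feed a client or bidirectional stream.
    for name in ("Alice", "Bob"):
        yield greeter_pb2.HelloRequest(name=name)


with grpc.insecure_channel("localhost:50051") as channel:
    stub = greeter_pb2_grpc.GreeterStub(channel)

    # Unary (1-1): blocks until a single reply arrives or the deadline expires.
    print(stub.SayHello(greeter_pb2.HelloRequest(name="Alice"), timeout=5))

    # Server streaming (1-N): one request in, an iterator of replies out.
    for reply in stub.GladToSeeMe(greeter_pb2.HelloRequest(name="Alice")):
        print(reply)

    # Client streaming (N-1): an iterator of requests in, a single reply out.
    print(stub.GladToSeeYou(requests()))

    # Bidirectional streaming (N-N): an iterator in, an iterator out.
    for reply in stub.BothGladToSee(requests()):
        print(reply)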

On the web, unary calls are most often used for communication between the front end and the back end. Streaming support appeared on the front end relatively recently, which made the other call types usable there as well. In mobile apps, Kotlin is supported out of the box and Swift support is in development. For backend-to-backend communication there are no restrictions.

A contract, in my subjective opinion, looks elegant and friendly. Reading a proto file is much simpler and more pleasant than reading a REST Swagger spec.

A contract is a set of methods combined into services. A method description consists of a name, a request message, and a response message. In the request and response, you can either specify standard data types or create your own object with the necessary content. In the second case, you will need to come up with a name for it and describe it with the keyword message.

Composition of service, method and message


The rules for building a request:

  • The method must take something as input and return something as output – here, HelloRequest and HelloReply. If there is no data to receive or send, it can be replaced with the empty value google.protobuf.Empty. Then no payload travels in the request or the response, only a status code: success if everything went well, or an error code if there were problems. This reduces system load and improves security, because only what is essential is transmitted.

  • The method must indicate the data types it operates on. In the example above, these are string for name and message, as well as HelloRequest and HelloReply for the request and response themselves. If a data type is not known in advance, you can use google.protobuf.Any, which can stand in for any message type.

  • Each field in a message must have a unique field number. If a field was used before and then deleted, its number cannot be reused. Such numbers can be protected with the reserved keyword, or at least flagged in comments.

The following keywords are used to describe the contract:

  • ‘syntax’ is the current syntax version. Now, as a rule, new services are written in proto3.

  • ‘import’ – for importing standard packages. For example, “google/protobuf/timestamp.proto” will load the timestamp data type.

  • ‘service’ – for declaring a service. A service groups gRPC methods.

  • ‘rpc’ – for declaring a method: its name and request message.

  • ‘returns’ – for declaring a method’s response message.

  • ‘message’ – to declare an object.

  • ‘enum’ – to declare an enumeration.

  • ‘repeated’ – to declare a repeating field.

  • ‘reserved’ – to reserve a field.

  • ‘optional’ or ‘required’ – to declare an optional or required field. In proto3 this functionality has been removed.

  • ‘oneof’ – to declare a compound field that can hold one of several values. This is a rather heavy structure that is difficult to process and can consume a lot of resources, so it is often better to replace it with something else, for example a string field that actually carries JSON.

  • A whole set of standard data types like bool, string, int64 and others.

We figured out how to read proto-contracts, now let’s look at an example of a service:

syntax = "proto3";
import "google/protobuf/any.proto";
import "google/protobuf/empty.proto";
import "google/protobuf/timestamp.proto";

 
service ProductService{
	// method for adding a book to the catalog
	rpc AddProduct(AddProductRequest) returns (google.protobuf.Empty) {}
	// method for getting a book by ID
	rpc GetProductById(GetProductByIdRequest) returns (GetProductByIdResponse) {}
	// method for getting all books
	rpc GetProductsList(GetProductsListRequest) returns (GetProductsListResponse) {}
}

message AddProductRequest{
	BookInfo add_book_info = 1;

}

message BookInfo{
	// these field numbers can no longer be used
	reserved 6, 15, 9 to 11;
	// the author data is not strictly defined, so any type may arrive (a string or an array, for example)
	google.protobuf.Any author = 1;
	string name = 2;
	int32 price = 3;
	Type type = 4;
	bool in_store = 5;
	// files can be passed in a bytes field, but it is better to replace this with an s3 link
	bytes book_cover = 7;
	// only one of the listed fields will be used here; to the user this looks like, for example, a dynamic input form
	oneof additional_fields{
        // used together with TYPE_UNDEFINED: it comes in handy when new values are added to enum Type, since created BookInfo objects will get this type by default
    	AdditionalFieldsUndefined additional_fields_undefined = 8;
    	AdditionalFieldsDetective additional_fields_detective = 12;
    	}
}

enum Type{
	TYPE_UNDEFINED = 0;
	TYPE_DETECTIVE = 1;
}

message AdditionalFieldsUndefined{
	string description = 1;
}

message AdditionalFieldsDetective{
	string description = 1;
	string period = 2;
}

message GetProductByIdRequest{
	// protobuf does not yet know what a uuid is, so it can be passed as string, bytes, or a custom type
	string id = 1;
}

message GetProductByIdResponse{
	string id = 1;
	BookInfo get_book_info = 2;
	google.protobuf.Timestamp created_at = 3;
}

// limit and offset are needed for pagination on the back end. For an infinite feed, google.protobuf.Empty can be passed instead of the GetProductsListRequest object
message GetProductsListRequest{
	int32 limit = 1;
	int32 offset = 2;
}

message GetProductsListResponse{
	repeated BookInfo get_book_list_info = 1;
}

In this book-catalog example, I tried to collect the most common (good and not so good) contract patterns. The code declares methods for adding a book to the catalog, getting a specific book by ID, and getting the full list of books. The syntax version is proto3; the imported data types are any, empty, and timestamp; and the ProductService service consists of three methods: AddProduct, GetProductById, and GetProductsList.
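To make the contract a bit more tangible, here is a small Python sketch of how the generated classes could be used. It is only an illustration and assumes the proto above was compiled with grpcio-tools into a module named product_pb2.

import product_pb2

# Messages become classes whose constructors take the contract fields as keyword arguments.
book = product_pb2.BookInfo(
    name="The Hound of the Baskervilles",
    price=500,
    type=product_pb2.TYPE_DETECTIVE,
    in_store=True,
)
request = product_pb2.AddProductRequest(add_book_info=book)

# SerializeToString() produces the compact binary protobuf that actually travels over the wire.
payload = request.SerializeToString()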

An important feature of gRPC is strong typing. The field numbers and types declared in the contract become part of the binary encoding: each field is written to the wire as a tag combining its number and wire type, followed by its value. So the data of a bool field with number 1 goes at the start of the message, the data of a string field with number 2 follows right after it, and each takes exactly as many bytes as its type and value require. Because the layout is derived from the numbers and types in the contract, changing a field’s type or number breaks compatibility: old and new code will read the same bytes differently, which is also why deleted field numbers must stay reserved rather than being reused. And in order not to inflate the resulting messages to enormous sizes, it is important to understand from the start exactly what data needs to be transferred in the object.

An example of data substituted into a contract and the resulting base64 string:

Contract:

syntax = "proto3";

message exampleMessage{
  string exampleString = 1;
  bytes exampleBytes = 2;
  uint32 exampleInt = 3;
  repeated uint32 repeatedInt = 4;
}

Substituted data:

{
  "exampleString": "test",
  "exampleBytes": [255, 15],
  "exampleInt": 2,
  "repeatedInt": [2, 4]
}

Encoded data:

Hex: 0a04746573741202ff0f180222020204
Base64: CgR0ZXN0EgL/DxgCIgICBA==
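To read the encoded bytes: every protobuf field starts with a tag byte equal to (field_number << 3) | wire_type. Here 0a 04 74 65 73 74 is field 1, length-delimited, four bytes, the string "test"; 12 02 ff 0f is field 2 with two raw bytes; 18 02 is field 3 as a varint with the value 2; and 22 02 02 04 is field 4 as a packed repeated field holding the varints 2 and 4. A quick way to check the tag bytes in Python:

# Tag bytes are (field_number << 3) | wire_type: wire type 2 is length-delimited, 0 is varint.
for field_number, wire_type in [(1, 2), (2, 2), (3, 0), (4, 2)]:
    print(hex((field_number << 3) | wire_type))  # prints 0xa, 0x12, 0x18, 0x22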

Characteristics of REST API and SOAP API

It’s not interesting to talk about a spherical horse in a vacuum, so I propose to compare the characteristics of the REST API, SOAP API and gRPC API.

SOAP API Characteristics

Basis: HTTP/1.1.

Approach: service-oriented design. The client requests a service or function from the server that may affect the server’s resources.

Contract: WSDL schemas are required.

Transmitted data format:

Request: XML.

Response: XML.

A client and server using SOAP always exchange standardized data


How data is transferred: only HTTP POST requests are used.

Number of endpoints (application entry points): 1.

Works on the web: without restrictions.

Documentation: WSDL schemas are difficult to write and maintain.

For which architecture: complex architecture that goes beyond CRUD. SOAP is used by many banks.

Weight: XML weighs more than JSON or binary protobuf, and is mainly used in legacy systems developed in the late 1990s and early 2000s.

Advantages:

  • Does not depend on language.

  • Built-in error handling.

  • Built-in security protocol.

  • Self-documenting.

Flaws:

  • Heavy XML.

  • A complex set of rules to describe a contract.

  • Slow message updates, a consequence of how the schema works.

  • A constant need to encode data on the server before sending it over communication channels and to decode it again on the client. The physical layer of exchange protocols only understands sequences of binary data, which increases transmission time, complicates framing of the information, and raises the risk of losing individual packets.

Why use it: fintech and other large, long-lived projects with complex architecture, legacy systems from the 1990s and 2000s, or a historical commitment to SOAP that cannot be abandoned. For a new project it may be worth looking at alternatives.

Characteristics of REST API

Basis: HTTP/1.1.

Server connection: 1-1.

Approach: object-oriented design. The client requests the server to create, share, or modify resources.

Contract: not necessarily OpenAPI, endpoints may not be documented.

Transmitted data format:

Request: mostly JSON.

Response: JSON with all the data the server has for this endpoint.

Client and server using REST do not standardize data


How data is transferred: HTTP is used as the transport protocol, with a one-off connection between two points: open a connection, send, close. The client sends a message to the API and immediately receives a response or waits until one is generated. The client and server do not need to know anything about each other’s internal data. Four main HTTP methods are used: GET, POST, PUT, DELETE.

Number of endpoints: any number, from one to many, without restrictions.

Works on the web: without restrictions.

Documentation: with JSON you have to document the fields and their types by hand, and this information is often inaccurate, incomplete, or out of date.

For which architecture: mostly CRUD. The most popular API architecture for web services and microservice architectures.

Weight: JSON is smaller than XML, but larger than protobuf.

Advantages:

  • The client is separated from the server.

  • No long-term stateful connection → resource saving.

  • Scalability.

  • Easy to use and understand, large community.

  • There is a standard list of error codes, but everyone uses them differently.

  • Can be implemented in a variety of formats without standard software.

  • Caching at the HTTP level without additional modules.

Flaws:

  • Excessive load on the network.

  • Over-fetching or under-fetching of data.

  • No mandatory documentation or standardization.

  • There is no enforced convention for using response codes, so errors often come back under success codes.

  • There is a constant need to encode data on the server before transmitting it over communication channels and then decode it on the client. The physical layer of communication protocols only understands sequences of binary data. This leads to increased transmission time, difficulty in framing information, and an increased likelihood of losing individual data packets.

Why use it: thanks to its simple implementation, transparent data structures, and ease of reading, even novice programmers find it easy to work with.

Examples of using the REST API:

  • Web architectures.

  • Public APIs for easy understanding by external users.

  • Easy data exchange.

gRPC API Features

RPC and REST are two different design approaches. REST was launched as an alternative to RPC to solve RPC’s main problems: integration complexity caused by dependence on the implementation language and the risk of exposing the system’s internals. REST, however, was no longer as lightweight as RPC and carried a large amount of metadata in its messages. Most likely this is what led to the rebirth of RPC in the form of GraphQL from Facebook and gRPC from Google.

Google developed its framework for the internal needs of working with microservices, but ultimately opened its source code for widespread use. Now gRPC is still a fairly new protocol and not everyone has heard of it. But it is already used by companies with highly loaded systems, such as Google, IBM, Netflix, Twitter and others. Below are its characteristics.

Basis: HTTP/2 (works in both directions and is therefore faster than HTTP/1.1).

Server connection: 1-1, 1-N, N-N.

Approach: service-oriented design. The client requests a service or function from the server that may affect the server’s resources.

Contract: must be written according to the Protocol Buffers standard, compiled by the internal protoc compiler, which generates the necessary source code for classes from the definitions in the proto file.

Transmitted data format:

Request: binary protobuf.

Response: binary protobuf.

Client and server using gRPC always exchange standardized data


How data is transferred: a persistent connection (a socket) is created between two points, over which binary messages are transmitted and functions are called remotely with their parameters. Messages travel both ways: gRPC provides bidirectional streaming, so the client and server can simultaneously send and receive multiple requests and responses within the same connection, which REST cannot do. Both the client and the server need the same Protocol Buffers file, which defines the data format. Only HTTP POST requests are used.
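To illustrate the streaming side, here is a minimal Python server sketch (grpcio) for the BothGladToSee bidirectional stream from the Greeter contract shown earlier. The module names and the HelloRequest/HelloReply field names (name, message) are assumptions.

from concurrent import futures

import grpc

import greeter_pb2
import greeter_pb2_grpc


class Greeter(greeter_pb2_grpc.GreeterServicer):
    def BothGladToSee(self, request_iterator, context):
        # Ping-pong style: reply to each client message as soon as it arrives,
        # over the same long-lived HTTP/2 connection.
        for request in request_iterator:
            yield greeter_pb2.HelloReply(message=f"Hello, {request.name}")


server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
greeter_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
server.add_insecure_port("[::]:50051")
server.start()
server.wait_for_termination()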

Number of endpoints: 1.

Works on the web, but with extra effort:

  • gRPC runs over HTTP/2 and transfers binary data, while JavaScript in the browser cannot control HTTP/2 framing directly and works most naturally with text. That is why gRPC-Web exists: it packs the binary message as base64 into the body of a text request, and a separate JS library then turns it back into regular objects. gRPC-Web is a separate protocol from gRPC; it exists only in the browser and acts as a translation layer between gRPC and the browser application.

  • Front codegen does not know where a field is required and where it is not. All non-basic types are generated as optional.

  • Streaming on the front end became available not so long ago. Previously, a separate REST endpoint was written to send a file.

Documentation: a clearly defined, self-documenting schema. Code is generated from the protobuf contract, so it cannot drift out of sync with the documentation. Code generation also includes a basic check that fields of the wrong type are not accepted.

For which architecture: mostly CRUD.

Weight: less than JSON.

Advantages:

  • High performance and low network load.

  • Keeps the connection, no need to waste time connecting.

  • You can set a client timeout and thereby save resources.

  • Strict specification of data types: every field has a fixed type and number that determine how it is encoded.

  • Standardized status codes built into gRPC (see the sketch after this list).

  • Source code is generated from the proto contract.

  • Self-documenting.

  • It does not depend on the language: the contract is the same everywhere.

  • Can be used to manage k8s containers and storage systems.
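To show what the standardized status codes look like in practice, here is a hedged Python sketch (grpcio): a failed call raises grpc.RpcError carrying one of gRPC’s standard status codes instead of an HTTP error body. The module and service names follow the ProductService example above and are assumptions, as is the idea that the server answers a missing book with NOT_FOUND.

import grpc

import product_pb2
import product_pb2_grpc

with grpc.insecure_channel("localhost:50051") as channel:
    stub = product_pb2_grpc.ProductServiceStub(channel)
    try:
        reply = stub.GetProductById(
            product_pb2.GetProductByIdRequest(id="unknown-id"), timeout=2
        )
    except grpc.RpcError as err:
        # err.code() is one of the standard grpc.StatusCode values.
        if err.code() == grpc.StatusCode.NOT_FOUND:
            print("no book with this ID")
        elif err.code() == grpc.StatusCode.DEADLINE_EXCEEDED:
            print("the 2-second client deadline expired")
        else:
            print("unexpected error:", err.code(), err.details())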

Flaws:

  • Doesn’t work without gRPC-WEB in the browser.

  • A human cannot read a message without a decoder.

  • To work, you need gRPC software on both the client and server sides.

Why use it: it is designed to let developers build high-performance APIs for microservice architectures in distributed data centers, including polyglot microservice architectures where the API is unlikely to change over time. It is also well suited to internal systems that require real-time data streaming and large-volume transfers.

The gRPC API is best used when:

  • High-performance systems are being created: for example, highly loaded systems that need high throughput with modest demands on the network and on server and client hardware, such as Internet of Things platforms; or distributed computing and testing, where resource-intensive tasks are spread across several servers or tests are run on different platforms.

  • Big data is being loaded.

  • Real-time or streaming applications are being developed.

  • You need remote administration – managing configuration files from a single node.

  • Tunneling is required – going beyond the boundaries of the routed network.

In gRPC you cannot:

  1. Reuse field numbers. This is a feature of the codegen: everything breaks.

  2. Change a field’s type. Instead, delete the old field and add a new one with a new number; otherwise, because of how the codegen works, everything breaks.

  3. Forget to reserve deleted fields. Someone may reuse them, and then? That’s right, everything will break. This applies to both field names and field numbers.

  4. Add required fields. Removing the requirement later is difficult. If a field has to be mandatory, enforce that in the code and note it in a comment in the contract.

  5. Add many fields to an object. It will weigh a lot and in some cases will not even compile. In Java, for example, there is a hard limit on the size of a method.

  6. Forget about UNDEFINED with number 0 in an enumeration. Firstly, for proto2 and proto3 compatibility this is the only option where nothing breaks when a new value is added to the enum. Secondly, by explicitly specifying the value 0 we protect ourselves from logical errors. In the book-site example, if there were no UNDEFINED value, all books would automatically get the “detective” type; with it, the user can be shown a required single-choice drop-down field and has to fill it in explicitly.

  7. Reinvent the wheel. Almost all data types are either built in (string) or available via a standard import (timestamp). Check the documentation before creating a custom type and save yourself the effort.

  8. Use constants and language keywords in enum. Everything will break.

  9. Change default values. That breaks backward compatibility, which is why custom defaults were removed from proto3 altogether.

  10. Remove repeated from a field where it was present. Depending on the version, either the whole message or that particular field is lost. Either way it is bad.

My colleagues, developers and analysts, and I have identified a number of advantages and disadvantages in using gRPC. If you know any more, add them in the comments.

Advantages:

  1. Backend-to-backend communication is fast and stable.

  2. Proto makes life much easier for developers of the services that depend on the contract. If the response changes, they find out immediately, not when the documentation gets updated (if it ever does). Writing Swagger for everyone is something people are always too lazy to do.

  3. Fast: the binary protobuf payload weighs less than JSON. A good fit for microservice architectures.

  4. Safe.

  5. The contract is human-readable.

  6. Strong typing.

Flaws:

  1. It does not work fully with the front end, so a misconfigured setup can behave incorrectly. And gRPC-Web is definitely required.

  2. Front codegen does not know where a field is required and where it is not. All non-basic types are generated as optional.

  3. It follows from the first two points that gRPC is not particularly at home on the web. But people use it there anyway.

  4. Codegen does not let you change a field’s type: you can only delete the field and add a new one with a new number (a typing feature people tend to forget).

In closing

The choice of technology for a project is a complex issue and should be decided jointly by architects and developers. A systems analyst is more of a theorist than a practical technical specialist, but he must have a good understanding of technology and have an advisory voice. As a rule, we do not write code and do not know the intricacies of using this or that technology, and the final implementation does not rest on our shoulders. But we can share our experience of other projects. And if today’s introduction to gRPC inspired you to explore it further and present it to your team, that’s great.

For the most curious: what does the g in gRPC stand for? Its meaning changes with every gRPC release; in version 1.61, current at the time of writing, it stands for “grand”. The values used in previous versions can be found on GitHub.
