Toptal, GraphQL and linters

This is a write-up of the Ruby Russia 2022 talk in which Anvar Tuikin and Mikhail Pospelov describe how Toptal taught its developers to write well-formed code. Below is a detailed account of why guidelines do not always work, what it takes to make them work, and whether that can be automated.

Toptal is a huge Ruby monolith with hundreds of developers and millions of lines of code. We use GraphQL extensively, and at this scale that means more than 20 schemas. To avoid repeating the same typical mistakes and to keep the code uniform, we developed internal rules for writing GraphQL. But rules do not work on their own, so we want to talk about our RuboCop cops, RSpec matchers, and generators: which parts of GraphQL we test, why it is important, and which of our best practices you can borrow for your projects.

Our freelance service has been operating since 2011. The company employs more than 450 programmers, 150 of whom are Ruby developers. The codebase is a large Ruby monolith, which we are gradually splitting into microservices. Five years ago, we decided to adopt GraphQL.

Wild Wild GraphQL

GraphQL itself is great, but there are nuances.

Each team works in its own way, so we have ended up with more than 20 different schemas. As a result, expertise is lost even when a developer moves from one team to another, and it is even harder for an outside programmer to get up to speed.

To solve this problem, we decided to compile a collection of best practices so that our engineers know how to write GraphQL. The idea was to simplify development, reduce the number of errors, and answer three questions:

  • How to write (i.e. implementation rules).

  • How to test (matchers, test structures, etc.).

  • How to interact (API design).

However, it turned out that guidelines alone do not work, because:

  1. no one reads the documentation;

  2. it is difficult to absorb and remember everything in one reading;

  3. the guidelines are nominally mandatory but not enforced, i.e. you can still write code that does not comply with them.

First, we need to understand where these standards come from and what our schemas look like. Our schemas are, in essence, sets of abstractions: the GraphQL abstractions familiar to developers, with our own flavor added.

A GraphQL schema describes a set of types. Types are needed to read and modify data. We read data through the root query type and change data through the root mutation type. These two types are the entry points to the schema.

Typical schema

In this example, we have two fields: node and nodes. As we can see, Talent implements this interface, and through the node and nodes fields we can get a Talent. Talent also has fields of its own (nullable or non-nullable, depending on the domain we are working in).

Reliability at runtime

In our schemas, we use an abstraction called Entity to implement a type. An Entity is a Ruby class that delegates calls to the object passed to it, adding GraphQL specifics. Since we do not use static typing, an object of a different class may be passed in, which can cause the implementation to work incorrectly.

The object passed to the Entity

To compensate for the lack of static typing and to introduce guarantees about the object passed in, we added a special DSL method, object_type. An Entity is a class that wraps structures, in particular ActiveRecord models, so that the GraphQL layer stays separate from the models' implementation, delegation, and business logic. With object_type we check that the object wrapped by the Entity is an instance of the Talent ActiveRecord class. If it is not, we return a GraphQL error.
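
The check described above could look roughly like this. It is a minimal plain-Ruby sketch, not Toptal's implementation: BaseEntity, TalentEntity, EntityTypeError, and the Struct-based Talent and Company are hypothetical stand-ins, and only the object_type name comes from the talk.

```ruby
# Hypothetical stand-ins: in a real application these would be ActiveRecord
# models, and the error would be reported as a GraphQL error.
Talent = Struct.new(:id, :full_name)
Company = Struct.new(:id, :name)

class EntityTypeError < StandardError; end

class BaseEntity
  class << self
    # DSL: declares which class the wrapped object must be an instance of.
    def object_type(klass = nil)
      @object_type = klass if klass
      @object_type
    end
  end

  def initialize(object)
    expected = self.class.object_type
    if expected && !object.is_a?(expected)
      raise EntityTypeError,
            "#{self.class.name} expects #{expected.name}, got #{object.class.name}"
    end
    @object = object
  end

  private

  # Everything the entity does not define itself is delegated to the object.
  def method_missing(name, *args, &block)
    return @object.public_send(name, *args, &block) if @object.respond_to?(name)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @object.respond_to?(name, include_private) || super
  end
end

class TalentEntity < BaseEntity
  object_type Talent

  # GraphQL-specific behavior lives in the entity, not in the model.
  def display_name
    full_name.upcase
  end
end

entity = TalentEntity.new(Talent.new(1, "Ada Lovelace"))
puts entity.display_name

begin
  TalentEntity.new(Company.new(2, "Acme"))
rescue EntityTypeError => e
  puts e.message
end
```

The type check runs in the constructor, so a wrong object is rejected before any field is resolved.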

Correct ID

The next runtime check concerns IDs. An ID is a Base64-encoded combination of the type name and a number (the object's ID in the database). The client can generate such an ID on its own side and pass something other than what we expect. We need the client to pass an ID of the type declared in the schema, for example Talent, so we check that the ID is of type Talent.
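
As an illustration, here is a minimal sketch of such an ID check. The exact ID layout is an assumption (Base64 of "TypeName:database_id"), and the GlobalId module with its encode and decode! methods is hypothetical, not the API used at Toptal.

```ruby
# Assumed ID format for illustration: Base64("TypeName:database_id").
module GlobalId
  class InvalidIdError < StandardError; end

  def self.encode(type_name, database_id)
    ["#{type_name}:#{database_id}"].pack("m0") # "m0" is strict Base64 without newlines
  end

  # Decodes the ID and verifies that it refers to the expected type.
  def self.decode!(encoded_id, expected_type:)
    type_name, database_id = encoded_id.unpack1("m0").split(":", 2)
    unless type_name == expected_type && database_id.to_s.match?(/\A\d+\z/)
      raise InvalidIdError, "expected an ID of type #{expected_type}"
    end
    Integer(database_id, 10)
  rescue ArgumentError
    # Strict Base64 decoding raises ArgumentError on malformed input.
    raise InvalidIdError, "malformed ID"
  end
end

talent_id = GlobalId.encode("Talent", 42)
puts GlobalId.decode!(talent_id, expected_type: "Talent")

begin
  GlobalId.decode!(GlobalId.encode("Company", 42), expected_type: "Talent")
rescue GlobalId::InvalidIdError => e
  puts e.message
end
```

A client-crafted ID that decodes to another type, or does not decode at all, is rejected before it reaches a database lookup.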

Availability of AR preload context

The next guarantee that we check at runtime is that model instances have a lazy preload context (introduced by the gem we use to deal with N+1). If the context is missing, there is a risk of an N+1 problem.

N+1: AR lazy preload

AR lazy preload is a magical gem that solves N+1 the way includes and preload do in Rails, but without explicitly specifying what exactly to load.

How does the gem do it? There is an entry point where we instantiate a list of objects, for example talents, and assign all of them a reference to the same context. When we load an association, we check whether a context exists and which records are associated with it. So instead of loading the association for one talent, we load it for all ten talents at once.

For the records of the loaded association, we instantiate a new context in turn and go deeper. This is necessary because it is impossible to predict what query the client will send: which fields and relationships will be needed, and at what depth.
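
The shared-context idea can be sketched in plain Ruby without a database. This is a toy model of the mechanism, not the gem's code: PreloadContext, TalentRecord, SKILLS_BY_TALENT_ID, and QUERY_LOG are hypothetical names, and the "query" is only a logged string.

```ruby
# Toy in-memory "database"; QUERY_LOG records every simulated query.
SKILLS_BY_TALENT_ID = { 1 => ["Ruby"], 2 => ["GraphQL"], 3 => ["SQL", "Go"] }.freeze
QUERY_LOG = []

# Shared by all records that were loaded together.
class PreloadContext
  attr_reader :records

  def initialize(records)
    @records = records
    records.each { |record| record.preload_context = self }
  end
end

class TalentRecord
  attr_reader :id
  attr_accessor :preload_context
  attr_writer :skills

  def initialize(id)
    @id = id
  end

  def skills
    return @skills if @skills

    # With a context, load for every sibling at once; without one, only for self.
    siblings = preload_context ? preload_context.records : [self]
    ids = siblings.map(&:id)
    QUERY_LOG << "SELECT * FROM skills WHERE talent_id IN (#{ids.join(', ')})"
    siblings.each { |talent| talent.skills = SKILLS_BY_TALENT_ID.fetch(talent.id, []) }
    @skills
  end
end

talents = [1, 2, 3].map { |id| TalentRecord.new(id) }
PreloadContext.new(talents)
talents.each(&:skills)
puts QUERY_LOG.size
```

In a fuller version, the records returned for the association would themselves be wrapped in a fresh context, so the same batching applies one level deeper.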

Reliability in testing

The next need is reliability in testing. Unfortunately, we have a lot of abstractions, and they are interrelated. In an ideal world, abstractions do not depend on or refer to one another, all functions are pure, and the code is well covered by unit tests. In reality, however, types have to be tied to models, to entities, and so on.

The guidelines describe guarantees that our code actually relies on, so these guarantees need to be verified. For that we use matchers.

Typical matchers:

The corresponding guideline requires that every mutation, i.e. every entry point for modifying state, has an operation: an object that checks the preconditions for performing the modification. For example, we want to add a referrer to a talent to indicate who invited them. If a referrer already exists, it cannot be changed. A typical use of such operations is a modal window in which, after the operation is checked, either the mutation is performed or an error is shown to the user.
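
A minimal sketch of an operation guarding a mutation, using the referrer example. All names here (TalentProfile, AddReferrerOperation, callable?, add_referrer_mutation) are hypothetical and only illustrate the precondition pattern, not Toptal's internal API.

```ruby
# Hypothetical stand-in for a model.
TalentProfile = Struct.new(:name, :referrer)

# Checks the precondition for setting a referrer.
class AddReferrerOperation
  attr_reader :error

  def initialize(talent)
    @talent = talent
  end

  # A referrer can be set only once.
  def callable?
    @error = @talent.referrer ? "Referrer is already set and cannot be changed" : nil
    @error.nil?
  end
end

# The mutation performs the change only if the operation allows it.
def add_referrer_mutation(talent, referrer)
  operation = AddReferrerOperation.new(talent)
  return { success: false, errors: [operation.error] } unless operation.callable?

  talent.referrer = referrer
  { success: true, errors: [] }
end

talent = TalentProfile.new("Ada", nil)
first = add_referrer_mutation(talent, "Grace")
second = add_referrer_mutation(talent, "Linus")
puts first[:success], second[:success], talent.referrer
```

Because the precondition lives in its own object, a client can also ask the operation whether the action is available before showing the modal window at all.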

be_competitive_with_policies checks the authorization level for each type. An authorized user can see only the data they have access to. In a typical implementation, TalentPolicy covers two fields (full_name and activated_at), and if the user does not have the appropriate access, the field is returned as null. If such a field is not nullable, the matcher fails and flags it.
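
The rule behind this matcher can be sketched as follows. This is an assumption about what is being verified, written in plain Ruby: FieldDefinition, resolve_field, and fields_incompatible_with_policies are hypothetical names, and the nullability of the two fields is chosen only to show one passing and one failing case.

```ruby
# Hypothetical field declaration: name, nullability, and whether a policy guards it.
FieldDefinition = Struct.new(:name, :nullable, :guarded, keyword_init: true)

TALENT_FIELDS = [
  FieldDefinition.new(name: :full_name, nullable: true, guarded: true),
  FieldDefinition.new(name: :activated_at, nullable: false, guarded: true)
].freeze

# Resolver side: a field the viewer may not see is returned as nil.
def resolve_field(record, field, allowed_fields)
  allowed_fields.include?(field.name) ? record.fetch(field.name) : nil
end

# Matcher side: a guarded field can be nulled, so it must be declared nullable.
def fields_incompatible_with_policies(fields)
  fields.select { |field| field.guarded && !field.nullable }.map(&:name)
end

record = { full_name: "Ada Lovelace", activated_at: "2022-01-01" }
p resolve_field(record, TALENT_FIELDS[0], [])
p fields_incompatible_with_policies(TALENT_FIELDS)
```

A matcher built on this check would fail the spec for any type where the second call returns a non-empty list.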

Of course, regular Ruby coverage could be used, but here the layer is thin and mostly delegates, and we do not want to mix the two: we want to know that everything is fine at the GraphQL level specifically. Therefore, when testing a type, we check that every field declared in the type is present in the spec for that type.
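
At its core, this kind of field coverage is a set difference between the declared fields and the fields the spec exercises. A minimal sketch, where fields_missing_from_spec and the sample field lists are hypothetical:

```ruby
# Returns the declared fields that the spec never mentions.
def fields_missing_from_spec(declared_fields, tested_fields)
  declared_fields.map(&:to_s) - tested_fields.map(&:to_s)
end

declared = %i[id full_name activated_at]
tested = %w[id full_name]

missing = fields_missing_from_spec(declared, tested)
puts "Missing from spec: #{missing.join(', ')}" unless missing.empty?
```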

With this matcher, we check that no N+1 queries occur when the given block runs. We use the magical N1 gem, which keeps a hash of call stacks and query hashes. If the same call stack and query hash occur more than once (the block always works on several instances of the same type), the matcher fails with an error.
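
The detection idea (same call stack plus same query shape seen more than once) can be sketched in plain Ruby. NPlusOneDetector, load_skills, and load_skills_batch are hypothetical names, and the queries are simulated strings rather than real database calls; this is not the gem's implementation.

```ruby
# Counts how often the same query shape is issued from the same call stack.
class NPlusOneDetector
  def initialize
    @occurrences = Hash.new(0)
  end

  def record(sql, call_stack)
    # Replace numeric literals so "id = 1" and "id = 2" count as the same query.
    fingerprint = sql.gsub(/\d+/, "?").hash
    @occurrences[[call_stack, fingerprint]] += 1
  end

  def n_plus_one?
    @occurrences.any? { |_key, count| count > 1 }
  end
end

# Simulated per-record loading: one query per talent.
def load_skills(detector, talent_id)
  detector.record("SELECT * FROM skills WHERE talent_id = #{talent_id}", caller)
end

# Simulated batched loading: one query for all talents.
def load_skills_batch(detector, talent_ids)
  detector.record("SELECT * FROM skills WHERE talent_id IN (#{talent_ids.join(', ')})", caller)
end

per_record = NPlusOneDetector.new
[1, 2, 3].each { |id| load_skills(per_record, id) }

batched = NPlusOneDetector.new
load_skills_batch(batched, [1, 2, 3])

puts per_record.n_plus_one?
puts batched.n_plus_one?
```

Running the block on several instances matters: with a single record, a per-record query and a batched query are indistinguishable.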

Contract signed by code

So, we wrote a lot of cool abstractions and matchers, but no one knows about them. On top of that, we keep adding to them and changing things, while people keep writing code the old way. So we decided to use RuboCop cops.

Cops:

  • suggest how to write code, using auto-correction;

  • let developers absorb the rules gradually instead of dumping all the best practices on them at once;

  • help to announce new standards (new standard -> cop -> error the next time the developer updates the code).

At their core, cops are a contract signed in code.

Typical cops:

We have a base module that must always be included in a schema. It is a parameterized module to which we pass a config, which is not a typical construct in Ruby code. The cop makes sure that this module is included in every schema and that the base methods are present.

This cop checks that setup is called after the schema is declared. We use GraphQL inside Rails, where files and classes are loaded lazily, so there is no guarantee that all types and classes are already loaded while we are still inside the class declaration. By the time end is reached, everything the schema refers to has been declared, which is why setup must be called at the end of the declaration. This is a completely non-obvious thing that is important to remember, especially for new developers. That is why we explain these rules and include links to the guidelines in the error message.
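
To show what such a check looks at, here is a minimal sketch that uses Ruby's standard Ripper parser instead of the RuboCop API, so that it runs without any gems. It assumes that setup is expected as the last statement inside the class body; classes_missing_setup and the sample schema names are hypothetical, and a real cop would be written as a RuboCop cop class.

```ruby
require "ripper"

# Returns the names of classes whose body does not end with a bare `setup` call.
def classes_missing_setup(source)
  missing = []
  visit = lambda do |node|
    next unless node.is_a?(Array)

    if node[0] == :class
      class_name = node[1][1][1] # node[1] is [:const_ref, [:@const, "Name", position]]
      statements = node[3][1].reject { |statement| statement == [:void_stmt] }
      last = statements.last
      ends_with_setup = last.is_a?(Array) && last[0] == :vcall && last[1][1] == "setup"
      missing << class_name unless ends_with_setup
    end
    node.each { |child| visit.call(child) }
  end
  visit.call(Ripper.sexp(source))
  missing
end

good_schema = <<~RUBY
  class TalentSchema < BaseSchema
    query QueryType
    setup
  end
RUBY

bad_schema = <<~RUBY
  class BrokenSchema < BaseSchema
    query QueryType
  end
RUBY

p classes_missing_setup(good_schema)
p classes_missing_setup(bad_schema)
```

The sketch only reports offending class names; an actual cop would attach the explanation and the link to the guideline to the offense message.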

We extended the DSL to guarantee that the right type of ID is passed to the frontend, so this DSL should always be used. If a developer forgets it, the cop reminds them.

This cop enforces the convention that an array cannot have a non-scalar element type, such as an array of objects. Connections are used for that instead. They are essentially arrays too, but with meta fields such as totalCount, and with room for data that lives on the edge of the graph. With a plain array, such a representation is impossible, which puts additional restrictions on the evolution of the schema.
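
The difference in shape can be illustrated with a small sketch. to_connection is a hypothetical helper; totalCount comes from the text, while the edges, node, and cursor keys follow the common GraphQL connection convention, with cursors reduced to list positions for simplicity.

```ruby
# Wraps a plain list in a connection-like structure.
def to_connection(nodes)
  {
    totalCount: nodes.size,
    # Each edge has room for per-edge data next to the node itself.
    edges: nodes.each_with_index.map { |node, index| { cursor: index.to_s, node: node } }
  }
end

connection = to_connection(["Ruby", "GraphQL"])
puts connection[:totalCount]
puts connection[:edges].map { |edge| edge[:node] }.join(", ")
```

A plain array has nowhere to put totalCount or per-edge data, so adding them later would change the field's type.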

We also use cops when methods change. For example, we previously used the method field_authorized_by_default, and for internal reasons it has now become a class method. In this case, a cop notifies the developer that the code now needs to be written the new way.

How to write code comfortably?

So, we ended up with a lot of cops, abstractions, and so on. A developer needs some kind of entry point. We decided to borrow from Rails and wrote our own generators, so that the code does not have to be written by hand.

bin/rails generate gql ...

In this case, we are generating a schema, this is the starting point. As a result, the developer follows our standards and guidelines out of the box.

We did not stop there and tried to write a generator for each abstraction.

We can generate a type for the schema, or extend some type with some field.

Mutations work differently. They follow a different set of standards and require different inputs, so they need a generator of their own.

Results

This story is not really about GraphQL but about standards and how to develop them. In effect, we have automated code standardization.

Since automation implies standardization, we concluded that standards have to be developed first and automated afterwards. At the same time, to avoid overcomplicating things, it is important to understand clearly what we automate and why, and what can be left at the level of agreements.

In a global sense, we have not made any breakthrough. But what we have done has made our lives as developers much easier.
