I have been working at Miro since the day it was founded, first as a front-end engineer and now as a manager of the core teams that develop the core of the canvas and real-time collaboration on it.
We are growing very quickly: in users, in team size, in the number of released features. Here are some facts for 2020 for context:
Over 10 million registrations;
Peak online load increased 7 times over the year;
The development team has doubled (engineers, product managers, designers);
It looks cool! But there are caveats.
Having watched the company for 9 years, I see that a several-fold increase in the number of engineers leads to a drop in development speed. New features get built with hacks, because the current architecture does not allow them to be done properly and there is not enough time for refactoring. It is not clear who exactly should take on the refactoring. Architectural changes that do get started are not carried through to the end for lack of focus. The barrier to entry into the codebase rises, and onboarding becomes harder for newcomers. Time to market for new features gets worse.
This is where development focused on new functionality, without enough time spent on architecture, can lead. During the growth phase, new features often take priority over refactoring.
This is how we came to face the question: “How do we keep feature development fast and the architecture flexible during the growth stage?” Let’s go back in time and walk through the approaches we tried before arriving at what we have now.
Dedicate 20% of our time to refactoring and improving
So, here is the situation some time ago. We have several feature teams developing new functionality. Teams implement features independently from start to finish and can change any code necessary for this. Each team has its own product manager, who manages the team’s focus and backlog. The product manager’s goal is to hit certain business metrics.
A typical conflict of “quality” versus “fast” often arises between product managers and engineers.
Product managers push for speed, while the engineers say that “this really needs refactoring”. On the one hand, the team needs to test a hypothesis and ship functionality here and now. On the other hand, it needs to adapt or simplify the architecture and get rid of hacks to avoid problems in the future.
The conflict does not arise because product managers fail to understand the value of a quality architecture, or because the team does not want to build new functionality. The problem is that some architectural changes require the whole team to focus for several months, which means giving up releasing new features for that time. In a fast-growing product, this is extremely difficult to do.
To keep quality from slipping, we agreed to allocate 20% of the time to refactoring. This does not work well on a large codebase: it helps a little to tidy up the code you are currently working in, but no more.
The problem with this approach is that it does not allow making systemic improvements that require focused work for several weeks or months. The approach requires well-defined processes in the team, so that even this 20% is not given away for features.
But the most important thing is the developers’ motivation to do things well, without cutting corners or writing temporary hacks.
The desire to do things right works well for a long time, while the team is small and everyone understands that they will touch this code more than once. But as the team grows and developers begin to specialize in specific areas, it becomes much easier to accept a compromise and stick a hack into code that you will never return to.
Assign responsibility for the code to specific teams
The essence of the approach is that the entire codebase is divided into logical components that are assigned to specific teams. Teams still implement new features on their own from start to finish, and they can, if they have the competence and the desire, change any code necessary for this. But if you change the code of a component other than your own, you must get approval from its owner.
What does it mean to “own the code”:
Review pull requests for any changes;
Review and approve other teams’ technical solutions for big changes;
Help and advise other teams;
Align the code with the quality criteria adopted in the company, for example, have tests and documentation for the code;
And, of course, fix bugs in production.
Here is how it looks from a technical point of view.
The repository contains a file in which specific files and folders are mapped to specific teams. When a pull request changes code listed in that file, all members of the relevant team are automatically added as reviewers. That is, changes cannot slip past the team that owns the code.
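The mapping described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the actual tooling: the team names, the path patterns, the "last matching pattern wins" rule, and the helper names `owner_of` and `required_reviewers` are all hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Optional, Set

# Hypothetical ownership map: path pattern -> owning team.
# Assumption: later entries take precedence, so specific rules go last.
OWNERS = [
    ("*", "core-platform"),
    ("canvas/*", "canvas-core"),
    ("canvas/widgets/sticky/*", "stickers-feature"),
    ("realtime/*", "collaboration-core"),
]


def owner_of(path: str) -> Optional[str]:
    """Return the team that owns `path`; the last matching pattern wins."""
    owner = None
    for pattern, team in OWNERS:
        # fnmatchcase's "*" also matches "/" so "canvas/*" covers subfolders.
        if fnmatchcase(path, pattern):
            owner = team
    return owner


def required_reviewers(changed_files: Iterable[str], author_team: str) -> Set[str]:
    """Teams, other than the author's own, that must approve the pull request."""
    owners = {owner_of(path) for path in changed_files}
    return {team for team in owners if team is not None and team != author_team}
```

For example, a pull request from the hypothetical `stickers-feature` team that touches both `canvas/render.ts` and `canvas/widgets/sticky/note.ts` would need approval from `canvas-core` only, because the second file belongs to the author’s own team.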
The biggest challenge in implementing this practice with an already large codebase is agreeing who is responsible for what. The initiative will not take off without support from management.
While distributing the components, we uncovered a lot of code that, in principle, did not belong to any feature team in the long term.
What to do? Let’s create a new type of team: core teams that will be responsible for developing the “foundation”.
Move the focus on architecture and system tasks into separate core teams
This avoids mixing feature and system backlogs in one team. The core teams keep a constant focus on developing the product architecture, and their backlog is managed by a tech lead or an architect.
The goal of core teams is to develop an internal platform on top of which other teams can build new functionality, and evolve existing functionality, faster and better.
Examples of core team tasks:
Simplifying the domain data model and providing an API for it;
Creation of an internal framework for feature teams;
Isolation of application layers;
Service stability and performance;
Carrying out major architectural refactorings that will unblock the implementation of new functionality that is not currently available to us.
I would like to point out the difference between core teams and infrastructure teams. Infrastructure teams are responsible for the environment, release queues, and the resiliency of the horizontal services we use, such as databases. That is, infrastructure teams face almost no product challenges.
Core teams, by contrast, work with business code and cannot make long-term plans without working closely with product management.
Okay, we have created the teams. Will that be enough, or could something go wrong?
Of course it can.
For example, the business can push and turn the core team into yet another feature factory, simply because features always take priority during the growth phase. Or the team can get bogged down in cleaning up other people’s mistakes and hacks. Or it can refactor a system that nobody plans to develop any further. Or it can lose touch with reality and design an idealized architecture in a vacuum that will not change anything for the better.
There are many ways to go wrong. The question is how to focus on what matters.
How do you know what to do first? When the product is large, there are many problems scattered across different places, and many requests from feature teams to improve the core. It is important to focus not on local optimizations but on strategically important things that will pay off in the long game. This requires a long-term technical strategy, worked out together with the business, that defines the development of the product for two years or more.
Let’s create a technical strategy
The strategy should help core teams understand what we are actually going to build not in a month, but over a few years. It is not so important what you call it: technical strategy, architectural vision, Painted Picture, or just a plan. What matters is to formalize it, so that ideas move from people’s heads into a document and everyone agrees with the strategy and understands it the same way.
A technical strategy includes three sources of requirements:
Business: vision of the company’s development, product strategy, major features that we plan to implement in the future, but cannot because of blockers in architecture;
Reliability and performance: what is the expected increase in the load, what should be optimized for performance, what metrics are priority for us;
Development experience: a backlog of problems that slow us down or get in our way right now, ease of development, and code quality in line with industry-standard practices.
We produced the first draft of the strategy by locking ourselves in a meeting room with the most experienced developers for several days. After several brainstorming sessions we reached agreement on the following:
Target breakdown into the components and layers that the application should consist of;
Areas of responsibility of teams: clearly agreed which teams are responsible for which layers and components;
Target data model: objects in the system and relationships between them;
A list of major changes in core components that will speed up or simplify development, as well as unlock product features.
The next step was to validate the strategy with technical and product management. We collected feedback and made adjustments.
We presented the strategy to all teams so that, when designing new functionality, they take the target architecture into account and try to come up with solutions that bring us closer to it rather than further away.
Having a document, not a verbal agreement, is an important step. The document made the vision explicit: it can now be referenced during design to check how well a particular task fits our architecture.
And then regular work began to bring the vision to life. Planning, implementation, making adjustments. And regular communication about changes to all interested participants.
Steps that can help maintain the desired level of flexibility and speed in developing new functionality, given the rapid growth of the product and the size of the team:
Give feature teams dedicated time to deal with technical debt;
Agree on code ownership policies and turn them into a process;
Define the long-term product strategy and business requirements, and understand how well the current architecture meets them. If it does not, and the investment needed is too large for the current teams to handle as side projects, create separate teams of engineers who know the product well and focus them on bringing the architecture to its target state;
Develop the target architecture together with the business, write it down, and plan the implementation;
Then carry out the plan and regularly update the vision, not forgetting to tell all teams about the key changes.
In some of these practices, we are still at the very beginning of the path. For example, core teams have existed for only six months, but thanks to a formalized strategy it has already become much easier to communicate with product managers: we can agree on which features we can build now and which need to be postponed until we adapt the architecture.
Finally, I will add that the younger the project and the smaller the team, the easier it is to introduce such practices. There are no universal practices, and you can try different techniques aimed at quality, but what works best is the good old desire of every team member to do the job well.