Documenting architecture: an introduction

Hi, my name is Vladimir Ivanov and I am a software architect at EPAM. In my work, I constantly have to document software solutions that are yet to be built. I decided to share some aspects of this activity with you, because you may find them useful too.

How do you draw diagrams for your software? What questions should they answer? Why draw anything at all? Let’s figure it out.

One of the responsibilities of a solution architect is to document the architecture so that it can be communicated to all project stakeholders: project manager, CTO, project sponsor, development teams, QA and others. They need it in order to understand:

what components the system consists of;
how these components communicate with each other;
how and where the different elements are located;
whether the system as a whole meets the requirements.

Lack of this information can easily lead to missed project deadlines, overtime, or cancellation.


LET’S CONSIDER EXAMPLES

Any software is fairly complex. The first thing you might try when documenting it is to draw a single diagram that includes everything. Of course, this attempt fails immediately. Imagine we want to document a relatively simple solution, say, my blog. It runs on Ghost CMS, the data is stored in a MySQL database, and Apache is used as the web server. The web server processes incoming requests: all of them are redirected from HTTP to HTTPS and passed to the CMS. The CMS validates tokens and queries the database for content, including pages, blog posts, and plugins. All three components run in a virtual machine in GCP, on the default network, in a separate organization. The system is available to readers and to content managers, who can add new content and edit existing content. System administrators can modify the system through the cloud console. If you include all this information in one diagram, this is what you get:

Perhaps someone will say that this is a pretty decent diagram, but it still has a number of disadvantages:

Overloaded. To answer a specific question, you have to hunt for the details for a long time.
Incomplete. Try to answer these questions from the diagram. In how many regions is the system deployed? Is the backup created for the virtual machine or for the database? Where are the images stored? How are users authenticated? You will not find the answers in the diagram.
Contradictory. It uses many unexplained conventions. What are the green, blue and yellow rectangles for? What do they mean?

I wanted to talk about architecture views and “viewpoints” as described in the SEI book “Documenting Software Architectures”, but that would be too academic. Instead, I will boil my point down to the following statements:

You cannot place all information in one image.
Moreover, you shouldn’t do it.
Instead, provide a set of images, each tailored to a specific stakeholder, that is, a person with an interest in your project.
There are several approaches to this (Module, Component-and-Connector and Allocation views; the C4 model; and so on), and it doesn’t matter which one you choose. The main thing is that each person gets as much relevant information as possible in the shortest possible time.
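
The idea of several views over one system can be sketched in code. The following minimal Python example is only an illustration: the element names come from the blog described above, while the `Element` class, the `view` helper, the view names (`context`, `deployment`, `functional`) and the way elements are tagged are assumptions made for this sketch, not part of any modeling tool or notation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Element:
    """One thing in the architecture model and the views it appears in."""
    name: str
    kind: str
    views: frozenset = field(default_factory=frozenset)


# A single model of the blog. The view tags are illustrative assumptions.
MODEL = [
    Element("Reader", "person", frozenset({"context"})),
    Element("Content manager", "person", frozenset({"context", "functional"})),
    Element("System administrator", "person", frozenset({"context", "deployment"})),
    Element("Ghost CMS", "software", frozenset({"functional"})),
    Element("MySQL", "software", frozenset({"functional"})),
    Element("Apache", "software", frozenset({"functional"})),
    Element("Virtual machine", "infrastructure", frozenset({"deployment"})),
    Element("GCP default network", "infrastructure", frozenset({"deployment"})),
]


def view(model, view_name):
    """Return the names of the elements that belong to one view, in model order."""
    return [e.name for e in model if view_name in e.views]


if __name__ == "__main__":
    for name in ("context", "deployment", "functional"):
        print(f"{name}: {', '.join(view(MODEL, name))}")
```

Each view is a small slice of the same model, so a reader of one view is never shown elements that answer someone else's question.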

SEPARATE THE CONCERNS

For our system, the blog, we have the following interested parties (the stakeholders):

Project sponsor
Blog author
System administrator
Content manager
Reader

Each of them has different needs.

The project sponsor, arguably, is not interested in diagrams at all and only needs the operating cost in dollars. The system administrator, however, wants to see diagrams that answer the following questions:

Who does the system interact with?

This is a Context Diagram (from the C4 model). It shows exactly which agents the system communicates with. Note the Analytics block: I forgot about it when I drew the first diagram. Working at a single level of abstraction lets you focus on specific aspects and makes it harder to miss something.
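
A context diagram is simple enough to generate from a list of agents. The sketch below emits Graphviz DOT text for an undirected graph with the system in the middle. The function name `context_diagram_dot`, the system label `Blog` and the interaction labels are assumptions made for illustration; only the agents themselves come from the article.

```python
def context_diagram_dot(system, agents):
    """Build Graphviz DOT text: the system as a box, one labeled edge per agent."""
    lines = ["graph context {", f'  "{system}" [shape=box];']
    for agent, interaction in agents:
        lines.append(f'  "{agent}" -- "{system}" [label="{interaction}"];')
    lines.append("}")
    return "\n".join(lines)


# Agents from the article; the interaction labels are illustrative assumptions.
AGENTS = [
    ("Reader", "reads posts"),
    ("Content manager", "adds and edits content"),
    ("System administrator", "modifies the system"),
    ("Analytics", "exchanges usage data"),
]

if __name__ == "__main__":
    print(context_diagram_dot("Blog", AGENTS))
```

The printed text can be saved to a file and rendered with any Graphviz-compatible tool. Keeping the agents in a plain list also makes an omission, such as the forgotten Analytics block, easy to spot in review.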

How and where is the system deployed?

Deployment Diagram

In this diagram, you can see that the solution is deployed on Google Cloud Platform, on one virtual machine, in one network within one region, and that access is protected by Cloud IAM. It also roughly answers the question of how much the solution costs (about $20-30 per month, depending on the region and the size of the virtual machine). You can also see that the solution does not scale well and will require a redesign if the load increases sharply. In addition, there is no DB backup.
However, this diagram does not show which components are deployed in the virtual machine; that requires a different diagram.
This way, we focus on one set of aspects at a time. You will agree that information is much easier to absorb this way.
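
The facts read off the deployment diagram can also be written down as data and checked mechanically. This is a minimal sketch under stated assumptions: the `Deployment` class, its field names and the `review` function are hypothetical, and the values are the ones quoted in the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Facts that the deployment diagram makes visible (field names are assumptions)."""
    cloud: str
    regions: int
    virtual_machines: int
    db_backup: bool
    monthly_cost_usd: tuple  # (low, high) rough estimate


def review(deployment):
    """List the concerns a reader can spot from the deployment facts alone."""
    concerns = []
    if deployment.virtual_machines == 1:
        concerns.append("single virtual machine: does not scale well under a sharp load increase")
    if not deployment.db_backup:
        concerns.append("no database backup")
    return concerns


# Values taken from the description of the blog's deployment diagram.
BLOG = Deployment(
    cloud="GCP",
    regions=1,
    virtual_machines=1,
    db_backup=False,
    monthly_cost_usd=(20, 30),
)

if __name__ == "__main__":
    low, high = BLOG.monthly_cost_usd
    print(f"Estimated cost: ${low}-{high} per month")
    for concern in review(BLOG):
        print("Concern:", concern)
```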

What is the functionality of the system?

This diagram shows which features of the blog CMS are available. For example, content managers and blog authors can create posts and pages, upload images, and embed third-party content. You can also see that the CMS itself performs user authentication.
It turns out that we were able to answer most of the questions using three simple diagrams.

SUMMARY

In this article, I wanted to show you how to identify the questions your stakeholders ask and provide the appropriate views. Of course, in a really large project, there is much more to document before the product goes into production; this is usually captured in a document called “Solution Architecture Documentation.” I will talk about this in the next publication.