How to make fewer mistakes as a beginner

Roman Bobrovsky

Head of Product Development, Rafinad

1. Sloppy code

« from main.tasks import task »

The problem is not only PEP 8 violations, but also the lack of visible logic and structure in the code.

Always remember that code should first of all be readable, and ideally also understandable. There is no limit to perfection here, but a few simple recommendations will help you avoid not only mistakes, but also plain carelessness.

In the examples I use Django ORM syntax, since it is visually simpler than the equivalent SQL and can be read as pseudocode for SQL.

Give clear function and variable names

Instead of c = count()

Instead of q = Model.objects.all()

queryset = Model.objects.all()

If there are large bundles of filters and other things, you can

Instead of for x, y in _dict.items()

for key, value in _dict.items()

Instead of count = User.objects.count()

user_count = User.objects.count()

Some will argue that count_user is better; I think it makes no difference, as long as the name is clear.

The opposite problem appears when a name becomes too long because it tries to carry all the logic. For example, an arbitrary function

def get_count_of_pikachu_charmander_squirtle_from_base():
    pikachu_count = Pokemon.objects.filter(type='pikachu').count()
    charmander_count = Pokemon.objects.filter(type='charmander').count()
    squirtle_count = Pokemon.objects.filter(type='squirtle').count()
    return pikachu_count + charmander_count + squirtle_count

can be significantly simplified in several ways.

Acceptable:

def get_pikachu_count():
    return Pokemon.objects.filter(type='pikachu').count()

def get_charmander_count():
    return Pokemon.objects.filter(type='charmander').count()

def get_squirtle_count():
    return Pokemon.objects.filter(type='squirtle').count()

Here the values are hard-coded, which is best avoided except for constants. However, unlike the rather questionable original function, these have a chance of being reused.

Good:

def get_pokemon_count_by_type(_type):
    return Pokemon.objects.filter(type=_type).count()

Now the function is not tied to specific types of Pokemon, and we can reuse it many times. The only pity is that it does not help when we need to count several types at once.

Great:

from typing import Union

def get_pokemon_by_types(_type: Union[str, list]):
    if isinstance(_type, str):
        return Pokemon.objects.filter(type=_type).count()
    if isinstance(_type, list):
        return Pokemon.objects.filter(type__in=_type).count()

A textbook example of polymorphism: we are no longer tied to specific types of Pokemon, nor to how many we need to count at a time. Need a Pikachu? Pass one Pikachu. Need Pikachu and Charmander? Pass them in a list and get the desired result.

In the end, though, I would go with this version:

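from typing import Optional, Union

def get_pokemon_by_types(_type: Union[str, list]) -> Optional[int]:
    if not isinstance(_type, (str, list)):
        return None  # unsupported input type
    if isinstance(_type, str):
        _type = [_type]  # wrap a single type so type__in always receives a list
    return Pokemon.objects.filter(type__in=_type).count()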

The previous version performs at least one extra check when the input does not match either type. Besides, this is a function that returns data, and I want that return to be at the end, because that is where a programmer usually looks: what came in, what goes out, and only then the body.
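The normalization step in that last version can be tried without a database. Below is a minimal sketch in plain Python, assuming a hypothetical in-memory list POKEMON in place of the Pokemon table; count_pokemon_by_types is an illustrative name, not part of the code above. Note that a lone string has to be wrapped in a list (or in a one-element tuple with a trailing comma), because parentheses by themselves do not create a tuple.

```python
from typing import Optional, Union

# Hypothetical stand-in for the Pokemon table: one type string per row.
POKEMON = ["pikachu", "pikachu", "charmander", "squirtle", "bulbasaur"]


def count_pokemon_by_types(_type: Union[str, list]) -> Optional[int]:
    """Count rows whose type matches one type or any type in a list."""
    if not isinstance(_type, (str, list)):
        return None  # unsupported input, mirrors the early return above
    if isinstance(_type, str):
        _type = [_type]  # ("pikachu") would still be a str, not a tuple
    return sum(1 for pokemon_type in POKEMON if pokemon_type in _type)


single_count = count_pokemon_by_types("pikachu")
pair_count = count_pokemon_by_types(["pikachu", "charmander"])
invalid_result = count_pokemon_by_types(42)
```

With the sample rows above, a single type and a list of types go through the same final line, which is the point of normalizing the argument first.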

If you can’t fit the full meaning into the name of a function or variable, don’t write “x = ” or “def func”; try to break the logic down instead.

2. Lack of basic knowledge of Linux/Unix

Some companies surely do develop on Windows, but Python, as a scripting language, mostly runs on Unix and Linux.

Development is most comfortable on a Mac or on Ubuntu (or another favorite distro), with the goal of deploying the code to a server that will most likely run Ubuntu. Moreover, job postings for middle-level, and sometimes even junior, positions ask for basic Linux knowledge. Ideally, you should also be able to use a few utilities that help with work or deployment, and know how to create users and groups, change group membership, and set directory permissions.
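As a small illustration of directory permissions, here is a sketch using only Python's standard library; it does from Python what chmod 750 does in the shell. The temporary directory is created just for the demonstration, and the mode 750 is an arbitrary choice for the example.

```python
import os
import stat
import tempfile

# Work in a throwaway directory so nothing real is modified.
with tempfile.TemporaryDirectory() as project_dir:
    # rwx for the owner, r-x for the group, nothing for others (chmod 750).
    os.chmod(project_dir, 0o750)

    # Read the mode back and keep only the permission bits.
    mode = os.stat(project_dir).st_mode
    permission_bits = stat.S_IMODE(mode)
    readable_mode = stat.filemode(mode)  # same notation that `ls -l` uses

    # Check a single bit: may members of the group write here?
    group_can_write = bool(permission_bits & stat.S_IWGRP)
```

Reading the three octal digits as owner, group and others is the same skill needed to interpret the output of ls -l on a server.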

Utilities that will definitely come in handy

Supervisor

Lets you launch and monitor processes, for example so that your project always starts when the system reboots.

Cron

A utility that runs commands at a specified time, for example parsing a site every Monday at two in the morning.
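The schedule from that example, every Monday at 02:00, is written in crontab syntax as 0 2 * * 1 (minute, hour, day of month, month, day of week). The sketch below is a deliberately simplified matcher that understands only * and plain numbers, just to show how the five fields are read; real cron also supports ranges, lists and steps. The function name matches_cron is hypothetical.

```python
from datetime import datetime


def matches_cron(expression: str, moment: datetime) -> bool:
    """Check a moment against a simplified five-field cron expression.

    Only "*" and single integers are supported (an assumption made to
    keep the sketch short); ranges, lists and steps are not handled.
    """
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields: minute hour day month weekday")

    # cron numbers weekdays from Sunday = 0, so Monday = 1 ... Saturday = 6.
    cron_weekday = moment.isoweekday() % 7
    actual = [moment.minute, moment.hour, moment.day, moment.month, cron_weekday]

    return all(
        field == "*" or int(field) == value
        for field, value in zip(fields, actual)
    )


MONDAY_2AM = "0 2 * * 1"
is_monday_run = matches_cron(MONDAY_2AM, datetime(2024, 1, 1, 2, 0))   # a Monday
is_tuesday_run = matches_cron(MONDAY_2AM, datetime(2024, 1, 2, 2, 0))  # a Tuesday
```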

Nginx

A widely used web server. If you have a web project, you can hardly do without it.

3. Weak knowledge of DBMS

A junior can be hired without any experience or knowledge of working with a DBMS, but in general it is useful to know what each kind of database is typically used for, and which one fits which case.

Non-relational DBMS

Non-relational ones are simpler; we use Redis, MongoDB and ClickHouse.

Redis is a key-value store; in terms of Python types, it is a dictionary. It is excellent for storing a cache and convenient for communication between components.

MongoDB is a store of JSON documents; by the same analogy, a list of JSON objects. It is great when a large amount of data is generated somewhere and you don't want to lose it. If all this data were written to a relational database in real time, it would be slow, write errors could occur, validation would be needed, and so on. Mongo writes the data as is, and a neighboring service can then load it into the relational database at its own pace, after all the necessary validations and checks.

ClickHouse is, one might say, a newer product, originally developed at Yandex. It uses SQL but is not a traditional relational database. Its main job is to store huge amounts of data and make it possible to work with them: summing certain fields over a billion records will be faster than in a regular relational database. It compresses data well, can be deployed as a cluster, and speaks familiar SQL.
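The "write as is, validate later" flow can be sketched without any servers. In this minimal sketch a plain Python list of dictionaries stands in for the MongoDB collection and an in-memory SQLite database stands in for the relational one; the events table, its columns and the validation rule are all assumptions made for the example.

```python
import sqlite3

# Stand-in for a MongoDB collection: documents are stored as is,
# including ones that will fail validation later.
raw_events = [
    {"user_name": "ash", "clicks": 3},
    {"user_name": "misty", "clicks": "seven"},  # wrong type, kept anyway
    {"clicks": 5},                               # missing field, kept anyway
    {"user_name": "brock", "clicks": 10},
]


def is_valid(document: dict) -> bool:
    """Hypothetical validation: a user name and an integer click count."""
    return isinstance(document.get("user_name"), str) and isinstance(
        document.get("clicks"), int
    )


# Stand-in for the relational database, with a strict schema.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE events (user_name TEXT NOT NULL, clicks INTEGER NOT NULL)"
)

# The "neighboring service": validate at its own pace, then load.
valid_rows = [
    (doc["user_name"], doc["clicks"]) for doc in raw_events if is_valid(doc)
]
connection.executemany(
    "INSERT INTO events (user_name, clicks) VALUES (?, ?)", valid_rows
)
connection.commit()

loaded_count = connection.execute("SELECT COUNT(*) FROM events").fetchone()[0]
total_clicks = connection.execute("SELECT SUM(clicks) FROM events").fetchone()[0]
```

Nothing is lost at write time: the two malformed documents stay in the raw store and can be inspected or repaired, while only the clean rows reach the strict table.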

Suitable for viewing statistics based on large amounts of data.

Relational DBMS

These are, first of all, MySQL and Postgres. Most likely, Python development involving a database will go through an ORM: Django ORM or SQLAlchemy. SQLAlchemy is not so scary, since writing queries in it more or less follows SQL logic. Django ORM is about as idiosyncratic as it gets.

Here is the same query, fetching a user's name and job title, written three ways.

Django ORM:
User.objects.values('name', 'job__name')

SQLAlchemy:
sql = select(User.name, Job.name).join(User.job)
session.execute(sql)

SQL:
SELECT users.name, jobs.name FROM users
JOIN jobs ON jobs.id = users.job_id;

In SQLAlchemy it is visible that we joined something, whereas Django ORM works with different abstractions. You need to understand how queries and raw SQL work. For example, when a page is slow, the profiler will show a particular query that is either executed several times or eats up the load time, and you then need to find that query in the code. Again, if the SQL from the example takes 10 seconds to run, you can see above what it looks like in SQLAlchemy or Django ORM.
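To see what the raw SQL does, it can be run against an in-memory SQLite database from Python's standard library. This is only a sketch: the table layout (users.job_id referencing jobs.id) is inferred from the query above, and the sample rows are invented.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        job_id INTEGER REFERENCES jobs (id)
    );
    INSERT INTO jobs (id, name) VALUES (1, 'developer'), (2, 'tester');
    INSERT INTO users (name, job_id) VALUES
        ('alice', 1), ('bob', 2), ('carol', NULL);
    """
)

# The query from the text, with an ORDER BY added for a stable result.
rows = connection.execute(
    """
    SELECT users.name, jobs.name FROM users
    JOIN jobs ON jobs.id = users.job_id
    ORDER BY users.name;
    """
).fetchall()
```

In this sample data carol has no job, so the inner JOIN leaves her out of the result; a LEFT JOIN would keep her with an empty job title. Details like this are easy to miss when the join is hidden behind an ORM call.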

4. Keeping Git clean

A common problem that the team not only starts, but also sustains. Does your master branch look like this?

«fix -> fix -> fix -> try -> feature -> fix -> revert»

The most popular excuse I've heard on this topic is “you're nitpicking, it works.” In fact, yes, it does, but code is not only about “it works.” It is also about maintaining it in the future. Imagine being asked to find or fix something a long time later, and you come across a strange piece of code. And the Git history is nothing but “feature -> feature -> feature”? The colleague who wrote it, judging by git blame, still works here, but cannot remember either in what delirium it was written or at whose request.

So, for me, a commit policy that is close to ideal looks like this:

One task, one commit. Several developers on one task? One commit from each developer. The commit message contains a link to the specific task in the task tracker. Later, if problems arise, you can easily see why this or that code was written.
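One way to keep the "link in every commit" rule from eroding is to check it automatically. The sketch below is a hypothetical check of that kind; the tracker URL format (tracker.example.com/browse/PROJ-123) is invented for the example and would need to match your team's tracker.

```python
import re

# Hypothetical tracker URL format; adjust it to the real tracker.
TASK_LINK_PATTERN = re.compile(
    r"https://tracker\.example\.com/browse/[A-Z]+-\d+"
)


def has_task_link(commit_message: str) -> bool:
    """Return True if the commit message contains a task tracker link."""
    return TASK_LINK_PATTERN.search(commit_message) is not None


good_message = (
    "Add payout report export\n\n"
    "Task: https://tracker.example.com/browse/PROJ-123"
)
bad_message = "fix"

good_ok = has_task_link(good_message)
bad_ok = has_task_link(bad_message)
```

A function like this could be called from a Git commit-msg hook: Git passes the hook the path of the file holding the message, and a non-zero exit code aborts the commit.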

In addition to the commits themselves, there is also the use of rebase and merge.

Merge is best reserved for master only. In all other cases rebase works great: the history will be cleaner and there will be fewer conflicts.

Starting work on a task? Create a new branch from master (or from wherever the team lead says).

Finishing a task? Do a final rebase onto master and squash the commits. Working with a colleague in the same branch? Instead of merging the remote branch into your local one, rebase onto the remote branch. The commit history will be cleaner, and it will be hard to get lost in it.

5. Writing tests

I won't talk about TDD; that comes with experience. It is not only the lack of tests that gives you away as a beginner, but also writing tests for every trifle. You need to understand right away where the business logic begins and cover it with tests.

It depends on the company, but I have come across the following pattern: business logic is covered with tests, routine actions are not. The company makes money through its business logic, and mistakes there have financial consequences. Yes, if a client can't save a VK link in their profile, they may be offended and you may lose them, but it is better to spend the time on tests for the functionality your client uses to earn money.

Understanding what is business logic and what is not will come with experience. I can only recommend asking managers more often what exactly the company does and what the processes look like outside the code, from the point of view of ordinary staff. This universal advice will help you understand your product better.
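As an illustration of where such tests pay off, here is a sketch with a hypothetical piece of money-related logic, a commission calculation, covered by a few unittest cases. The function, the rate and the rounding rule are invented for the example.

```python
import unittest
from decimal import Decimal, ROUND_HALF_UP


def calculate_commission(order_total: Decimal, rate: Decimal) -> Decimal:
    """Hypothetical business rule: commission is a share of the order
    total, rounded to cents; negative totals are rejected."""
    if order_total < 0:
        raise ValueError("order total cannot be negative")
    return (order_total * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


class CalculateCommissionTests(unittest.TestCase):
    def test_regular_order(self):
        self.assertEqual(
            calculate_commission(Decimal("200.00"), Decimal("0.05")),
            Decimal("10.00"),
        )

    def test_rounds_half_up_to_cents(self):
        # 10.10 * 0.05 = 0.505, which rounds half up to 0.51
        self.assertEqual(
            calculate_commission(Decimal("10.10"), Decimal("0.05")),
            Decimal("0.51"),
        )

    def test_negative_total_is_rejected(self):
        with self.assertRaises(ValueError):
            calculate_commission(Decimal("-1"), Decimal("0.05"))
```

Saved as a module, these cases can be run with python -m unittest. Note what is deliberately not tested here: nothing about profile fields or page layout, only the rule where an error would cost money.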
