Regular expressions in simple words. Part 1

Developers fall into two types: those who already understand regular expressions and occasionally solve complex problems in a single line, and those who are still afraid of them and avoid them in every possible way. This article is for the latter, to help them join the former. It will either cure “regexpophobia” or make it worse. Either way, read on.

Use navigation if you don't want to read the entire text:
→ Introduction
→ Tools
→ Hello, world
→ Special characters
→ Looking into the abyss

Introduction

A regular expression describes a pattern that a text string either matches or does not. Main applications: search, validation, parsing, and intimidation.

Search. Find all email addresses in a text ~~to send them chain letters~~.
Validation. Check that the email address entered in a form at least remotely resembles a real one.
Parsing. Split an email address into username and domain.
Intimidation. The most complete regular expression for validating email addresses can be found on this page.
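
The first three applications can be sketched in a few lines of Python. The pattern below is a deliberately simplistic assumption made up for this illustration, and the function names are hypothetical too; real address syntax is far more permissive.

```python
import re

# Deliberately simplistic pattern (an assumption for illustration only):
# a username, an "@", and a domain containing at least one dot.
EMAIL = re.compile(r"(?P<user>[\w.+-]+)@(?P<domain>[\w-]+(?:\.[\w-]+)+)")


def find_emails(text):
    """Search: return every substring that looks like an email address."""
    return [m.group(0) for m in EMAIL.finditer(text)]


def looks_like_email(value):
    """Validation: the whole string must match, not just a part of it."""
    return EMAIL.fullmatch(value) is not None


def split_email(value):
    """Parsing: return (username, domain), or None if nothing matches."""
    m = EMAIL.fullmatch(value)
    return (m.group("user"), m.group("domain")) if m else None
```

Note that search uses finditer, while validation and parsing use fullmatch: a form field should be rejected if only part of it looks like an address.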

It is important to remember that although regular expressions are a powerful tool, they are not a silver bullet and they are not Turing complete. Consequently, not every problem can be solved with them. For example, a well-known answer on Stack Overflow explains why you should never parse HTML using regular expressions.
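
One common failure mode is nesting. The following is a minimal sketch, not taken from that answer, and the sample markup is made up for the illustration. A lazy pattern stops at the first closing tag, and a greedy one runs on to the last.

```python
import re

nested = "<div>outer <div>inner</div> tail</div>"
siblings = "<div>a</div><div>b</div>"

# Lazy quantifier: stops at the first </div>, cutting the outer element short.
lazy = re.search(r"<div>(.*?)</div>", nested).group(1)

# Greedy quantifier: runs to the last </div>, gluing two siblings together.
greedy = re.search(r"<div>(.*)</div>", siblings).group(1)

print(repr(lazy))    # 'outer <div>inner'
print(repr(greedy))  # 'a</div><div>b'
```

Neither quantifier can keep track of how many tags are still open, and that is exactly what nested markup requires.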

On the other hand, regular expressions can sometimes solve problems they were never intended for. For example, this page describes an extremely inefficient but working way to check whether a number is prime.
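
The best-known version of this trick writes the number in unary and lets backreferences do the division. Whether the page mentioned above uses exactly this form is an assumption. A minimal sketch, with a hypothetical function name:

```python
import re

# Matches unary strings whose length is NOT prime:
#   1?          -> length 0 or 1
#   (11+?)\1+   -> a block of two or more ones repeated at least twice,
#                  i.e. the length has a divisor other than 1 and itself
NOT_PRIME = re.compile(r"1?|(11+?)\1+")


def is_prime(n):
    """Return True if n is prime; the regex engine tries divisors by backtracking."""
    return NOT_PRIME.fullmatch("1" * n) is None
```

The engine tries block lengths 2, 3, 4 and so on until one divides the string evenly, which is why the approach is so slow.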

Writing “regular expression” every time is tiresome. The terms regex and regexp have taken root in English slang. However, both sound like the names of anime villains, so from time to time I will just say “regular” in this article.
According to legend, the name is a translation artifact: “regular expressions” was carried over almost letter for letter because someone was too lazy to translate it properly.
By the way, there is an opinion that the term regex is not always synonymous with regular expression. But sometimes it is. The details are in a short article.

Tools

Of course, of all their tasks, “regulars” cope best with intimidation. However, thanks to the work of leading Egyptologists, services have emerged that help decipher these mysterious writings. I most often use two: one is beautiful, the other is useful. In fact, both are beautiful and useful.

Regexper turns a regular expression of almost any complexity into a neatly drawn diagram. For example, here is a regular expression for real numbers, invented by the ancient Sumerians:

[+-]?(\d*\.)?\d+

Drawn as an intuitive diagram, it suddenly makes sense.
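
Read piece by piece, the pattern above says: an optional sign `[+-]?`, then an optional run of digits followed by a dot `(\d*\.)?`, then at least one digit `\d+`. This can be checked with a short sketch; the helper name is hypothetical and the test strings are made up for the illustration.

```python
import re

# The real-number pattern from the text, unchanged.
NUMBER = re.compile(r"[+-]?(\d*\.)?\d+")


def is_number(s):
    """Return True if the whole string matches the real-number pattern."""
    return NUMBER.fullmatch(s) is not None


for sample in ["42", "-3.14", "+.5", "5.", "1e5", "abc"]:
    print(sample, is_number(sample))
```

The pattern accepts `+.5` but rejects `5.` because the trailing `\d+` demands at least one digit after the dot. Exponent notation such as `1e5` is not covered at all.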

If you come across a strange regular expression and want to quickly understand what it does, feel free to throw it into Regexper.

True, for particularly complicated cases the diagram will not be simple. For example, try feeding the service the email regular expression mentioned above (you will first need to write it on one line). I didn't attach the image here because it is 24,621 pixels wide.

The second extremely useful resource is regex101. It helps you see how a regular expression works, and if it doesn't work or works incorrectly, to understand what is wrong. The service even has a step-by-step debugger!
