Selenium WebDriver at the service of the developer

Transcript of Dmitry Kostichev's talk at Backend-stories // Video version inside

If you suddenly need to integrate with a third-party web resource and have no time to dig into how it works, Selenium can come to the rescue. Using his own project as an example, Dmitry Kostichev explained how to automate work in the browser without leaving your service.

Hello everybody. My name is Dmitry, and today I will share my experience of using Selenium in backend development. What is it for? Selenium automates interaction with web resources and reduces the human factor in tasks such as filling in data. A developer may need it when, for example, the web resource has no API. In my project, the task was to fill in client data (the service first had to prepare all the information correctly) and register the clients on a site, in this case MasterCard's.


On closer inspection of the site, it turned out that there is no API we could call to do all this. All processing happens in JS scripts that are hard to follow, and all the data is encoded. So we decided to try Selenium: we would attach Selenium to our service, which would perform the registration at a specific moment.

So what is Selenium and how do you work with it? The Selenium project consists of a library that communicates with a browser-specific driver through the WebDriver interface. The list of available libraries and browsers is shown on the slide. Now I will show roughly how it works in my project.

Watch the screencast, or see the full video at the end of the post.

Now the service will generate a file, immediately upload it to the site and check whether everything was registered successfully. Here it pastes in the data, uploads the file and so on. And now it should move on to confirming that everything succeeded. The automation is fast enough and does not need many resources. Apparently, everything was registered without problems.

How do you put all this together? The Selenium library has basic commands such as:

  • Creating a web driver instance for a specific browser;
  • Following links;
  • Working with elements: clicks, etc.

The web driver gives you access to sessions and cookies, as in a normal browser. You can change them and customize them as needed, and you can also execute JS scripts on the page. There is a library that extends Selenium's functionality, called Selenide. Its main feature is that it hides the creation of driver instances. In the example, we simply call the open command with a link and the browser starts, with no need to configure anything. The library also extends work with elements and adds some presets. It is all convenient, and you can figure it out and start using it quickly.

When working with web resources, and therefore with Selenium, some design patterns apply. One of them is the page object. Its essence is that we describe all the elements of a page in a dedicated class, which we can then reuse, and the code looks simpler. It goes something like this: we call the open command, pass the page object class, and can then use all its methods, for example to work with elements.
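The page object idea can be sketched without a real browser. The example below is only an illustration under stated assumptions: the `Browser` interface, the `RegistrationPage` class and both selectors are hypothetical and are not taken from the talk or from the real site. With Selenide, the same calls would roughly correspond to `$(cssSelector).setValue(...)` and `$(cssSelector).click()`, and the page class would be passed to `open(url, RegistrationPage.class)`.

```java
import java.util.ArrayList;
import java.util.List;

public class PageObjectSketch {

    /** Hypothetical minimal browser abstraction standing in for a real driver. */
    interface Browser {
        void type(String cssSelector, String text);
        void click(String cssSelector);
    }

    /** Page object: selectors live in one place, callers use intent-level methods. */
    static class RegistrationPage {
        // Hypothetical selectors, invented for this sketch.
        private static final String NAME_INPUT = "#client-name";
        private static final String SUBMIT_BUTTON = "button[type='submit']";

        private final Browser browser;

        RegistrationPage(Browser browser) {
            this.browser = browser;
        }

        /** Fills in the client name and returns the page so calls can be chained. */
        RegistrationPage fillClientName(String name) {
            browser.type(NAME_INPUT, name);
            return this;
        }

        /** Submits the registration form. */
        void submit() {
            browser.click(SUBMIT_BUTTON);
        }
    }

    /** Fake browser that records actions, so the sketch runs without a driver. */
    static class RecordingBrowser implements Browser {
        final List<String> actions = new ArrayList<>();

        @Override
        public void type(String cssSelector, String text) {
            actions.add("type " + cssSelector + " = " + text);
        }

        @Override
        public void click(String cssSelector) {
            actions.add("click " + cssSelector);
        }
    }

    public static void main(String[] args) {
        RecordingBrowser browser = new RecordingBrowser();
        new RegistrationPage(browser).fillClientName("ACME").submit();
        browser.actions.forEach(System.out::println);
    }
}
```

The calling code never mentions a selector, so if the site changes its markup, only the page class has to be updated.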

We search for elements in the DOM of the HTML page using selectors such as XPath, CSS and others. These two are the most commonly used. The main difference between them is that XPath can move through the tree in any direction, down into nested elements as well as back up to parents, whereas CSS only goes down.
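To make the "up and down" point concrete, here is a small sketch that uses only the XPath engine shipped with the JDK (`javax.xml.xpath`), not Selenium itself. The markup is a hypothetical, well-formed stand-in for a page's DOM. The first expression only descends, which a CSS selector such as `form#registration input[name='upload']` could also do. The other two climb from an element to its parent or ancestor, which classic descendant-style CSS selectors cannot express.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class XPathDirections {

    // Hypothetical markup, invented for this sketch.
    static final String PAGE =
            "<form id='registration'>"
          + "<div class='row'><label>Client name</label><input name='client'/></div>"
          + "<div class='row'><label>File</label><input name='upload'/></div>"
          + "</form>";

    /** Returns the string value of the first node matched by the expression. */
    static String first(String expression) {
        try {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(PAGE.getBytes(StandardCharsets.UTF_8)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, dom);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Down only: from the form to a nested input.
        System.out.println(first("//form[@id='registration']//input[@name='upload']/@name"));
        // Up, then down: from the input to its parent row, then to the row's label.
        System.out.println(first("//input[@name='upload']/../label"));
        // Up several levels: from the input to the enclosing form.
        System.out.println(first("//input[@name='upload']/ancestor::form/@id"));
    }
}
```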

Finally, we need an actual browser, and Selenoid can help with that. In essence, it is a framework that manages the creation and modification of containers with browsers. It is designed more for high-load systems where browsers are created in large numbers. We do not need that in our situation, so we only use the browser container itself. Now I will show how this should work on the server.

The interaction with the page looks something like this: it is fairly linear data processing. I divided it into steps, one per page transition. This is where the data is filled in and the files are uploaded. In principle, everything is quite simple. This is what a page object class looks like; it resembles a DTO. We simply describe the elements, in this case by CSS selector. This is Selenide syntax.

For this to work, we need to describe a Remote driver that connects to the Docker container. The main settings to pass are the browser we will use, the screen resolution and other options that matter for the browser. To work with the Docker container, you also need settings such as headless mode, which is how it will run on the server. This mode disables the browser's graphical interface, so it runs faster and uses fewer resources.

Next comes no-sandbox, which disables Chromium's security sandbox in this case and makes it possible to execute your own code, JS or other. The third parameter is needed for Chromium to work properly on Unix machines and write temp files correctly. The fourth is needed so that we can upload files. Most importantly, the Remote driver has a flag that allows it to transfer a file from the local storage where the application is running, through the Remote driver itself.
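As a rough sketch of such a configuration, the snippet below assembles a Chromium argument list in plain Java. The exact flags from the slide are not in the transcript, so these are assumptions: `--headless` and `--no-sandbox` match the modes named above, `--window-size` stands in for the resolution setting, and `--disable-dev-shm-usage` is only a guess at the "temp files on Unix" parameter. The fourth, upload-related parameter is left out because the talk does not name it. With `selenium-java` on the classpath, such a list is typically passed to `ChromeOptions.addArguments(...)`, and local file transfer through the Remote driver is usually enabled with `RemoteWebDriver.setFileDetector(new LocalFileDetector())`.

```java
import java.util.ArrayList;
import java.util.List;

public class ChromiumContainerArgs {

    /** Builds Chromium command-line arguments for a browser running in a container. */
    static List<String> build(int width, int height) {
        List<String> args = new ArrayList<>();
        args.add("--headless");                            // no graphical interface on the server
        args.add("--no-sandbox");                          // disable the Chromium sandbox inside the container
        args.add("--disable-dev-shm-usage");               // assumption: the "temp files on Unix" parameter
        args.add("--window-size=" + width + "," + height); // browser resolution
        return args;
    }

    public static void main(String[] args) {
        // The 1920x1080 resolution is an arbitrary example value.
        build(1920, 1080).forEach(System.out::println);
    }
}
```

Keeping the flags in one small builder makes it easier to switch between a local run and the container without maintaining two separate project configurations.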

Now I will show how it works with the Docker container. We start the Docker container and the application itself. It launches in roughly the same way, so there is essentially nothing new. The only difference is that here we see the interaction in the logs. These can also be monitored and processed, so you can understand what interaction is taking place. This is the output from the Docker container with Chromium, where the browser actually runs.

Everything works fine with the Docker container too. Everything is checked and the registration completes successfully. Next, what general problems might you run into with this approach? For me it was less a problem than a gap in my knowledge. The usual way to transfer files into a Docker container is to mount a volume. But then, if you also want to run locally, you need two configurations for the project. As it turned out, the Remote driver can be configured with a flag so that it transfers the file itself, and no extra steps are needed.

You will also have to keep an eye on the page you work with, the web resource. In my case it is an external system that nobody on my team has any connection to, so it has to be monitored through logs and so on. The same goes for the browser: it is constantly updated, you need to keep track of it, and support may break. But in principle, you can set up logging and there will be no problems.

In the end, how much easier did it become? It seemed to me that I solved the problem of interacting with this site much faster than if I had dug into the JS code. Understanding Selenium and the page interaction can be faster than dealing with encoded data when it is not known how to decode it. The main benefit is development speed.

Video of the talk: starts at 16:30

