The first thing a user of a videoconferencing system sees is the interface, and in most cases it is by its appearance and functionality that they judge the system. A clumsy or sprawling interface prevents the user from appreciating either high performance or rich functionality: technically, a “beautiful” system should be wrapped in an attractive and stable shell. That is why this was taken into account from the very start of development of the domestic VKS system.
Since the spring of 2020, the answer to the question of whether developing a full-fledged VKS system is worthwhile has become obvious. Government departments, commercial companies, hospitals, and schools all need modern means of communication with a certain level of performance and security. You can talk in Zoom, but is it suitable for serious commercial negotiations or operational meetings?
For tasks of national importance, it became necessary to create a domestic videoconferencing system, and moreover a system consisting not only of a software component but of full-fledged hardware as well. Among world-famous vendors, at least five companies offer multifunctional videoconferencing systems. But in Russia the concept of import substitution is gradually starting to work. Besides, for many users security has become more important than the product's country of origin, and at current exchange rates the price is not the last consideration either. As for the “beauty” of the interface, it proved quite feasible to develop it from scratch.
The GUI at the start
The main requirements for a modern interface are speed of implementation, an up-to-date appearance, and full usability. Thus, the first task for the developers of the graphical user interface (GUI) was a clear definition of the software functionality of the videoconferencing terminal.
From the point of view of the GUI, the following requirements were formulated:
• Making outgoing video/audio calls;
• Accepting or rejecting incoming calls;
• Auto-answering an incoming call after a configurable time interval;
• Switching between two audio devices (headset/speakerphone) both during and outside a call;
• Turning the microphone and camera on and off both during and outside a call;
• DTMF dialing during a call;
• Assembling a conference on the terminal;
• Controlling PTZ cameras, saving PTZ presets and applying them;
• Displaying video in several different windows;
• Handling a mouse, keyboard, and remote control;
• Remotely controlling the terminal from a Web interface.
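To make the list concrete, the call-control part of these requirements could map onto an interface roughly like the sketch below. All class and method names here are our own illustration, not the project's real API, and the stub exists only to show the shape of such an interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of a terminal-control interface covering part of
// the requirement list above. Names are illustrative only.
class TerminalControl {
public:
    virtual ~TerminalControl() = default;
    virtual bool placeCall(const std::string& uri) = 0;  // outgoing audio/video call
    virtual void acceptCall() = 0;                       // accept an incoming call
    virtual void rejectCall() = 0;                       // reject an incoming call
    virtual void setAutoAnswerDelay(int seconds) = 0;    // auto-answer interval
    virtual void setMicrophoneEnabled(bool on) = 0;      // mic on/off, in or out of a call
    virtual void setCameraEnabled(bool on) = 0;          // camera on/off
    virtual void sendDtmf(char digit) = 0;               // DTMF dialing during a call
};

// Trivial stub used only to exercise the interface.
class StubTerminal : public TerminalControl {
public:
    bool inCall = false;
    std::string dtmf;
    bool placeCall(const std::string&) override { inCall = true; return true; }
    void acceptCall() override { inCall = true; }
    void rejectCall() override { inCall = false; }
    void setAutoAnswerDelay(int) override {}
    void setMicrophoneEnabled(bool) override {}
    void setCameraEnabled(bool) override {}
    void sendDtmf(char d) override { if (inCall) dtmf.push_back(d); }
};
```

Each concrete terminal type would implement this interface on top of its own call-management subsystem.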
This list of functions can be implemented in various ways, and the choice of a specific implementation was affected by constraints on programming languages (for example, Java was categorically unsuitable for certification reasons, and CSS/HTML fell short on functionality), by the developers' specializations, and by the schedule. All things considered, the choice fell on C++ and the Qt5 framework, since, for example, the more modern QML technology does not allow rendering video with an arbitrary OpenGL context, which the terms of reference (ToR) for the VKS terminals required.
Quickly and efficiently
The first GUI prototype was a Qt-based softphone that used many open-source libraries: the eXosip/oSIP libraries for the SIP protocol, FFmpeg for encoding and decoding video and audio, and PortAudio for working with audio devices. This softphone ran under Linux, Windows, and macOS, and it was a technology demonstrator rather than a real device.
Later, the abstract softphone was transformed into a real videophone project, and the first version of its software had to be ready two months after the start. To meet this deadline, the phone software was divided into modules and distributed among several groups of developers according to their competencies. This organization of the process helped develop the videophone project quickly and efficiently.
Core and front
For unification, and so that existing GUI work could be reused in other devices, the common code base lives in a separate module: the GUI backend, or GUI core module. The presentations themselves, which differ from device to device, are implemented in separate GUI front modules.
This architecture of the GUI modules proved advantageous and delivered the desired result: developing interfaces for new components of the VKS system became relatively fast while remaining high-quality, since interfaces for the VKS terminals no longer had to be rewritten from scratch.
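The core/front split described above can be sketched as a shared core that owns state and logic, with device-specific fronts that only present it. The structs and names below are our own illustration, not the project's actual types:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Device-specific presentation layer: each device (videophone, terminal,
// Web UI) supplies its own implementation of this interface.
class IGuiFront {
public:
    virtual ~IGuiFront() = default;
    virtual void showCallState(const std::string& state) = 0;
};

// Shared backend (the "GUI core" module), reused across devices.
class GuiCore {
public:
    void attach(IGuiFront* front) { fronts_.push_back(front); }
    void setCallState(const std::string& s) {
        for (IGuiFront* f : fronts_) f->showCallState(s);  // fan out to every front
    }
private:
    std::vector<IGuiFront*> fronts_;
};

// Example front, e.g. a videophone's simple status display.
class TextFront : public IGuiFront {
public:
    std::string last;
    void showCallState(const std::string& s) override { last = s; }
};
```

With this split, adding a new device means writing only a new `IGuiFront`-style presentation, while the core logic stays untouched.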
Torment and Victory
On the way to creating any software there are, naturally, difficulties and problems, and creating a GUI for videoconferencing was no exception. Regardless of the specific purpose of the system, the same problems can recur in any team. The difficulties and victories along the way may interest colleagues, and perhaps help them find effective solutions without stepping on our rakes.
Historically, the first interesting problem that arose while developing the GUI for the various types of VKS terminals was consistency, that is, keeping the state of all modules coordinated. During operation, the GUI interacts with several other modules: a hardware-interaction module, a call-management subsystem, a media-processing module (MCU), and a user-interaction subsystem.
Initially, the GUI treated all these modules as independent, meaning it could send requests to two different modules at the same time. This turned out to be wrong and occasionally caused problems, since the modules were not in fact independent and actively interacted with each other. The solution was a special scheme of work that guaranteed strictly sequential execution of requests across all modules.
Two difficulties complicated this. First, some (but not all) requests require a response, and while waiting for it the terminal is effectively in an inconsistent state, so other requests cannot be executed; blocking the user interface while waiting for responses, however, is also undesirable. Second, responses to GUI requests, as well as requests from the modules to the GUI, arrive in their own threads, different from the GUI thread, yet the GUI must apply all state changes in its own thread (Qt requires this for some actions, and in other cases it simply avoids unnecessary thread-synchronization complexity).
The solution consisted of two parts. First, all requests and responses from other modules were redirected to the GUI thread using the Qt signal-slot mechanism with QueuedConnection, that is, through the global QApplication event loop. Second, to ensure sequential processing of all requests, a system of Transitions was developed, with its own queue and processing loop (TransitionLoop).
Thus, when the user presses an action button in the GUI (for example, the call button), a corresponding Transition is created and placed in the transition queue, and a signal is emitted for the transition-processing loop. Upon receiving the signal, TransitionLoop checks whether a transition is currently in progress. If so, it keeps waiting for the current transition to finish; if not, the next Transition is taken from the queue and launched. When a response arrives from another module, TransitionLoop is notified via the same signal mechanism that the current transition has completed, and it can start the next transition from the queue.
The important point is that all transition processing happens in the GUI thread. This is guaranteed by using the Qt signal-slot mechanism in its QueuedConnection variant, where each signal generates an event that is placed in the application's main event loop.
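The core of this scheme can be sketched without Qt at all. In the minimal sketch below (our own names, not the project's code), transitions queue up and run strictly one at a time; the next one starts only when a module response marks the current one complete. In the real system, both the "kick" and the response arrive as queued Qt signals on the GUI thread.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Qt-free sketch of the TransitionLoop idea: strictly sequential
// execution of user-initiated transitions.
class TransitionLoop {
public:
    using Transition = std::function<void()>;

    void enqueue(Transition t) {
        queue_.push(std::move(t));
        kick();                      // analogous to emitting a queued signal
    }
    void onModuleResponse() {        // a response from another module arrived
        inProgress_ = false;         // the current transition is complete
        kick();                      // try to start the next one
    }
    bool busy() const { return inProgress_; }

private:
    void kick() {
        if (inProgress_ || queue_.empty()) return;  // wait for completion
        inProgress_ = true;
        Transition t = std::move(queue_.front());
        queue_.pop();
        t();                         // start the transition (sends a request)
    }
    std::queue<Transition> queue_;
    bool inProgress_ = false;
};
```

Because every entry point runs in the GUI thread in the real system, no locking is needed inside the loop itself.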
OpenGL rendering on low-power hardware
Another difficulty we had to deal with was video rendering. For OpenGL rendering, Qt provides the special QOpenGLWidget class and related helper classes, and these were originally used for rendering video. The data to render (decoded video frames) is supplied by the media-processing module (MCU), which among other things implements hardware decoding of the video stream on the GPU. On low-powered processors, rendering of Full HD video turned out to stutter. The direct solution would have been to replace the processor, but that would have required serious rework of already finished components of the videoconferencing system and raised the cost of the devices themselves. So the entire rendering process was carefully analyzed in search of a more elegant solution.
With standard OpenGL rendering and hardware decoding, the following happens: encoded video data arrives from the network and is stored in RAM, then transferred to video memory on the GPU, where it is decoded. The decoded data, whose volume is considerably larger than the encoded data, is then transferred back to RAM. Next, the rendering code takes over and moves this data from RAM back to the GPU for actual rendering. Thus, rather large volumes of data are pumped back and forth across the memory bus, and the bus simply cannot keep up.
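A back-of-the-envelope estimate shows why the bus becomes the bottleneck. Assuming (our numbers, for illustration only) a 1920×1080 frame in YUV 4:2:0 at 30 fps, with the naive pipeline each decoded frame crosses the bus twice (GPU to RAM, then RAM back to the GPU for rendering):

```cpp
#include <cassert>
#include <cstdint>

// Illustrative arithmetic for the redundant bus traffic of the naive
// pipeline; resolution, chroma format, and fps are our assumptions.
constexpr std::uint64_t kWidth  = 1920;
constexpr std::uint64_t kHeight = 1080;
constexpr std::uint64_t kFps    = 30;

constexpr std::uint64_t frameBytes() {
    return kWidth * kHeight * 3 / 2;     // YUV 4:2:0 stores 1.5 bytes per pixel
}
constexpr std::uint64_t extraBusBytesPerSecond() {
    return frameBytes() * 2 * kFps;      // two avoidable copies per frame
}
```

That comes to roughly 180 MB/s of avoidable traffic for a single Full HD stream, before any other memory activity, which a low-powered platform's bus can easily fail to absorb.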
Modern versions of OpenGL have special extensions that allow data already residing in GPU memory, rather than data in main RAM, to be specified for rendering. This mechanism eliminated moving hardware-decoded frames from the GPU to RAM and back, and the problem of rendering on low-powered processors was almost solved.
One more obstacle remained: the OpenGL contexts supported by Qt do not allow the necessary OpenGL extension to be used, so QOpenGLWidget could not be used in this scheme. The solution was to use a plain QWidget with Qt's own rendering removed from the pipeline, a possibility Qt provides. This raised a new question, however: in this mode the GUI is fully responsible for all rendering within the widget's area, and Qt does not help us. That is fine for displaying video, but regular Qt tools cannot be used for widgets on top of the video, yet, for example, a pop-up menu must be displayed above it.
This problem was solved as follows: an existing widget's image can be obtained (QWidget has the grab() method for this), converted into an OpenGL texture, and the resulting texture rendered on top of the video with OpenGL. With the appropriate surrounding machinery added, this became a universal mechanism for displaying any standard widget on top of the video in this non-standard way.
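One practical detail of this image-to-texture path is worth noting: a grabbed widget image stores rows top-down (as QImage does), while OpenGL's default texture origin is the bottom-left corner, so the rows typically need flipping before upload (or the texture coordinates adjusted). A Qt-free sketch of the flip on a raw RGBA buffer, with our own function name:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Flip a top-down RGBA image vertically so its rows match OpenGL's
// bottom-up texture convention before a glTexImage2D-style upload.
std::vector<std::uint8_t> flipRows(const std::vector<std::uint8_t>& img,
                                   int width, int height) {
    const int stride = width * 4;                        // RGBA: 4 bytes per pixel
    std::vector<std::uint8_t> out(img.size());
    for (int y = 0; y < height; ++y) {
        // copy source row y into destination row (height - 1 - y)
        std::copy(img.begin() + y * stride,
                  img.begin() + (y + 1) * stride,
                  out.begin() + (height - 1 - y) * stride);
    }
    return out;
}
```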
Kiosks and widgets
Managing displays and distributing fragments of the user interface in “kiosk” mode was no easy task. A videoconferencing terminal can operate in two modes: windowed, that is, like any other windowed application in the operating system's desktop environment, and “kiosk” mode, in which the operating system runs only one application with a graphical interface, the VKS terminal software, and there is no desktop at all.
In windowed mode everything is relatively simple: the window is controlled by the desktop environment's window manager, the application creates a second window if necessary, and the user distributes the windows across the displays as needed. In “kiosk” mode everything is much more complicated: the system has no window manager, there can be only one window, and the user has no way to move it. Hence the task of automatically detecting events such as connecting or disconnecting a display; when such an event occurs, the displays must be configured automatically and the fragments of the user interface placed on them correctly.
The answer came from XRandr, the Linux system library responsible for working with displays. There is very little documentation for it online, so the implementation relied on examples from the Internet, including from Habr. In addition, an algorithm for distributing interface fragments among the displays had to be devised, along with a way to merge two different windows into a single one. The latter was implemented as follows: what are separate windows in windowed mode become widgets inside one large window in “kiosk” mode, and this window stretches over both displays (if there are two). The display positions are configured so that a continuous virtual space is formed (this is done with the XRandr library), and the geometry of the internal widgets inside the single global window is then set so that each lands on its own display.
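The geometry part of this scheme reduces to simple arithmetic. In the sketch below (our own structs, not the project's types), displays are laid out left to right into one continuous virtual space, each UI fragment becomes a child widget whose geometry equals "its" display's area, and the single global window spans the whole space:

```cpp
#include <cassert>
#include <vector>

struct Rect { int x, y, w, h; };   // illustrative stand-in for QRect

// Place displays side by side at y = 0, as an XRandr configuration
// forming one continuous virtual space would.
std::vector<Rect> layoutDisplays(const std::vector<Rect>& sizes) {
    std::vector<Rect> placed;
    int x = 0;
    for (const Rect& d : sizes) {
        placed.push_back({x, 0, d.w, d.h});  // widget geometry == display area
        x += d.w;                            // next display starts where this ends
    }
    return placed;
}

// The single global window must cover the whole virtual space.
Rect globalWindow(const std::vector<Rect>& placed) {
    Rect g{0, 0, 0, 0};
    for (const Rect& r : placed) {
        if (r.x + r.w > g.w) g.w = r.x + r.w;
        if (r.y + r.h > g.h) g.h = r.y + r.h;
    }
    return g;
}
```

On a display hot-plug event, the same computation is simply rerun with the new display list.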
Making it Russian
The whole journey of creating the Russian videoconferencing system consisted, and still consists, of many stages, and the GUI is only the tip of the iceberg: the most visible part, and not the most difficult one. Still, the complexity of the solution, the combination of hardware and software components, and the desire to make a technically and aesthetically “beautiful” system created many difficulties along the way. New tasks gave rise to non-standard solutions and helped create a product we are not ashamed to show not only in Russia but abroad as well.
Russian developments have long proven their performance, and, in an attractive shell, their competitiveness as well. Our life hacks will be useful to anyone seriously involved in GUI development, and we hope they will help other developers speed up and simplify the creation of modern shells for new Russian software products. We believe that Russian solutions will be valued in the world no less than Russian ballet or black caviar.