Builders at the facility wear wrist sensors that record the movements of their hands. Once a day, the accumulated sensor readings are transferred to a server located at the construction site. During the data collection stage, video cameras are used in addition to the sensors; during trial operation, only the sensors remain. The collected data goes to our professional assessors for labeling, and on this labeled data we build models for recognizing human activities. Once a model has passed testing and shown strong quality metrics, we deploy it and recognize the daily data stream. The customer receives regular reports on the activities of employees. Despite the simplicity of this scheme, we had to deal with many pitfalls and unexpected side tasks, which is what this article is about.
The project began with a test bench set up in the LANIT office next to the dining room. It ran for three months. We had a steady stream of volunteers ready to hammer a nail, drill a hole, or tighten a couple of nuts. A person performed all these actions at an improvised workbench with a bracelet on their wrist.
We tried several fitness trackers that allow extracting raw data and settled on one well-known brand. We primarily used the readings of the accelerometer and gyroscope; additionally, we use GPS, barometer, and heart-rate data. Incidentally, the pulse is measured by a device with an intriguing name: the photoplethysmograph.
The accelerometer and gyroscope provide raw readings along three coordinate axes at a frequency of 50 Hz, corresponding to a period of 0.02 s. Thus, we have six time series for recognition; however, for technical reasons, the obtained series contain gaps and have a high noise level. If we plot the intervals between successive measurements, we get the following picture:
The graph shows that a 0.13 s gap systematically appears in the data.
The problem of filling gaps and suppressing noise often arises in time-series tasks and has many solutions. Gaussian process models helped us handle both gaps and noise while preserving as much information as possible. This approach has proven itself, among other places, in work with time series in astrophysics (arxiv.org/abs/1908.06099, arxiv.org/abs/1905.11516).
Once again, we learned how important kernel settings are when working with Gaussian process models. The choice of kernel controls whether the model uses the large-scale or the small-scale structure of the time series when approximating and filling gaps. A good place to start with this approach is the examples in the sklearn documentation. Consider the following example: in the graphs below, the source data is shown in black, the mean of the Gaussian process in red, and the confidence interval in blue. The upper graph shows that the periodic structure in the first half of the data was not captured: the model failed to isolate the sinusoidal component of the signal, although it approximated the large-scale structure successfully. With a suitable kernel, the model in the second example approximates the sinusoidal component as well.
An example of selecting a suitable kernel for a Gaussian process. The source data is shown in black, the mean of the Gaussian process in red, and the confidence interval in blue.
Once the Gaussian process model was fitted, it also became possible to get rid of noise: points that fall outside the confidence interval are replaced by the corresponding points of the Gaussian process mean.
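The gap-filling and denoising steps described above can be sketched with sklearn's `GaussianProcessRegressor`. This is a minimal illustration on a simulated 50 Hz channel; the kernel composition and all parameter values here are assumptions for the example, not the project's actual settings.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared, WhiteKernel

rng = np.random.default_rng(0)

# Simulated 50 Hz accelerometer channel with a gap and noise.
t_full = np.arange(0, 4, 0.02)                      # ideal timestamps, 0.02 s period
signal = np.sin(2 * np.pi * 1.5 * t_full)           # "true" periodic motion
noisy = signal + rng.normal(0, 0.15, t_full.size)
observed = np.ones(t_full.size, dtype=bool)
observed[100:120] = False                           # a 0.4 s gap in the stream

# Kernel: large-scale trend (RBF) + periodic component + noise term.
kernel = (RBF(length_scale=1.0)
          + ExpSineSquared(length_scale=1.0, periodicity=0.67)
          + WhiteKernel(noise_level=0.02))
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(t_full[observed, None], noisy[observed])

mean, std = gp.predict(t_full[:, None], return_std=True)

# Fill the gap with the GP mean; replace points outside the 95% band.
clean = noisy.copy()
clean[~observed] = mean[~observed]
outliers = np.abs(noisy - mean) > 1.96 * std
clean[outliers] = mean[outliers]
```

Adding the periodic `ExpSineSquared` term plays the role of the "suitable kernel" from the example above: without it, the RBF part captures only the large-scale trend.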
Example of filling data gaps
Naturally, the quality of action recognition by a neural network differs between raw and preprocessed data: in our case, the weighted F1-score grows from 0.62 to 0.84.
Volunteers at our booth could watch a real-time action recognition demonstration. Recognition looked like segmentation of the time series in the visualization of fitness tracker sensor readings. As you can see, periods of inactivity alternate with tightening nuts or, say, driving a nail.
From the test bench, we moved on to recognizing workers' activities at a construction site. Our assessors carefully label the data using video recordings of the workflow, so we have labeled data and can reduce the recognition of activity types to a time-series classification task.
Preparing the training sample looks as follows: we split the multivariate time series into intervals of equal length and assign a class label to each interval, for example, by the maximum total length of the labeled intervals that fall into the given partition interval.
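The windowing-and-labeling rule can be sketched as follows. The function name, the interval representation as `(start, end, class)` triples, and the concrete class names are illustrative assumptions.

```python
def window_labels(length, window, segments):
    """Assign one class per fixed-length window.

    segments: list of (start, end, class) labeled intervals in sample indices.
    Each window gets the class whose labeled intervals overlap it the most.
    """
    labels = []
    for w0 in range(0, length - window + 1, window):
        w1 = w0 + window
        overlap = {}
        for s, e, c in segments:
            inter = min(e, w1) - max(s, w0)
            if inter > 0:
                overlap[c] = overlap.get(c, 0) + inter
        labels.append(max(overlap, key=overlap.get) if overlap else None)
    return labels

# 5 s windows at 50 Hz -> 250 samples per window.
segs = [(0, 300, "hammering"), (300, 700, "idle"), (700, 1000, "drilling")]
print(window_labels(1000, 250, segs))
# → ['hammering', 'idle', 'idle', 'drilling']
```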
In experiments on the test bench, we compared classical algorithms on automatically generated features against neural networks. Surprisingly, neural networks could not significantly outperform gradient boosting in our case, which may be due to the noise in the data and the very limited size of the training sample. For neural networks, we tried cleaned time series, difference schemes, spectrograms, one-dimensional and two-dimensional convolutional layers, recurrent layers, and much more. Nevertheless, the best result with minimal effort is achieved by classification with gradient boosting from the lightGBM package. That said, neural networks are useful in side tasks, for example, in finding the lunch break, which is described below.
A human factor is, expectedly, present in the data: for example, a watch put on upside down. This factor turns out to be easy to handle: a classification model determines with more than 90% accuracy whether the watch is worn correctly. For improperly worn bracelets, a linear transformation of the raw data makes it possible to use the same activity recognition models.
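The linear transformation in question can be as simple as a diagonal ±1 matrix applied to the sensor axes. Which axes change sign for an upside-down watch depends on the device's coordinate frame; the matrix below is an illustrative assumption, not the vendor's specification.

```python
import numpy as np

# Assumed: a watch rotated 180° flips the sign of two of the three
# accelerometer axes (illustrative choice of axes).
FLIP = np.diag([-1.0, -1.0, 1.0])

def correct_orientation(xyz, flipped):
    """xyz: (n, 3) array of raw accelerometer samples."""
    return xyz @ FLIP.T if flipped else xyz

sample = np.array([[0.1, 9.8, 0.3]])
corrected = correct_orientation(sample, flipped=True)   # [[-0.1, -9.8, 0.3]]
```

Because the transform is linear and fixed, it can be applied before feature extraction, so the downstream models never need to know about the flipped bracelet.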
Another human factor in the data comes from the assessors' labeling: they, too, are prone to errors. Here, various tricks and heuristics for cleaning the labels help.
As a result of a series of experiments at the construction site, we concluded that actions should be divided into two levels.
- The lower level consists of elementary actions. Examples: hammering, moving a wrench. The typical time scale for lower-level intervals is about 5 seconds.
- The upper level consists of the employee's actions in terms of the purpose of the activity. Examples: preparation for work, plastering, welding, etc. The typical time scale for upper-level intervals is about 30-60 seconds.
The result is a picture of the employee's sequential actions throughout the entire working day, detailed down to elementary movements.
Approximate work interval – 5 minutes
Search for inactivity and lunch
In the course of the project, it became clear that not only work actions matter, but also actions related to rest, breaks, and the like. Initially, this issue was not given enough attention, but without the ability to distinguish work from rest, the whole project loses its appeal to the customer. We handle rest and inactivity both at the level of elementary actions and at the minute scale.
Naturally, the model is not perfect, so to minimize the number of errors it proves useful to find the employee's lunch break. Knowing the lunch interval, we can avoid false positives over a long period and thus significantly improve the model's accuracy at the project delivery stage. Besides, it is reassuring for the workers themselves to know that during the lunch break their actions are not recognized and they are free to rest as they please.
Assessors cannot label lunch time, as it is not captured on camera. We made the following observation: at the beginning and at the end of the lunch break, workers spend some time moving between the workplace and the site cabin. These movements can be taken as the boundaries of lunchtime.
The sets and proportions of classes during work and during lunch differ. We realized that finding the lunch break reduces to a segmentation problem on top of the outputs of the top-level model. To solve it, our team chose the U-Net neural network. The difference from the classical U-Net is that all two-dimensional operations are replaced with one-dimensional ones, since we work with time series. We also added Gaussian noise and Dropout layers to reduce overfitting.
Training data preparation
Since the segmentation problem is solved on top of the top-level model, the input to U-Net is a vector of size 1024 × (number of classes): 1024 because the upper-level model's intervals are 30 seconds long and a working day lasts about 8-9 hours.
The output is a 1024 × 1 vector of binary values (0 – the interval is not lunch, 1 – it is).
Since there is not much data (about 40 working days), we generated a synthetic sample. A real worker's day was divided into n parts, each belonging to one of five classes: before lunch, start of lunch, lunch, end of lunch, after lunch. A new working day was generated as a sequence of random intervals: first several intervals of the first class, then one of the second, several of the third, one of the fourth, and several of the fifth.
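The generation scheme described above can be sketched as follows. The pool structure, phase names, and the ranges of interval counts are illustrative assumptions; in practice the fragments would be slices of recognized-action sequences rather than tagged strings.

```python
import random

def make_synthetic_day(pools, rng=random):
    """Assemble a synthetic working day from real labeled fragments.

    pools: dict mapping each of the five phases to a list of interval
    fragments cut from real labeled days (tagged strings here for brevity).
    """
    day = []
    day += rng.choices(pools["before_lunch"], k=rng.randint(3, 6))
    day += [rng.choice(pools["lunch_start"])]
    day += rng.choices(pools["lunch"], k=rng.randint(2, 4))
    day += [rng.choice(pools["lunch_end"])]
    day += rng.choices(pools["after_lunch"], k=rng.randint(3, 6))
    return day

pools = {p: [f"{p}#{i}" for i in range(10)]
         for p in ["before_lunch", "lunch_start", "lunch",
                   "lunch_end", "after_lunch"]}
day = make_synthetic_day(pools)
```

Sampling fragments with replacement from different real days is what enriches the sample: the model sees many plausible day layouts instead of the original 40.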
Scheme of dividing time intervals into morning (blue), lunch (red), and afternoon (green). Synthetic data is assembled from fragments of recognized actions on these intervals to enrich the training sample.
To assess quality, we used the Jaccard index, which is, intuitively, the ratio of the intersection of two sets to their union. In our case, augmentation allowed us to raise the Jaccard index from 0.98 to 0.99.
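For binary lunch masks, the Jaccard index is straightforward to compute; a minimal sketch:

```python
def jaccard(pred, true):
    """Jaccard index of two equal-length binary masks:
    |intersection| / |union| of the positive positions."""
    inter = sum(p and t for p, t in zip(pred, true))
    union = sum(p or t for p, t in zip(pred, true))
    return inter / union if union else 1.0

pred = [0, 0, 1, 1, 1, 0]
true = [0, 1, 1, 1, 0, 0]
print(jaccard(pred, true))  # → 0.5
```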
Can all actions be classified?
A construction site is full of varied and often unpredictable situations. While implementing the project, we realized that if we restrict ourselves to a fixed set of classes, we will inevitably face actions that clearly fall outside the behavior observed in the training sample. To be ready for actions beyond the scope of the known classes, we began applying anomaly detection. Anomaly detection is widely used in predictive maintenance tasks and for detecting equipment failures at early stages. Detection helped us find:
- errors of assessors;
- atypical worker behavior;
- the appearance of new elements in the technical process;
- “suspicious” employees.
If you are just getting acquainted with anomaly detection methods, you will most likely come across the following popular and simple models implemented in sklearn: OneClassSVM, Isolation Forest, Local Outlier Factor. There are also more sophisticated methods (a colleague of mine has written on this topic earlier).
The Local Outlier Factor implementation can directly check for the presence of new objects in the data (novelty detection). If you run the Isolation Forest method on the same features computed for the main classification model, you can obtain a "normality score" for each object: a numerical value characterizing how typical the object is within the sample. The higher the score, the more typical the object. The distribution of the normality score looks as follows:
The next step is to choose a threshold value of the normality score below which an object is declared an anomaly. One can proceed from the expected frequency of anomalies, or choose the threshold for other reasons. We chose the threshold from the distribution of the normality score itself: the figure shows that, starting from a certain value, the nature of the distribution changes.
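A sketch of the normality score and one of the thresholding options mentioned above: Isolation Forest's `score_samples` as the score, and a quantile at an assumed anomaly frequency as the cutoff. The synthetic features and the 2% frequency are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(2)

# Features of one activity class, with a few injected atypical objects.
normal = rng.normal(0, 1, (500, 4))
weird = rng.normal(6, 1, (10, 4))
X = np.vstack([normal, weird])

forest = IsolationForest(random_state=0).fit(X)
score = forest.score_samples(X)        # higher = more typical

# Option 1 from the text: threshold at the expected anomaly frequency (~2% assumed).
threshold = np.quantile(score, 0.02)
is_anomaly = score < threshold
```

The alternative described in the text, picking the threshold where the shape of the score distribution changes, would replace the quantile with a value read off a histogram of `score`.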
An important observation: anomaly detection is best applied to each activity class separately, otherwise rare action classes get flagged as anomalies.
For the "movement" class, we were able to separate and identify a number of anomalies; an assessor reviewing the flagged intervals described them as follows:
- the worker put down a tape measure and a pencil and picked up clothing along the way (a movement outside the typical movements of workers);
- the employee kicks a cart (hazardous activity);
- shakes his head at the camera and waves his hands in front of it (actions clearly outside the scope of work operations at the facility);
- manipulates the sensor on his left hand (incorrect actions).
For the mason's work, the following anomalies were recorded:
- for some reason, the employee hits unidentified plates with a hammer (incorrect actions);
- lies on the floor and looks at the gap between the floor and a panel (atypical behavior);
- first tries to shake something off his cap, then takes it off and shakes it out (actions clearly outside the scope of work operations at the facility).
In the course of experiments with anomaly detection, we even identified one case of drunkenness at the workplace: the drunk worker's anomalies occurred in time intervals associated with movement. Another source of anomalies was the work of a male plasterer, while the training sample contained only female plasterers.
We continue to develop the project and run various experiments with neural networks, waiting for the moment when they finally beat gradient boosting. We plan to move from the classification problem to the segmentation problem. We are working on methods for cleaning the assessors' labels, adding readings from new sensors to the data, and experimenting with recognizing joint work. In addition, we are expanding the scope of our monitoring and mastering new professions.
With those who have read the article to the end, I want to share the conclusions our team and I drew in the process of working on the project.
- When working with physical processes, pay special attention to data cleanliness, since the data can contain all kinds of gaps, outliers, etc. One solution to the raw data problem is a Gaussian process model.
- Good augmentation can help raise model quality metrics. In augmentation, you can move from simple methods to more complex ones:
- various mixing and splicing of samples;
- autoencoders;
- generative adversarial networks (arxiv.org/abs/1706.02390).
- If you master one of the anomaly detection tools, it will come in handy at various stages of a Data Science project:
○ at the preliminary analysis stage, you can exclude outliers;
○ at the model development stage, you can find objects with questionable labels;
○ at the stage of setting up model monitoring in production, you can detect moments when the data changes significantly relative to the training data.
I will be glad to discuss the article in the comments and will be happy to answer your questions.
The article was written in collaboration with olegkafanov.