Volleyball Serve Recognition with Machine Learning


Artificial intelligence is developing rapidly, and its range of applications keeps expanding into areas that previously had little to do with IT.

Sports are a good example of such expansion.

The term "sports tech" appeared not long ago, and the number of projects in the field has grown significantly over the past few years.

Volleyball is a promising direction in sports analytics: it is one of the most popular sports and is played in many countries.

So, we have a video of a volleyball game. Why is such a video usually recorded? Perhaps to show it to friends or to rewatch the best moments on long winter evenings. But in its raw form the recording is probably not well suited for this: most likely the players spend the first ten minutes changing and warming up, and after each rally a minute or two passes while someone runs after the ball.

So we arrive at the obvious goal: get rid of everything boring and keep only the interesting parts. OK, the strategy is clear, so let’s move on to tactics.

For an outside observer (here, of course, the artificial intelligence) there are several objects that attract attention: the players, the ball, the referee, the scoreboard. Any of them can be analyzed, but today we will talk about the ball.

The connection between spectator interest and the ball is fairly obvious: while the ball is flying, we watch; when there is no ball, there is nothing to look at. So we need to cut out all the frames where the ball is not in flight, and then the video can be watched without yawning.

Ball recognition and tracking

In one of my previous articles I described an implementation of this approach using computer vision.

The algorithm recognized the ball in the air. If the ball was lost for 5 seconds, the rally was considered finished, and a new rally started from the moment the ball was seen in the air again.
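The 5-second rule can be sketched in a few lines. This is a minimal illustration, not the code from the previous article: the function name `split_into_rallies` and the input format (a list of timestamps, in seconds, at which the ball was detected in the air) are assumptions made for the example.

```python
from typing import List, Tuple


def split_into_rallies(ball_times: List[float],
                       max_gap: float = 5.0) -> List[Tuple[float, float]]:
    """Group ball-detection timestamps (seconds) into (start, end) rallies.

    A rally is closed when the ball has not been seen for more than
    max_gap seconds; the next detection opens a new rally.
    """
    rallies: List[Tuple[float, float]] = []
    start = prev = None
    for t in sorted(ball_times):
        if start is None:
            start = prev = t              # first detection opens a rally
        elif t - prev > max_gap:
            rallies.append((start, prev))  # gap too long: close the rally
            start = prev = t
        else:
            prev = t                       # ball still tracked: extend
    if start is not None:
        rallies.append((start, prev))
    return rallies


# Hypothetical detection times, in seconds from the start of the video.
detections = [12.0, 12.5, 13.1, 15.0, 24.0, 24.4, 26.0, 40.2]
print(split_into_rallies(detections))
```

The sketch also makes the weak spots visible: any ball movement opens a "rally", and pauses shorter than `max_gap` merge two rallies into one.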

It worked to an extent and a lot of filler was cut out, but there were drawbacks:

  • It turned out that the ball often moves outside of play. Tossing the ball to a partner, bouncing it on the ground, warming up during a break: all of it counts, but none of it is interesting.

  • If the pauses between rallies are short (serious volleyball players do not run after the ball, they have ball boys), the rallies stick together into one big rally, which slightly spoils the ideal picture of the world.

  • Recognition failures, which are a separate issue.

So, in addition to seeing the ball, we need to somehow understand what is happening and use that to our advantage.

Trajectory recognition

Anything can happen in volleyball, but there is one fundamental action without which there is no game at all: the serve. The shortest possible rally consists of a single serve (how interesting such fragments are is another question).

The serve gives us a lot of information:

  • It unambiguously separates the rallies

  • It shows who won the previous rally

  • Its absence hints that the fragment is most likely not part of the game.

So it would be nice to be able to recognize the serve. For now, all we have is a pile of trajectories.

Volleyball is an endlessly varied game, and you can capture it on camera in even more diverse ways.

Worldwide, the most popular approach is to put the camera behind the court. Russia has its own way, and most videos there are shot from the side (although recent videos show a trend towards the global approach). Sometimes there are other options: the camera is above the court or in one of its corners.

The trouble is that a trajectory that is a serve from the far side when seen from one angle can, from another angle, be a neat reception to the right of the net or an accidental rebound.

After experimenting, I decided that recognizing trajectories for all angles is somewhat utopian and concentrated on the most common option (camera behind the court), adding serves to the right/left for the side angle.

Even here, not everything was clear-cut: the camera position varied in height and in distance from the edge of the court, which produced somewhat different trajectories, so I had to go through a lot of videos to train the classifier to acceptable results.

Technical details

Let’s assume that a ball trajectory can be one of 6 types:

  • Serve from the near side

  • Serve from the far side

  • Receive/pass

  • Attack

  • Toss (a pass outside of play)

  • Invalid trajectory (it happens)

In fact, you can come up with many more types, but since it is the serve we care about, the rest is not so important.

The trajectory classes can differ from each other in span, ball speed, steepness of the parabola, and many other things. Rather than hand-writing chains of conditions, this work can be handed over to the computer.

Before us is the task of classification – one of the main applications of machine learning.

The question is in what form to pass the trajectory data to the classifier. There are several options:

  • Draw pictures of trajectories

  • Take a fixed number of points from trajectories

  • Extract some features (for example: speed, ball size, steepness of the parabola)

Experiments showed that the third option works best, although it is quite possible that I simply do not know how to handle the first two properly.
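To make the third option concrete, here is a minimal sketch of feature extraction using only the standard library. The track format (time, x, y, ball radius in pixels), the function names, and the exact feature set are assumptions for illustration; they are not the features used in the dataset. Steepness is taken as the leading coefficient of a least-squares parabola fitted to the ball positions.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical track format: (time in s, x in px, y in px, ball radius in px)
Point = Tuple[float, float, float, float]


def _det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def parabola_steepness(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Coefficient a of the least-squares fit y = a*x^2 + b*x + c.

    Returns 0.0 when the fit is undetermined (fewer than 3 points or no
    horizontal movement). In image coordinates y grows downwards, so the
    sign of a depends on that convention.
    """
    n = len(xs)
    if n < 3:
        return 0.0
    mean_x = sum(xs) / n
    u = [x - mean_x for x in xs]  # centering x does not change a
    s1, s2 = sum(u), sum(v ** 2 for v in u)
    s3, s4 = sum(v ** 3 for v in u), sum(v ** 4 for v in u)
    m = [[s4, s3, s2], [s3, s2, s1], [s2, s1, float(n)]]
    rhs = [sum(v * v * y for v, y in zip(u, ys)),
           sum(v * y for v, y in zip(u, ys)),
           sum(ys)]
    det = _det3(m)
    if abs(det) < 1e-9:
        return 0.0
    # Cramer's rule: replace the first column with the right-hand side.
    m_a = [[rhs[i], m[i][1], m[i][2]] for i in range(3)]
    return _det3(m_a) / det


def extract_features(track: List[Point]) -> Dict[str, float]:
    """Turn one ball trajectory into a flat feature dictionary."""
    if not track:
        raise ValueError("empty track")
    ts = [p[0] for p in track]
    xs = [p[1] for p in track]
    ys = [p[2] for p in track]
    rs = [p[3] for p in track]
    duration = ts[-1] - ts[0]
    path = sum(math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
               for i in range(len(xs) - 1))
    return {
        "duration": duration,
        "mean_speed": path / duration if duration > 0 else 0.0,
        "x_span": max(xs) - min(xs),
        "y_span": max(ys) - min(ys),
        "mean_radius": sum(rs) / len(rs),
        "steepness": parabola_steepness(xs, ys),
    }


# Made-up track: the ball rises and falls while moving to the right.
demo = [(0.0, 100.0, 400.0, 9.0), (0.1, 180.0, 310.0, 8.5),
        (0.2, 260.0, 270.0, 8.0), (0.3, 340.0, 280.0, 7.5),
        (0.4, 420.0, 340.0, 7.0)]
print(extract_features(demo))
```

Each trajectory becomes one fixed-length row, which is the shape most off-the-shelf classifiers expect.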

The pains of choosing features deserve a separate article; this is an open question and an endless field for experiments. The result was about 70% success, but if the classifier is trained on similar games, it rises to 80%.

On GitHub I posted one of the early versions of the dataset, where the classifier is selected using lazypredict.

Random Forest came out on top, although I had initially planned to use the trendy LGBM. Later I also compared RF with KNN and SVM, but neither could compete: KNN scored low from the start, and SVM held up longer but eventually fell behind too.

All further training and testing used RF only. The open question was the number of estimators, but beyond 50 the metrics stopped improving.

70-80% is certainly not enough, but there is room for improvement through a deeper study of the features. Motivation is high, since serve recognition is an important, perhaps even key, step towards understanding game situations; without it, any further analytics looks very uncertain.

Applying a classifier in a real game

From here on, it is all about volleyball.

The best results are obtained when the classifier is trained on a reference game and tested on a similar game with the same camera setup and the same level of players. The level matters: trajectories from amateurs on the beach and from an Olympic final differ greatly. In the second case serves and attacks are barely visible and passes are sharp, so the two have almost nothing in common.

Of course, classification errors happen.

False negatives (a serve is not recognized) can be corrected with a certain amount of forcing the data to fit. As a rule, the serve is present among the candidates, but its score loses to another trajectory type. In this case it helps to pick the trajectory with the maximum score for the serve classes.
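A minimal sketch of this fallback, assuming each trajectory in a rally comes with a dictionary of per-class scores (for example, class probabilities from the classifier). The class names `serve_near` and `serve_far` and the function names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical class names for the two serve types.
SERVE_CLASSES = ("serve_near", "serve_far")


def serve_score(scores: Dict[str, float]) -> float:
    """Total score a trajectory gets across the serve classes."""
    return sum(scores.get(c, 0.0) for c in SERVE_CLASSES)


def find_serve(rally: List[Dict[str, float]]) -> Optional[int]:
    """Index of the trajectory to treat as the serve, or None if empty.

    If some trajectory is already classified as a serve (its top class is
    a serve class), the first such trajectory wins. Otherwise fall back to
    the trajectory with the highest serve score, even though another class
    beat it.
    """
    if not rally:
        return None
    for i, scores in enumerate(rally):
        if max(scores, key=scores.get) in SERVE_CLASSES:
            return i
    return max(range(len(rally)), key=lambda i: serve_score(rally[i]))


# Made-up scores: no trajectory wins as a serve, but the first is closest.
rally = [
    {"serve_near": 0.35, "pass": 0.45, "attack": 0.20},
    {"serve_far": 0.10, "pass": 0.70, "attack": 0.20},
    {"serve_near": 0.05, "pass": 0.15, "attack": 0.80},
]
print(find_serve(rally))
```

The fallback always returns some trajectory, which is exactly why it cannot help with fragments that contain no serve at all.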

False positives (another trajectory is recognized as a serve) are worse. One could again pick the trajectory with the highest serve-class score, but the problem is that several real serves can fall into one rally at once (a common situation in edited videos, where the pauses between rallies have already been cut out). This situation is hard to distinguish from a false negative, since a gap between trajectories can be caused either by a pause in the game or by a failure to recognize the ball. There is still work to be done here.

As an example, let’s look at the 2020 beachgirls game Bocharova/Voronina – Motrich/Shalaevskaya: good quality, a good angle, and a score on the screen (the latter quickly lost its meaning, since the players periodically switch sides).

An interesting detail at the start: the first two rallies are covered by a splash screen, but this did not prevent the algorithm from parsing them and correctly recognizing the serves.

The score, as already mentioned, does not reflect the real situation because of the side changes, but it still provides some information for cross-checking: according to the scoreboard in the last rally, the match consisted of two sets that ended 21:15 and 21:13, that is, there were 60 rallies in the game.

The algorithm found 58 rallies and flagged several suspicious ones that contained no serve or more than one (although this was not enough to meet the criteria for cutting them).
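A check for suspicious rallies might look like the sketch below. It assumes every rally is already a list of predicted class labels, one per trajectory; the label names and the function name are hypothetical.

```python
from typing import List

# Hypothetical labels for the two serve classes.
SERVE_LABELS = {"serve_near", "serve_far"}


def suspicious_rallies(rallies: List[List[str]]) -> List[int]:
    """Indices of rallies that contain no serve or more than one serve."""
    flagged = []
    for i, labels in enumerate(rallies):
        serves = sum(1 for label in labels if label in SERVE_LABELS)
        if serves != 1:
            flagged.append(i)
    return flagged


# Made-up predictions for four rallies.
rallies = [
    ["serve_near", "pass", "attack"],                     # normal
    ["pass", "attack", "pass"],                           # no serve
    ["serve_far", "pass", "serve_near", "pass"],          # two serves
    ["serve_far"],                                        # ace or error
]
print(suspicious_rallies(rallies))
```

Flagging instead of cutting keeps the decision reversible: a rally with two serves may be two real rallies glued together rather than a mistake.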

Overall, not bad, but could still be improved.

As for trimming the video: 40 minutes of footage turned into 15 minutes of play.

If anyone is interested in the rallies themselves, they are here.

