Detecting a free parking space using computer vision

Initially, the idea was as follows: a computer vision model watches the parking lot through a webcam installed at home, tracks the parking spaces, and notifies me via a Telegram bot when one becomes free. We will work in Python.

So the spec has been written (by me, for me); now let's get down to business!

The first decision was which object detection model to use. At first I chose Fast R-CNN, and it showed good detection quality. However, after a few days of procrastinating over the implementation, I decided to go with a more modern and interesting approach and plugged in a YOLO detector (I took version 4, which is not the newest).



With the detector chosen and the agonizing over the project behind us, we can start assembling.

import cv2
import numpy as np
import pandas as pd
from art import tprint
import matplotlib.pylab as plt
import requests

1) We connect the camera using OpenCV. I did the development on a pre-recorded video, but if you are working with a webcam, you just need to pass 0 to cv2.VideoCapture(). After that we take each frame of the video and run it through the model.

# Initialize the video capture
video_capture = cv2.VideoCapture(video_path)

# Keep running until the q key is pressed (the key check is not part of this excerpt)
while video_capture.isOpened():
    ret, image_to_process = video_capture.read()
    if not ret:
        # No frame was returned (for example, the video has ended)
        break

    # Preprocess the image and run YOLO
    # (net and out_layers, the loaded network and its output layer names, are set up beforehand)
    height, width, _ = image_to_process.shape
    blob = cv2.dnn.blobFromImage(image_to_process, 1 / 255, (608, 608),
                                 (0, 0, 0), swapRB=True, crop=False)
    net.setInput(blob)
    outs = net.forward(out_layers)
    class_indexes, class_scores, boxes = ([] for i in range(3))

    # Detect objects in the frame
    for out in outs:
        for obj in out:
            scores = obj[5:]
            class_index = np.argmax(scores)

2) The next step is the YOLO detector itself. YOLO can detect 80 object classes, but we only need cars, so we discard everything else and keep only the bounding boxes of objects of the car class.

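            # Class index 2 is "car": keep only cars
            if class_index == 2:
                class_score = scores[class_index]
                if class_score > 0:
                    # YOLO returns the box center and size relative to the frame
                    center_x = int(obj[0] * width)
                    center_y = int(obj[1] * height)
                    obj_width = int(obj[2] * width)
                    obj_height = int(obj[3] * height)
                    # Convert to [left, top, width, height] in pixels
                    box = [center_x - obj_width // 2, center_y - obj_height // 2,
                           obj_width, obj_height]
                    boxes.append(box)
                    class_indexes.append(class_index)
                    class_scores.append(float(class_score))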
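
To make the coordinate arithmetic easier to check in isolation, here is a minimal standalone sketch of the same filtering step. The function name car_box_from_detection, the constant CAR_CLASS_INDEX and the min_score parameter are my own illustrative names, not part of the original program, and the detection row is made up rather than real YOLO output.

```python
import numpy as np

CAR_CLASS_INDEX = 2  # position of 'car' in the class list used by the detector


def car_box_from_detection(obj, frame_width, frame_height, min_score=0.0):
    """Return [left, top, width, height] in pixels for a car detection, else None.

    obj is one YOLO output row: [cx, cy, w, h, objectness, score_0, ..., score_79],
    where cx, cy, w, h are fractions of the frame size.
    """
    scores = obj[5:]
    class_index = int(np.argmax(scores))
    if class_index != CAR_CLASS_INDEX or scores[class_index] <= min_score:
        return None
    center_x = int(obj[0] * frame_width)
    center_y = int(obj[1] * frame_height)
    box_width = int(obj[2] * frame_width)
    box_height = int(obj[3] * frame_height)
    return [center_x - box_width // 2, center_y - box_height // 2, box_width, box_height]


# A made-up detection: a car centred in a 1280x720 frame, a quarter of the frame in size
row = np.zeros(85, dtype=np.float32)
row[:4] = [0.5, 0.5, 0.25, 0.25]
row[5 + CAR_CLASS_INDEX] = 0.9
print(car_box_from_detection(row, 1280, 720))  # [480, 270, 320, 180]
```

Raising min_score above zero would be the natural place to drop weak detections, which is an assumption about tuning rather than something the original code does.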


For reference, here are the objects that YOLOv4 can detect (next time I will detect a giraffe riding a snowboard).

['person', 'bicycle', 'car', 'motorbike', 'aeroplane', 'bus',
'train', 'truck', 'boat', 'traffic light','fire hydrant', 'stop sign',
'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag',
'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite',
'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 
'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 
'donut', 'cake', 'chair', 'sofa', 'pottedplant', 'bed', 'diningtable', 
'toilet', 'tvmonitor', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 
'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']

3) Now the creative part begins. What do we count as parking spaces? The simplest and most logical option is to take the places where cars are standing! That is, under every car detected in the frame there is a parking space. For a start this approach is good enough (there will always be time to complicate things).

Parking spaces defined in the first frame


All the bounding boxes detected in the first frame are written to the variable first_frame_parking_spaces. These are our parking spaces (surprisingly, there were empty spaces in the parking lot when I recorded the video; normally everything is always occupied). We store the parking spaces in a variable that we do not touch until the very end of the program (they are carved in stone, this is our holy grail, the Olivier salad made for New Year's that nobody is allowed to touch).

if not first_frame_parking_spaces:
    # Assume there is a parking space under every car
    first_frame_parking_spaces = boxes
    first_frame_parking_score = class_scores

4) Now we detect the cars themselves in each frame. This is the dynamic part of the program, and it is where the main difficulties arise.

How do we determine that a car is parked in a space? We compare how much their bounding boxes overlap, that is, we use Intersection over Union (IoU).
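
IoU is the area of the intersection of two boxes divided by the area of their union. The compute_overlaps function used later in this article is not shown here, so the following is only a sketch of what such a function could look like. The name pairwise_iou and the [left, top, width, height] box layout are my assumptions.

```python
import numpy as np


def pairwise_iou(spaces, cars):
    """Return a (len(spaces), len(cars)) matrix of IoU values.

    Both inputs hold boxes as [left, top, width, height] in pixels.
    """
    spaces = np.asarray(spaces, dtype=float).reshape(-1, 4)
    cars = np.asarray(cars, dtype=float).reshape(-1, 4)

    # Corner coordinates, shaped so that spaces broadcast against cars
    s_x1, s_y1 = spaces[:, None, 0], spaces[:, None, 1]
    s_x2, s_y2 = s_x1 + spaces[:, None, 2], s_y1 + spaces[:, None, 3]
    c_x1, c_y1 = cars[None, :, 0], cars[None, :, 1]
    c_x2, c_y2 = c_x1 + cars[None, :, 2], c_y1 + cars[None, :, 3]

    # Width and height of the intersection rectangle (zero if the boxes do not overlap)
    inter_w = np.clip(np.minimum(s_x2, c_x2) - np.maximum(s_x1, c_x1), 0, None)
    inter_h = np.clip(np.minimum(s_y2, c_y2) - np.maximum(s_y1, c_y1), 0, None)
    intersection = inter_w * inter_h

    area_spaces = spaces[:, None, 2] * spaces[:, None, 3]
    area_cars = cars[None, :, 2] * cars[None, :, 3]
    union = area_spaces + area_cars - intersection
    # Guard against division by zero for degenerate (zero-area) boxes
    return np.divide(intersection, union, out=np.zeros_like(intersection), where=union > 0)


spaces = [[0, 0, 100, 50], [200, 0, 100, 50]]
cars = [[0, 0, 100, 50], [500, 500, 100, 50]]
print(pairwise_iou(spaces, cars))
```

Each row of the result belongs to one parking space, so the maximum of a row tells how strongly that space is covered by its best-matching car. For instance, a car box shifted sideways by half its width against a space of the same size gives an IoU of 1/3.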



Since the detector returns cars in no particular order, we compare every parking space with every car in the frame. If a car in the frame occupies a parking space, the IoU will be approximately 0.8-0.9; in all other cases it will be 0.0, something like this:

IoU = 0.83, space taken


Then, when a car leaves, the maximum overlap between the car bounding boxes and the parking space's bounding box decreases, and once it drops below a certain threshold we can say that the space has been freed. Logical? Logical! But… Here comes the first problem…

If we shoot straight from above, there are no questions: everything works as described. But at an angle this is what happens: the bounding boxes of cars can overlap neighboring parking spaces, so at the moment one space is vacated the model does not see it as completely free, because a car parked next to it overlaps two parking spaces at once (its own and the freed one).

Max IoU=0.35


Here’s what happens if we look at it in numbers:

Decreasing IoU value is a car leaving, and IoU=0.35 is a car standing next to it


Now the question is: how do we “pull out” the right IoU and tell the model that this is our car? Let’s add some filters. The first filter keeps only the cases where the maximum IoU for a space is greater than 0 and less than 0.4 (the lower bound protects against detection dropouts, when the model returns no bounding box for a car that is actually in the frame). The second filter cuts off the overlaps with an IoU below 0.15. By comparing the results of the two filters from frame to frame, we can tell that a bounding box first fell under the first condition and then the second condition started to hold as well. From that point we count frames, and if both conditions hold for 10 frames in a row, the space is free.

There is another problem: the spaces play leapfrog. If a bounding box suddenly appears that satisfies only the first condition, the frame counter still runs as if for a bounding box that satisfies both conditions. This is where the dancing with a tambourine begins. Unfortunately, we have to add one more (final) filter, which catches the leapfrogging bounding boxes and resets the free_parking_timer counter. I hope it becomes clearer when you look at the code below 🙂

overlaps = compute_overlaps(np.array(parking_spaces), np.array(cars_boxes))

for parking_space_one, area_overlap in zip(parking_spaces, overlaps):
    max_IoU = max(area_overlap)
    sort_IoU = np.sort(area_overlap[area_overlap > 0])[::-1]
    if not free_parking_space:
        if 0.0 < max_IoU < 0.4:

            # Number of overlaps matching condition 1: 0.0 < IoU < 0.4
            len_sort = len(sort_IoU)

            # Number of overlaps matching condition 2: IoU > 0.15
            sort_IoU_2 = sort_IoU[sort_IoU > 0.15]
            len_sort_2 = len(sort_IoU_2)

            # Check that both condition 1 and condition 2 are satisfied
            if (check_det_frame == parking_space_one) and (len_sort != len_sort_2):
                # Start counting consecutive frames with empty coordinates
                free_parking_timer += 1

            elif check_det_frame is None:
                check_det_frame = parking_space_one

                # Leapfrog filter (if the candidate space alternates, it "jumps")
                free_parking_timer_bag1 += 1
                if free_parking_timer_bag1 == 2:
                    # Reset the counter if the parking space "jumps"
                    check_det_frame = parking_space_one
                    free_parking_timer = 0

            # Once 10 frames in a row are counted, assume the space is free
            if free_parking_timer == 10:
                # Mark the space as free
                free_parking_space = True
                free_parking_space_box = parking_space_one
                # Unpack the box to draw the frame of the parking space
                x_free, y_free, w_free, h_free = parking_space_one

And when all three conditions have held for 10 frames, we can finally mark the selected bounding box as a free parking space and switch the free_parking_space flag to True.
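
The "N frames in a row" idea can be looked at separately from the rest of the logic. The sketch below is a deliberately simplified illustration, not the code from this project: it leaves out the neighbor and leapfrog filters and only shows how a streak counter that resets on any interruption suppresses single noisy frames. The function name first_free_frame and its parameters are hypothetical.

```python
def first_free_frame(max_iou_per_frame, low=0.15, frames_required=10):
    """Return the index of the frame at which the space is declared free, or None.

    max_iou_per_frame: for one parking space, the largest IoU with any car, frame by frame.
    The space "looks empty" in a frame when that value is at or below `low`.
    """
    streak = 0
    for frame_index, max_iou in enumerate(max_iou_per_frame):
        if max_iou <= low:
            streak += 1
        else:
            # Any frame that looks occupied breaks the run
            streak = 0
        if streak == frames_required:
            return frame_index
    return None


# A car drives away over a few frames, with one noisy frame in the middle
ious = [0.85, 0.6, 0.3, 0.1, 0.1, 0.5, 0.1, 0.1, 0.1]
print(first_free_frame(ious, frames_required=3))  # 8
```

With frames_required=3 the noisy frame at index 5 resets the streak, so the space is declared free only at index 8. A larger value trades notification delay for fewer false alarms.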

Model work


The reverse also has to be handled: if free_parking_space is True but the parking space gets occupied, then once again we have no free space 🙁

# If the space gets taken, mark that there are no free spaces again
overlaps = compute_overlaps(np.array([free_parking_space_box]),
                            np.array(cars_boxes))
for area_overlap in overlaps:
    max_IoU = max(area_overlap)
    if max_IoU > 0.6:
        free_parking_space = False
        telegram_message = False

Only a little remains: hooking up Telegram for notifications. I will not describe how to do this in this article; I will only give a code snippet with the necessary functions.

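TOKEN = "…"
chat_id = "…"

# Function for sending a photo to Telegram
def send_photo_file(chat_id, img):
    with open(img, 'rb') as photo:
        files = {'photo': photo}
        requests.post(f'https://api.telegram.org/bot{TOKEN}/sendPhoto?chat_id={chat_id}', files=files)

# Function for sending a message to Telegram
def send_telegram_message(message):
    # Assumed implementation using the Bot API sendMessage method
    requests.get(f'https://api.telegram.org/bot{TOKEN}/sendMessage',
                 params={'chat_id': chat_id, 'text': message})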
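
The telegram_message flag that is reset when the space gets occupied suggests that the bot should send one message per freed space rather than one per frame. How the flag is set is not shown in this article, so the following is only a sketch of that idea. The class FreeSpaceNotifier and its method names are hypothetical, and send stands in for a function such as send_telegram_message.

```python
class FreeSpaceNotifier:
    """Sends at most one notification each time the space becomes free."""

    def __init__(self, send):
        self.send = send  # callable that takes the message text
        self.message_sent = False

    def update(self, free_parking_space):
        if free_parking_space and not self.message_sent:
            self.send("A parking space is free!")
            self.message_sent = True
        elif not free_parking_space:
            # The space was taken again: allow a new notification next time
            self.message_sent = False


# Collect messages in a list instead of calling Telegram
sent = []
notifier = FreeSpaceNotifier(sent.append)
for is_free in [False, True, True, True, False, True]:
    notifier.update(is_free)
print(len(sent))  # 2
```

Passing the sender in as a callable keeps the notification logic testable without network access.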

That, in fact, is everything! The complete code is on my GitHub page.

Work of the program with informing via telegram


The tuning of the code is very delicate and, unfortunately, not universal. I’m sure new problems will come up with different footage. The model cannot detect several free parking spaces at the same time, cannot identify spaces that are already empty in the initial frame, and much more. But the foundation is laid and the main questions have been worked through. Perhaps better detection in the latest versions of YOLO would remove some of the issues (for example, the unexpected detection dropouts). The main logic, however, could be polished for a long time yet, but that belongs within commercial projects.

The development of this version took me three weeks of sluggish tinkering and two full weekends of intensive development (plus a day to write the article).

That’s all from me! I hope this article was useful; I will be grateful for comments and questions. In the future, I plan to implement several more interesting projects based on computer vision and neural networks.
