Computer vision lessons in Python + OpenCV from the very beginning. Part 10. My pet project
Let me remind you what steps were taken in the last lesson:
· Apply median filtering to the image.
· Perform binarization.
Today we will go a little further: extract the contours and find the license plate rectangle among them. First, let’s write a class that performs contour extraction:
class ContourProcessingStep(ImageProcessingStep):
    """Step responsible for contour extraction"""
    def process(self, info):
        """Run the processing step"""
        # OpenCV 4.x returns (contours, hierarchy); 3.x returned an extra image first
        contours, hierarchy = cv2.findContours(info.image, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        height, width = info.image.shape[:2]
        contours_image = np.zeros((height, width, 3), dtype=np.uint8)
        # draw the contours
        cv2.drawContours(contours_image, contours, -1, (255, 0, 0), 1, cv2.LINE_AA, hierarchy, 1)
        # fill in the result
        new_info = ImageInfo(contours_image)
        new_info.contours = contours
        new_info.hierarchy = hierarchy
        return new_info
These contours need to be approximated, so we will write the following class:
class ContourApproximationProcessingStep(ImageProcessingStep):
    """Step responsible for contour approximation"""
    def __init__(self, eps=0.005, filter=None):
        """Constructor
        eps - approximation tolerance as a fraction of the contour's arc length
        filter - optional function that returns True for contours to keep"""
        self.eps = eps
        self.filter = filter
    def process(self, info):
        """Run the processing step"""
        approx_countours = []
        img_contours = np.uint8(np.zeros((info.image.shape[0], info.image.shape[1])))
        for countour in info.contours:
            arclen = cv2.arcLength(countour, True)
            epsilon = arclen * self.eps
            approx = cv2.approxPolyDP(countour, epsilon, True)
            append = False
            if not (self.filter is None):
                if self.filter(approx):
                    append = True
            else:
                append = True
            if append:
                approx_countours.append(approx)
        # draw every contour that was kept
        cv2.drawContours(img_contours, approx_countours, -1, (255, 255, 255), 1)
        # fill in the result
        new_info = ImageInfo(img_contours)
        new_info.contours = approx_countours
        return new_info
The filter parameter takes a reference to a function that selects contours by some criterion (we only need rectangular contours).
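To get a feel for what eps controls, here is a simplified pure-Python sketch of the Douglas-Peucker idea, which the OpenCV documentation names as the basis of cv2.approxPolyDP. It handles only an open polyline, and the names point_segment_distance and simplify are made up for this illustration; it is not OpenCV's actual implementation.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Projection of p onto the segment, clamped to its ends
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def simplify(points, epsilon):
    """Drop points that lie closer than epsilon to the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= epsilon:
        return [first, last]
    # Keep the farthest point and simplify both halves recursively
    left = simplify(points[:index + 1], epsilon)
    right = simplify(points[index:], epsilon)
    return left[:-1] + right

# A slightly wobbly "L" shape: right along x, then up along y
polyline = [(0, 0), (10, 1), (20, -1), (30, 0), (31, 10), (29, 20), (30, 30)]
length = sum(math.hypot(b[0] - a[0], b[1] - a[1])
             for a, b in zip(polyline, polyline[1:]))

coarse = simplify(polyline, 0.02 * length)   # eps = 0.02, as in the test script
fine = simplify(polyline, 0.005 * length)    # eps = 0.005, the class default
print(coarse)     # prints [(0, 0), (30, 0), (30, 30)]
print(len(fine))  # prints 7
```

With the larger tolerance the wobble disappears and only the corner survives, which is why a larger eps makes a bumpy plate outline more likely to collapse to four vertices.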
Let’s test it, first without a filter:
import cv2
from Libraries.Core import Engine
from Libraries.ImageProcessingSteps import MedianBlurProcessingStep, ThresholdProcessingStep, ContourProcessingStep, \
    ContourApproximationProcessingStep
def my_filter(approx):
    if len(approx)==4:
        return True
    return False
my_photo = cv2.imread('../Photos/car.jpg')
core=Engine()
core.steps.append(MedianBlurProcessingStep(5))
core.steps.append(ThresholdProcessingStep())
core.steps.append(ContourProcessingStep())
core.steps.append(ContourApproximationProcessingStep(0.02))
#core.steps.append(ContourApproximationProcessingStep(0.02,my_filter))
res,history=core.process(my_photo)
i=1
for info in history:
    cv2.imshow('image'+str(i), info.image) # show the image in a window
    i=i+1
cv2.imshow('res', res.image)
cv2.waitKey()
cv2.destroyAllWindows()
Let’s see what we got:

Here, for clarity, I used a scaled-down photo of the car. Let’s try to process a full-size photo:

So, we see something like a rectangle (yes, it’s crooked, but the other shapes don’t look like a rectangle at all).
Now the question arises: how do we find our “rectangle” among all these lines? First, let’s apply a filter (this is exactly what it is for), discarding all shapes that are not quadrilaterals. As you may have noticed, the program text already contains such a function:
def my_filter(approx):
    if len(approx)==4:
        return True
    return False
All that remains is to change this line of code:
core.steps.append(ContourApproximationProcessingStep(0.02))
to this one:
core.steps.append(ContourApproximationProcessingStep(0.02,my_filter))
And voila, we are left with only quads:

As you can see, there are significantly fewer objects left, but there is still a lot of garbage. Let’s filter it by removing objects that are too small:
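def my_filter(approx):
    if len(approx)==4:
        # reject shapes narrower than 10 pixels (x distance between vertices 0 and 2)
        if abs(approx[2,0,0]-approx[0,0,0])<10:
            return False
        # reject shapes lower than 10 pixels (y distance between vertices 0 and 2)
        if abs(approx[2,0,1]-approx[0,0,1])<10:
            return False
        return True
    return False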
And here’s what we got:

Only 5 objects are left. In theory, of course, additional filtering can be applied, for example by excluding objects with the wrong length-to-width ratio (a license plate has specific dimensions according to GOST, which means its length-to-width ratio is also fixed). You can also exclude obviously “crooked rectangles”, in which the difference in the lengths of opposite sides is far above the margin of error. Keep in mind, though, that in trying to filter out unnecessary objects you can also throw away the ones you need. So you have to be careful here.
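As a rough illustration of the ratio check mentioned above, here is a minimal sketch. It assumes the standard single-row Russian plate of 520 x 112 mm (GOST R 50577); the helper names and the 35% tolerance are invented for this example and would need tuning on real photos, since perspective distorts the ratio.

```python
import math

# Assumption: standard single-row plate, 520 x 112 mm (GOST R 50577)
PLATE_RATIO = 520 / 112

def side_length(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def has_plate_like_ratio(quad, target=PLATE_RATIO, tolerance=0.35):
    """quad is a sequence of four (x, y) vertices in contour order."""
    sides = [side_length(quad[i], quad[(i + 1) % 4]) for i in range(4)]
    # Average the opposite sides to smooth out a slightly crooked shape
    a = (sides[0] + sides[2]) / 2
    b = (sides[1] + sides[3]) / 2
    long_side, short_side = max(a, b), min(a, b)
    if short_side == 0:
        return False
    ratio = long_side / short_side
    return abs(ratio / target - 1) <= tolerance

plate_like = [(0, 0), (104, 0), (104, 22), (0, 22)]   # about 4.7 : 1
square = [(0, 0), (50, 0), (50, 50), (0, 50)]         # 1 : 1
print(has_plate_like_ratio(plate_like))  # prints True
print(has_plate_like_ratio(square))      # prints False
```

For a contour returned by cv2.approxPolyDP, the four vertices can be passed as approx.reshape(-1, 2).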
However, let’s try. First, let’s write a function for calculating the length of the sides:
import math
def dist(point1,point2):
    d1 = point1[0] - point2[0]
    d2 = point1[1] - point2[1]
    return math.sqrt(d1*d1+d2*d2)
And make changes to the filter:
def my_filter(approx):
    if len(approx)==4:
        if abs(approx[2,0,0]-approx[0,0,0])<10:
            return False
        if abs(approx[2,0,1]-approx[0,0,1])<10:
            return False
        # opposite sides must not differ in length by more than 40%
        if abs(dist(approx[0,0],approx[1,0])/dist(approx[2,0],approx[3,0])-1)>0.4:
            return False
        if abs(dist(approx[0,0],approx[3,0])/dist(approx[1,0],approx[2,0])-1)>0.4:
            return False
        return True
    return False
Here’s what happened:

As you can see, only three objects are left. One of them, by the way, could be filtered out by checking its corners (if an angle deviates strongly from 90 degrees). But we will not do this yet; we will assume that one extra object is not critical.
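For reference, the right-angle check mentioned above could look roughly like this. It is only a sketch: the function names and the 20-degree threshold are assumptions made for this example, not part of the project.

```python
import math

def corner_angles(quad):
    """Interior angle in degrees at each vertex of a polygon given as (x, y) points."""
    angles = []
    n = len(quad)
    for i in range(n):
        prev_pt, cur, next_pt = quad[i - 1], quad[i], quad[(i + 1) % n]
        v1 = (prev_pt[0] - cur[0], prev_pt[1] - cur[1])
        v2 = (next_pt[0] - cur[0], next_pt[1] - cur[1])
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0:
            angles.append(0.0)  # degenerate corner (repeated vertex)
            continue
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
        cos_a = max(-1.0, min(1.0, cos_a))  # guard against rounding errors
        angles.append(math.degrees(math.acos(cos_a)))
    return angles

def is_roughly_rectangular(quad, max_deviation_deg=20.0):
    """True if every corner is within max_deviation_deg of 90 degrees."""
    return all(abs(a - 90.0) <= max_deviation_deg for a in corner_angles(quad))

rectangle = [(0, 0), (100, 0), (100, 40), (0, 40)]
parallelogram = [(0, 0), (100, 0), (160, 40), (60, 40)]
print(is_roughly_rectangular(rectangle))      # prints True
print(is_roughly_rectangular(parallelogram))  # prints False
```

Inside my_filter this could be called with approx.reshape(-1, 2), after the check that the contour has four vertices.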
Let’s visualize the plate candidates we found. To do this, we need to extract points from the contour. Here is how to extract two opposite vertices of the first contour:
x1=res.contours[0][0][0][0]
y1=res.contours[0][0][0][1]
x2=res.contours[0][2][0][0]
y2=res.contours[0][2][0][1]
cv2.rectangle(finish_result,(x1,y1),(x2,y2),(255,0,0),3)
And this is what will be drawn in the end:

Of course, this is not the way to do it. We should at least write a function that extracts these points:
def get_rect(countur_item):
    x1 = countur_item[0][0][0]
    y1 = countur_item[0][0][1]
    x2 = countur_item[2][0][0]
    y2 = countur_item[2][0][1]
    return (x1,y1), (x2,y2)
And then we can draw the first shape like this:
p1,p2=get_rect(res.contours[0])
cv2.rectangle(finish_result,p1,p2,(255,0,0),3)
And all the shapes like this:
for item in res.contours:
    p1,p2=get_rect(item)
    cv2.rectangle(finish_result,p1,p2,(255,0,0),3)
And here’s what happens:

So now we only need to analyze these three areas and look for letters and digits in them. But first I would like to clean up the code. This is what our run2.py file looks like:
import cv2
import math
from Libraries.Core import Engine
from Libraries.ImageProcessingSteps import MedianBlurProcessingStep, ThresholdProcessingStep, ContourProcessingStep, \
    ContourApproximationProcessingStep
def dist(point1,point2):
    d1 = point1[0] - point2[0]
    d2 = point1[1] - point2[1]
    return math.sqrt(d1*d1+d2*d2)
def my_filter(approx):
    if len(approx)==4:
        if abs(approx[2,0,0]-approx[0,0,0])<10:
            return False
        if abs(approx[2,0,1]-approx[0,0,1])<10:
            return False
        if abs(dist(approx[0,0],approx[1,0])/dist(approx[2,0],approx[3,0])-1)>0.4:
            return False
        if abs(dist(approx[0,0],approx[3,0])/dist(approx[1,0],approx[2,0])-1)>0.4:
            return False
        return True
    return False
def get_rect(countur_item):
    x1 = countur_item[0][0][0]
    y1 = countur_item[0][0][1]
    x2 = countur_item[2][0][0]
    y2 = countur_item[2][0][1]
    return (x1,y1), (x2,y2)
my_photo = cv2.imread('../Photos/6108249.jpg')
#my_photo = cv2.imread('../Photos/car.jpg')
core=Engine()
core.steps.append(MedianBlurProcessingStep(5))
core.steps.append(ThresholdProcessingStep())
core.steps.append(ContourProcessingStep())
#core.steps.append(ContourApproximationProcessingStep(0.02))
core.steps.append(ContourApproximationProcessingStep(0.02,my_filter))
res,history=core.process(my_photo)
i=1
for info in history:
    cv2.imshow('image'+str(i), info.image) # show the image in a window
    i=i+1
cv2.imshow('res', res.image)
finish_result = history[0].image.copy()
for item in res.contours:
    p1,p2=get_rect(item)
    cv2.rectangle(finish_result,p1,p2,(255,0,0),3)
cv2.imshow('Finish', finish_result)
cv2.waitKey()
cv2.destroyAllWindows()
Not very pretty, so let’s do some refactoring. Let’s add a Utils.py file to the Libraries folder and move the get_rect and dist functions there, together with the loop that displays the processing history (as show_history). We import these functions:
from Libraries.Utils import dist, get_rect, show_history
Now the executable file looks like this:
import cv2
from Libraries.Core import Engine
from Libraries.ImageProcessingSteps import MedianBlurProcessingStep, ThresholdProcessingStep, ContourProcessingStep, \
    ContourApproximationProcessingStep
from Libraries.Utils import dist, get_rect, show_history
def my_filter(approx):
    if len(approx)==4:
        if abs(approx[2,0,0]-approx[0,0,0])<10:
            return False
        if abs(approx[2,0,1]-approx[0,0,1])<10:
            return False
        if abs(dist(approx[0,0],approx[1,0])/dist(approx[2,0],approx[3,0])-1)>0.4:
            return False
        if abs(dist(approx[0,0],approx[3,0])/dist(approx[1,0],approx[2,0])-1)>0.4:
            return False
        return True
    return False
my_photo = cv2.imread('../Photos/6108249.jpg')
#my_photo = cv2.imread('../Photos/car.jpg')
core=Engine()
core.steps.append(MedianBlurProcessingStep(5))
core.steps.append(ThresholdProcessingStep())
core.steps.append(ContourProcessingStep())
#core.steps.append(ContourApproximationProcessingStep(0.02))
core.steps.append(ContourApproximationProcessingStep(0.02,my_filter))
res,history=core.process(my_photo)
show_history(res,history)
finish_result = history[0].image.copy()
for item in res.contours:
    p1, p2 = get_rect(item)
    cv2.rectangle(finish_result, p1, p2, (255, 0, 0), 3)
cv2.imshow('Finish', finish_result)
cv2.waitKey()
cv2.destroyAllWindows()
And the utility file is like this:
import math
import cv2
def get_rect(countur_item):
    x1 = countur_item[0][0][0]
    y1 = countur_item[0][0][1]
    x2 = countur_item[2][0][0]
    y2 = countur_item[2][0][1]
    return (x1,y1), (x2,y2)
def dist(point1,point2):
    d1 = point1[0] - point2[0]
    d2 = point1[1] - point2[1]
    return math.sqrt(d1*d1+d2*d2)
def show_history(res,history):
    i = 1
    for info in history:
        cv2.imshow('image' + str(i), info.image) # show the image in a window
        i = i + 1
    cv2.imshow('res', res.image)
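One caveat about get_rect: it treats vertices 0 and 2 as the corners of an axis-aligned rectangle, so for a skewed quadrilateral the drawn box can cut off part of the shape. A possible alternative, sketched here with a hypothetical helper name, takes the minimum and maximum over all vertices (OpenCV's cv2.boundingRect computes a similar box, returned as x, y, width, height).

```python
def get_bounding_rect(points):
    """Axis-aligned box covering all (x, y) points, as two corner tuples."""
    # int() turns NumPy integers into plain Python ints for drawing calls
    xs = [int(p[0]) for p in points]
    ys = [int(p[1]) for p in points]
    return (min(xs), min(ys)), (max(xs), max(ys))

# A skewed quadrilateral: vertices 0 and 2 alone would give (12, 40)-(115, 62)
skewed = [(12, 40), (110, 30), (115, 62), (8, 70)]
p1, p2 = get_bounding_rect(skewed)
print(p1, p2)  # prints (8, 30) (115, 70)
```

With a contour from the pipeline, the points would be item.reshape(-1, 2), and p1, p2 can be passed to cv2.rectangle as before.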
Now we can think about how to “decode” the plate number. In this lesson, I will show how to recognize digits using a neural network, and we will write the recognizer itself in the next lesson.
So, meet His Majesty Keras. To install it on Windows, first install TensorFlow:
pip3 install tensorflow
and then Keras itself:
pip3 install Keras
And here is an example of neural network training code on the standard MNIST dataset built into Keras:
from keras import layers
from keras import models
from keras.datasets import mnist
import tensorflow as tf
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255
train_labels = tf.keras.utils.to_categorical(train_labels)
test_labels = tf.keras.utils.to_categorical(test_labels)
model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64)
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_acc)
Here a convolutional neural network is used. Its input is a grayscale picture of 28 by 28 pixels, and its output is a 10-element vector of probabilities that the picture shows each of the digits from 0 to 9. The model shows an accuracy of about 99% on the test sample.
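To see how the 28 by 28 input shrinks on its way through the network, the spatial sizes can be worked out by hand. The sketch below assumes the Keras defaults used in the listing: 'valid' padding and stride 1 for Conv2D, and a 2x2 window with stride 2 for MaxPooling2D. The helper names are made up for this example.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, window=2):
    """Output size of max pooling with stride equal to the window."""
    return size // window

size = 28
trace = [size]
for layer in ("conv", "pool", "conv", "pool", "conv"):
    size = conv_out(size) if layer == "conv" else pool_out(size)
    trace.append(size)

flattened = size * size * 64  # the last Conv2D layer has 64 filters
print(trace)      # prints [28, 26, 13, 11, 5, 3]
print(flattened)  # prints 576
```

So the Flatten layer hands a vector of 576 values to the Dense(64) layer, which is followed by the 10-way softmax.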
If you do not believe that MNIST really contains digits, you can check this by visualizing an element of the dataset:
from keras.datasets import mnist
import cv2
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
print(train_labels)
cv2.imshow("Digit", train_images[0])
cv2.waitKey(0)
cv2.destroyAllWindows()
For element number zero, we will see the digit 5:

And this element actually corresponds to label 5:

Let’s try feeding an image to the trained neural network. By the way, once the network is trained, it is a good idea to save it:
model.save('mnist_model.h5')
Now let’s load a picture of a digit and let the neural network recognize it:
import cv2
from keras import models
my_photo = cv2.imread('imgs/Digit0.png', cv2.IMREAD_GRAYSCALE)  # load the image
# bring the image to the format the network expects
normal_photo = my_photo / 255.0
input_data = normal_photo.reshape(1, 28, 28, 1)  # a batch of one 28x28 single-channel image
# feed the image to the network and get the result
model = models.load_model('mnist_model.h5')
result = model.predict(input_data)
print(result)
Here is the picture:

The output:
[[9.9990845e-01 1.4144711e-08 8.4316625e-08 3.7920216e-11 2.4454723e-06
4.7663391e-08 8.7873021e-05 4.1903621e-07 3.8488349e-08 6.0560058e-07]]
You can see that in the first cell (index 0, which corresponds to the digit 0) the probability is almost 1, and in the rest it is almost 0.
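Instead of reading the vector by eye, the recognized digit can be taken as the index of the largest probability. A minimal sketch using the output printed above (the variable names are chosen for this example):

```python
# The probability vector printed by model.predict above
result = [[9.9990845e-01, 1.4144711e-08, 8.4316625e-08, 3.7920216e-11, 2.4454723e-06,
           4.7663391e-08, 8.7873021e-05, 4.1903621e-07, 3.8488349e-08, 6.0560058e-07]]

probs = result[0]  # the batch contains a single image
# Index of the largest probability is the recognized digit
digit = max(range(len(probs)), key=lambda i: probs[i])
confidence = probs[digit]
print(digit, confidence)
```

With the real NumPy array returned by model.predict, result[0].argmax() gives the same index.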
Let’s see how the digit 1 is recognized:

[[5.2682775e-08 9.9998152e-01 9.0230742e-07 1.2926430e-09 9.7749239e-07
6.3665328e-07 5.2730784e-06 1.0716837e-05 2.0985880e-08 1.3917042e-11]]
As you can see, the digit was recognized correctly here as well.
That’s all for now; next time we will integrate the neural network into the project with “beautiful” code.
As a reminder, the examples can be downloaded here: megabax/CVContainer: It is my pet computer vision project. (github.com)