In a previous article, I talked about how I collect data for detecting text in images.
Now, using the collected data set, let’s train YOLOv5, one of the most popular networks for segmentation and object detection.
To do this, we will use the free Google Colab.
Connecting a drive from a Google account
from google.colab import drive
drive.mount('/content/drive')  # Drive contents appear under /content/drive/MyDrive
The root folder of the drive should contain the archives with the training data (images and labels), which can be found on GitHub.
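Before going further, it can help to confirm that the files used later in this article are actually in the Drive root. The following is a minimal sketch, not part of Colab or YOLOv5: `missing_files` is a hypothetical helper, the file names are the ones that appear in the commands in this article, and the mount point is assumed to be the Colab default.

```python
from pathlib import Path


def missing_files(root, names):
    """Return the names from `names` that do not exist as files under `root`."""
    root = Path(root)
    return [name for name in names if not (root / name).is_file()]


# Assumption: Drive is mounted at the default Colab location.
DRIVE_ROOT = "/content/drive/MyDrive"
# File names taken from the unpacking and training commands in this article.
REQUIRED = [
    "sd_text_detection_train.zip",
    "labels_yolo.zip",
    "text_segment.yaml",
    "hyp.scratch-low.yaml",
]

missing = missing_files(DRIVE_ROOT, REQUIRED)
print("Missing from Drive:", missing if missing else "nothing")
```

If the list is not empty, upload the missing files before starting the training run, since a failure halfway through a long Colab session is expensive.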
Cloning YOLOv5 from GitHub and installing its requirements
!git clone https://github.com/ultralytics/yolov5
!pip install -r /content/yolov5/requirements.txt
I chose yolov5n-seg, the lightest and fastest model, and edited its configuration file, yolov5n-seg.yaml, to shrink the model further and speed up inference.
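The exact edits to the configuration file are not shown here. As an illustration only, one common way to shrink a YOLOv5 model is to lower the `width_multiple` value in its YAML config, which scales the number of channels in each layer. The sketch below assumes that approach; `scale_multiple` is a hypothetical helper, and the sample text is a stand-in for the first lines of a model config, not the author's actual file.

```python
import re


def scale_multiple(yaml_text, key, factor):
    """Multiply the numeric value of a top-level `key: value` line by `factor`."""
    pattern = re.compile(rf"^({re.escape(key)}:\s*)([0-9.]+)", flags=re.MULTILINE)

    def _replace(match):
        # Keep the original "key: " prefix and rewrite only the number.
        return f"{match.group(1)}{float(match.group(2)) * factor:g}"

    new_text, count = pattern.subn(_replace, yaml_text)
    if count == 0:
        raise KeyError(f"{key} not found in config")
    return new_text


# Stand-in for the top of a YOLOv5 model config (illustrative values).
sample = "nc: 80  # number of classes\ndepth_multiple: 0.33\nwidth_multiple: 0.25\n"
print(scale_multiple(sample, "width_multiple", 0.5))
```

In Colab, the text would be read from the config file in the cloned repository, the modified copy saved under a new name, and that copy passed to the training script with `--cfg`. Whether halving the width is an acceptable trade-off depends on the data, so it is worth comparing validation metrics against the unmodified model.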
Unpacking the images and label files to /content/yolov5/datasets/text_detection
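from zipfile import ZipFile
with ZipFile('/content/drive/MyDrive/sd_text_detection_train.zip', 'r') as zipObj:
    zipObj.extractall('/content/yolov5/datasets/text_detection')  # images
with ZipFile('/content/drive/MyDrive/labels_yolo.zip', 'r') as zipObj:
    zipObj.extractall('/content/yolov5/datasets/text_detection')  # labels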
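After unpacking, it is worth checking that every image has a label file. YOLOv5 pairs each image with a `.txt` file of the same name, and an image without one is treated as containing no objects, which can quietly hurt training. A minimal sketch of such a check follows; `images_without_labels` is a hypothetical helper, and the folder names in the usage comment are assumptions about how the archives are laid out, not something taken from them.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def images_without_labels(images_dir, labels_dir):
    """Return sorted image file names that have no same-named .txt label file."""
    labels = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(
        p.name
        for p in Path(images_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES and p.stem not in labels
    )


# Example call (the sub-folder names below are assumed, adjust to the real layout):
# root = "/content/yolov5/datasets/text_detection"
# print(images_without_labels(f"{root}/images/train", f"{root}/labels/train"))
```

An empty list means every image found has a matching label file.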
Everything is ready to train the network, so let’s get started.
!python "/content/yolov5/segment/train.py" --img 800 --batch 12 --epochs 1200 --data "/content/drive/MyDrive/text_segment.yaml" \
--hyp "/content/drive/MyDrive/hyp.scratch-low.yaml" --project '/content/drive/MyDrive/text_detect/' --name 'weights' \
On my home computer running Windows 10, one training epoch takes about 6 minutes on a 12 GB RTX 2060 graphics card.
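To put that figure in perspective, here is a rough estimate of the full run with `--epochs 1200`. It assumes that every epoch takes the same 6 minutes and that all 1200 epochs actually run; early stopping can end training sooner, so treat the result as an upper bound rather than a measurement.

```python
def total_training_hours(epochs, minutes_per_epoch):
    """Estimate wall-clock training time in hours, assuming a constant epoch time."""
    return epochs * minutes_per_epoch / 60


hours = total_training_hours(epochs=1200, minutes_per_epoch=6)
print(f"{hours:.0f} hours, or about {hours / 24:.0f} days")  # 120 hours, or about 5 days
```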
The free version of Google Colab has two drawbacks: no multithreading and no SSD drive.
After a very long time in Google Colab, we can test our model
!python "/content/yolov5/segment/predict.py" --imgsz 800 --iou-thres 0.25 --conf-thres 0.5 --hide-labels --hide-conf --line-thickness 2 --device "cpu" --weights "/content/drive/MyDrive/text_detect/weights/weights/last.pt" --source "/content/yolov5/test"
In my opinion, the result is simply excellent, considering that this is the lightest model and that it has also been cut in half.
After conversion to TensorFlow Lite, the model weighs approximately 5 MB, and inference on a Redmi Note 10 Android phone using the GPU takes 300 ms.