PicTrace and Artificial Intelligence in Action

Comparison and processing of OpenCV3 photos in the PicTrace application.

In today's world, where the amount of visual information grows every day, the ability to find similar images quickly and accurately is becoming increasingly important. Imagine being able to upload an image and, in a matter of seconds, get a list of the most similar images from a large database. Sounds interesting? That's exactly what I'm trying to achieve with my web app, PicTrace.

What is PicTrace, and how does this platform help solve such problems? How does it use the power of OpenCV and TensorFlow for image processing? Why does the combination of structural comparison and keypoints make my approach interesting?

First and foremost, PicTrace is an image search and comparison platform powered by computer vision and deep learning. The initial goal is to quickly and accurately identify similar images in a large database. At the time of writing, I am using S3-compatible object storage, which provides secure and scalable storage of images. Asynchronous operations improve performance and reduce latency when executing queries, making the platform faster and more convenient to work with.
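
To make the latency point concrete, here is a minimal sketch that compares sequential and concurrent "downloads". It does not touch real storage: `fetch_image` is a hypothetical stand-in in which `asyncio.sleep` plays the role of a network round trip, and the file names are invented for illustration.

```python
import asyncio
import time


async def fetch_image(name, delay=0.05):
    # Hypothetical stand-in for one storage download; sleep simulates network latency.
    await asyncio.sleep(delay)
    return name


async def fetch_sequential(names):
    # Each download waits for the previous one to finish.
    return [await fetch_image(n) for n in names]


async def fetch_concurrent(names):
    # All downloads are in flight at the same time.
    return await asyncio.gather(*(fetch_image(n) for n in names))


def timed(coro):
    # Run a coroutine to completion and measure the wall-clock time it took.
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


names = [f"img_{i}.jpg" for i in range(5)]
seq_result, seq_time = timed(fetch_sequential(names))
con_result, con_time = timed(fetch_concurrent(names))
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

With I/O-bound work like this, the concurrent version should take roughly as long as one download rather than the sum of all of them.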

Finding duplicate or similar images is becoming an increasingly important task in many areas, from content management and copyright protection to marketing research and entertainment applications.

I was interested in this task mainly because of its relevance in the work environment. Working in a large marketplace, I asked myself: “How will people find similar products?” At that time, I was just starting to fully immerse myself in Python programming, and after some time, I was already training a neural network model on product data.

Example of work.

The most important technologies underlying my application are:

  • OpenCV: A powerful computer vision library used for image processing, including loading, resizing, and comparing images. It is an essential component for image-related tasks, and I couldn't do without it.

Below is a real-life example of computer vision, in which I used a webcam to analyze objects in real time, followed by some sample code.

Analysis of objects in space in real time.

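    import cv2
    import numpy as np
    from PIL import Image, ImageTk
    from tkinter import messagebox


    # Method of the application's window class (the class definition is not shown here).
    def update_frame(self):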
        while not self.stop_event.is_set():
            ret, frame = self.cap.read()
            if not ret:
                messagebox.showerror("Error", "Cannot read frame.")
                self.window.destroy()
                return

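            # Convert the frame to a 416x416 blob scaled to [0, 1], swapping BGR to RGB.
            blob = cv2.dnn.blobFromImage(frame, 1/255.0, (416, 416), swapRB=True, crop=False)
            self.net.setInput(blob)
            layer_names = self.net.getLayerNames()
            # flatten() handles both the 1-D and the Nx1 index arrays returned by different OpenCV versions.
            out_ids = np.array(self.net.getUnconnectedOutLayers()).flatten()
            output_layers = [layer_names[i - 1] for i in out_ids]
            detections = self.net.forward(output_layers)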

            h, w = frame.shape[:2]
            boxes, confidences, class_ids = [], [], []
            for output in detections:
                for detection in output:
                    scores = detection[5:]
                    class_id = np.argmax(scores)
                    confidence = scores[class_id]
                    if confidence > 0.5:
                        box = detection[0:4] * np.array([w, h, w, h])
                        (centerX, centerY, width, height) = box.astype("int")
                        x = int(centerX - (width / 2))
                        y = int(centerY - (height / 2))
                        boxes.append([x, y, int(width), int(height)])
                        confidences.append(float(confidence))
                        class_ids.append(class_id)

            # Non-maximum suppression: score threshold 0.5, overlap threshold 0.4.
            indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

            # Draw the boxes that survived suppression.
            if len(indices) > 0:
                for i in np.array(indices).flatten():
                    # Separate names so the frame size (w, h) is not overwritten.
                    x, y, box_w, box_h = boxes[i]
                    color = (0, 255, 0)
                    label = f"{self.classNames[class_ids[i]]}: {confidences[i]:.2f}"
                    cv2.rectangle(frame, (x, y), (x + box_w, y + box_h), color, 2)
                    cv2.putText(frame, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

            if self.is_recording and self.output:
                self.output.write(frame)

            cv2image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            img = Image.fromarray(cv2image)
            imgtk = ImageTk.PhotoImage(image=img)
            self.display.imgtk = imgtk
            self.display.configure(image=imgtk)

  • TensorFlow and ResNet50: A deep learning framework and the ResNet50 model, which are used to extract image features.

    Visualization of residual connections critical to the ResNet architecture.

  • FastAPI: A high-performance web framework for creating web applications and handling HTTP requests.

  • aiohttp: A library for asynchronous HTTP requests, providing fast and efficient data processing.
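
The residual connections mentioned for ResNet50 above can be illustrated with a few lines of NumPy. This is only a toy sketch: real ResNet blocks use convolutions and batch normalization, while here `residual_block` (a hypothetical name) uses two small dense layers with made-up weights to show the key idea, namely that the block's input is added back to its output.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """Return relu(x + F(x)), where F is two small dense layers."""
    hidden = relu(x @ w1)
    fx = hidden @ w2
    # The "+ x" term is the skip (residual) connection.
    return relu(x + fx)


rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))
w1 = rng.normal(size=(4, 4)) * 0.1
w2 = rng.normal(size=(4, 4)) * 0.1

y = residual_block(x, w1, w2)
print(y.shape)  # (1, 4)

# With zero weights F(x) is 0, so the block simply passes relu(x) through.
zeros = np.zeros((4, 4))
print(np.allclose(residual_block(x, zeros, zeros), relu(x)))  # True
```

Because the input always has a direct path to the output, a block that has learned nothing still passes its input along, which is what makes very deep networks such as ResNet50 trainable.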

The process of PicTrace is quite simple:

  1. Loading the image: The user uploads an image by selecting a file or by dragging and dropping it from their desktop.

  2. Extracting features: The system analyzes the uploaded image and extracts its distinctive features using the ResNet50 model.

  3. Image comparison: Based on structural comparison (SSIM) and keypoints (ORB), PicTrace finds and compares images from the database.

  4. Results: The system returns a list of the most similar images.
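
To give a feel for the structural comparison in step 3, here is a simplified sketch of SSIM in NumPy. It is an assumption-laden illustration, not the PicTrace implementation: `global_ssim` is a hypothetical helper that computes the SSIM formula once over the whole image, whereas library implementations such as `skimage.metrics.structural_similarity` average it over local windows. The ORB keypoint part is not covered here.

```python
import numpy as np


def global_ssim(img_a, img_b, data_range=255.0):
    """Single-window SSIM over two same-shaped grayscale images."""
    a = np.asarray(img_a, dtype=np.float64)
    b = np.asarray(img_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")

    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov_ab = ((a - mu_a) * (b - mu_b)).mean()

    numerator = (2 * mu_a * mu_b + c1) * (2 * cov_ab + c2)
    denominator = (mu_a**2 + mu_b**2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


rng = np.random.default_rng(42)
image = rng.integers(0, 256, size=(32, 32)).astype(np.uint8)
inverted = 255 - image  # same structure, opposite brightness

print(global_ssim(image, image))     # identical images score 1.0 (up to rounding)
print(global_ssim(image, inverted))  # noticeably lower
```

A score close to 1 means the two images share brightness, contrast, and structure; the score drops as they diverge.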

Example code for finding similar images with comments:

    import asyncio

    import aiohttp
    import cv2

    # load_db, extract_features, and compare_images are defined elsewhere in the project.

    # Load the database records that describe the stored images.
    db_data = load_db()


    async def find_similar_images(file_path):
        # Read the target image from the given path.
        target_image = cv2.imread(file_path)

        # Extract features from the target image using the pre-trained model.
        target_features = extract_features(target_image)

        # Open an asynchronous aiohttp session for the HTTP requests.
        async with aiohttp.ClientSession() as session:
            # Create one compare_images task per database entry that has an image URL.
            tasks = [
                compare_images(session, entry, target_features)
                for entry in db_data
                if "url" in entry
            ]

            # Wait for all tasks to finish and collect the results.
            results = await asyncio.gather(*tasks)

        # Keep only the results whose similarity score is above 0.
        valid_results = filter(lambda x: x[0] > 0, results)

        # Sort by similarity score in descending order and take the top 5.
        sorted_results = sorted(valid_results, key=lambda x: x[0], reverse=True)[:5]

        # Collect the URLs of the similar images.
        similar_images = [result[1] for result in sorted_results if result[1]]

        # Return the list of similar image URLs.
        return similar_images
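
The gather, filter, and sort pattern above can be tried without a database or network. In the sketch below, `fake_compare` is a hypothetical stand-in for `compare_images` that simply returns a pre-set score, and the entries are invented for illustration.

```python
import asyncio


async def fake_compare(entry):
    # Hypothetical stand-in for compare_images: returns (score, url).
    await asyncio.sleep(0)
    return entry["score"], entry.get("url")


async def top_matches(entries, limit=5):
    # Compare only the entries that have a URL, all concurrently.
    results = await asyncio.gather(
        *(fake_compare(e) for e in entries if "url" in e)
    )
    # Drop zero scores, then rank from most to least similar.
    ranked = sorted(
        (r for r in results if r[0] > 0), key=lambda r: r[0], reverse=True
    )
    return [url for _, url in ranked[:limit] if url]


entries = [
    {"url": "a.jpg", "score": 0.9},
    {"url": "b.jpg", "score": 0.0},  # filtered out: score is not above 0
    {"url": "c.jpg", "score": 0.4},
    {"score": 0.8},                  # skipped: no URL
]
print(asyncio.run(top_matches(entries)))  # ['a.jpg', 'c.jpg']
```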

In developing PicTrace, I aim to create a tool that effectively solves the problems of searching for and comparing images. Using modern technologies such as OpenCV and TensorFlow has allowed me to achieve high accuracy and processing speed. I hope that over time my product will become part of a real workflow, automating routine tasks and helping to solve complex problems.

You can view and support my project on GitHub.

P.S. I would like to express special gratitude to the OpenCV and TensorFlow developer communities for their contribution to the development of computer vision. I would also like to thank you for your interest in my work and for the time you spent reading this article.
