OpenAI has introduced GPT-4o mini and we have already implemented it

Today brought more hot news from the world of AI: OpenAI presented GPT-4o mini, a new affordable and highly capable "small" language model that is much smarter and cheaper than GPT-3.5 Turbo, and just as fast. Without thinking twice, we implemented the new model and tested it on our tasks. The results are below.

Key aspects of the new “mini” model

Intelligence: GPT-4o mini outperforms GPT-3.5 Turbo in text intelligence (82% MMLU score vs. 69.8%) and multimodal reasoning.
Price: GPT-4o mini is more than 60% cheaper than GPT-3.5 Turbo.
Modalities: GPT-4o mini currently supports text and vision, and OpenAI plans to add support for audio and video input and output in the future.
Languages: GPT-4o mini has improved multilingual understanding compared to GPT-3.5 Turbo across a wide range of languages.

Due to its low cost and low latency, GPT-4o mini is well suited for data-intensive tasks (e.g. feeding a model a complete codebase or conversation history), cost-sensitive tasks (e.g. summarizing large documents), and tasks that require fast responses (e.g. customer support chatbots). Like GPT-4o, GPT-4o mini has a context window of 128k tokens, supports up to 16k output tokens per query, and has a knowledge cutoff of October 2023.
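For reference, this is roughly how a request to the new model looks through the standard OpenAI Python SDK. It is only a minimal sketch; the prompt and parameter values are placeholders rather than our production setup.

```python
# Minimal sketch of calling GPT-4o mini via the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the key points of this document: ..."},  # placeholder prompt
    ],
    max_tokens=1024,   # the model supports up to 16k output tokens per request
    temperature=0.2,
)

print(response.choices[0].message.content)
```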

Implementation and testing

Working with complex text documents

We provide the model with Rostelecom's press release for investors, which presents key financial indicators and other insights. In particular, we are interested in the dynamics of the number of corporate VPN users.

Example of working with complex .pdf documents

The model clearly interpreted the document correctly: it extracted the necessary data from the table and analyzed it.

A fragment of the document with the correct answer
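Under the hood, a document like this can be handed to the model as extracted text. Below is a hypothetical sketch of that flow, assuming pypdf for text extraction and the OpenAI Python SDK; the file name and prompt are placeholders, not the exact code of our assistant.

```python
# Hypothetical sketch: extract text from the press release PDF and ask
# GPT-4o mini about the corporate VPN user dynamics.
from openai import OpenAI
from pypdf import PdfReader

reader = PdfReader("rostelecom_press_release.pdf")  # placeholder file name
document_text = "\n".join(page.extract_text() or "" for page in reader.pages)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer strictly based on the provided document."},
        {"role": "user", "content": f"{document_text}\n\nHow did the number of corporate VPN users change?"},
    ],
)
print(response.choices[0].message.content)
```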

Multimodality: Vision

To demonstrate the computer vision and image understanding capabilities, I decided to use a very everyday example: asking the assistant to name the berries I saw on the street.

An example of using the vision capabilities of the AI Assistant in everyday life

In this case, the model identified the plant correctly; moreover, it noted that the berries are edible and even described their taste. Well, let's trust the AI and give them a try…
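For those curious how such an image question can be sent to GPT-4o mini, here is a minimal sketch using the OpenAI Python SDK's image_url content part; the photo path and prompt are placeholders.

```python
# Sketch of an image question to GPT-4o mini via a base64 data URL.
import base64
from openai import OpenAI

client = OpenAI()

with open("berries.jpg", "rb") as f:  # placeholder photo
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What berries are these, and are they edible?"},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```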

Multimodality: Audio

To test the audio modality (available only in our assistant), I asked the model by voice to translate "artificial intelligence" into German, and this is what happened.

Audio modality as exemplified by machine translation

The model not only recognized the speech, understood what was said, and generated a response, but also replied with voice output, which is especially convenient when working with the assistant while driving or during similar activities.
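A voice round trip like this can be assembled from three calls: speech-to-text, a GPT-4o mini reply, and text-to-speech. The sketch below is a hypothetical pipeline, not the exact implementation of our assistant; the file names and the voice are placeholders.

```python
# Hypothetical voice pipeline: transcribe a voice message with Whisper,
# answer with GPT-4o mini, and synthesize the reply with a TTS model.
from openai import OpenAI

client = OpenAI()

# 1. Speech-to-text
with open("question.ogg", "rb") as f:  # placeholder voice message
    transcript = client.audio.transcriptions.create(model="whisper-1", file=f)

# 2. Text reply from GPT-4o mini
reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": transcript.text}],
)
answer = reply.choices[0].message.content

# 3. Text-to-speech for the voice answer
speech = client.audio.speech.create(model="tts-1", voice="alloy", input=answer)
speech.write_to_file("answer.mp3")  # placeholder output file
```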

Results

The launch of GPT-4o mini demonstrates OpenAI's broader vision: not only to improve its technologies, but also to make them accessible to all users. It is another step toward a future where human interaction with AI is simple, natural, and ubiquitous. We look forward to more exciting discoveries and applications of these technologies in everyday life.

You can try the model here (use the /changeModel command to switch models).

You will find more news in my Telegram channel.
