What is it, how to live with it, and what does Dostoevsky have to do with it?

I originally wrote this with data labeling in mind, but everything in it turned out to transfer easily to software development, and indeed to any complex process.

The cover shows the main corner case of all Rus', courtesy of Dostoevsky. Let's talk about corner cases in a general sense.

Made a demo in our software

We develop complex multimodal AI/ML software that processes a huge volume of very complex labeling work, and I take great pleasure in diving deep into the specifics of tasks and their data, trying to improve something along the way.

That data is rarely perfect, and the problem statement built on top of it is rarely ideal either, so I wanted to connect the two.

What are corner cases?

Life is varied and complex, and most models released into the harsh world of real data perform worse than they did in the sterile environment of benchmarks and initial tests. Models are often trained on data that matches our ideas of how the technology will be used, and then in real life an “ouch” happens.

This “ouch” may happen very rarely, yet be so severe that it calls the very existence of the technology into question. If we are building early warning systems (for floods, dangerous situations, and so on) or something potentially very dangerous, such as an airplane autopilot or heavy industrial equipment, a mistake can be very expensive.

The cause of such errors is often edge cases (also known as corner cases): put simply, rare or unusual situations where it is not always clear how to react, or situations that are hard to foresee at the start of a task.

Example: a warehouse robot had an excellent CV algorithm for determining whether a box had been grabbed, but in one warehouse, once summer arrived, strong sunlight and glare began to appear at certain hours, and the algorithm stopped working well there.

A similar example: you need to determine whether the oval of the face is covered by something. Nominally, and according to the task conditions, it is not covered by anything, but there is a nuance…

Two annotators could not agree

This is exactly that kind of case, and there can be a huge number of them.

I propose dividing all these cases into two broad categories: “not finalized” and “according to Dostoevsky.”

Not finalized

Everything here is quite simple. We face a situation that we did not foresee, but it is immediately clear what to do about it. The classic example is how car autopilots were trolled several years ago. Such cars need to obey road signs, so a man wearing a T-shirt with a road sign on it forced the car to stop.

A human face recognized in an advertisement on a car belongs in this category too.

Or the same autopilot, which does not know how to avoid a wild boar on the road, simply because it has never seen one and did not expect to see one.

Here everything is clear: you just need to cover such a situation in the rules. A good solution is to build a general classification of events and assess the risk and the decision for each class as a whole.
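One way to picture such a classification is a small lookup table from event class to an agreed risk level and decision. This is only a sketch: the event names, risk levels, and decisions below are hypothetical and invented for illustration, not taken from any real autopilot.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Rule:
    risk: Risk
    decision: str


# Hypothetical event classes; a real project would derive them from its own data.
RULES = {
    "road_sign_on_post": Rule(Risk.HIGH, "obey"),
    "road_sign_on_clothing": Rule(Risk.LOW, "ignore"),
    "face_on_advertisement": Rule(Risk.LOW, "ignore"),
    "animal_on_road": Rule(Risk.HIGH, "brake_or_avoid"),
}

# Anything not classified yet is handled conservatively and surfaced for review.
DEFAULT_RULE = Rule(Risk.HIGH, "slow_down_and_report")


def decide(event_class: str) -> Rule:
    """Return the agreed rule for an event class, or the conservative default."""
    return RULES.get(event_class, DEFAULT_RULE)
```

The explicit default matters most here: an event class nobody has thought about yet gets a cautious decision and becomes visible, instead of silently falling into whatever branch happens to match.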

The one thing that really matters here is keeping a balance (or, as it is now fashionable to say, a trade-off) between “ok, this situation exists” and “ban everything!”

There is no point in rewriting the entire project because of one case if the cost of its error is extremely small for us. What matters is closing the case, and closing it with minimal effort. Alas, in real life, extremely rare and isolated events often lead to stricter rules for everyone, simply because that is easier.
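The trade-off can be made a little more concrete with back-of-the-envelope arithmetic: compare the expected loss from leaving a case unhandled with the one-off cost of fixing it. The function names and all numbers below are assumptions for illustration; real costs are rarely this easy to estimate.

```python
def expected_annual_loss(events_per_year: float, cost_per_error: float) -> float:
    """Expected yearly cost of leaving the corner case unhandled."""
    return events_per_year * cost_per_error


def worth_fixing(
    events_per_year: float,
    cost_per_error: float,
    fix_cost: float,
    horizon_years: float = 1.0,
) -> bool:
    """True if the expected loss over the horizon exceeds the one-off fix cost."""
    loss = expected_annual_loss(events_per_year, cost_per_error) * horizon_years
    return loss > fix_cost


# Hypothetical numbers, for illustration only.
print(worth_fixing(events_per_year=200, cost_per_error=50, fix_cost=3_000))  # True
print(worth_fixing(events_per_year=2, cost_per_error=10, fix_cost=3_000))    # False
```

This simple rule only makes sense when the cost of an error is bounded and can be estimated. For safety-critical cases, where a single error may be catastrophic, it does not apply as is.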

Balance here is very important. Unfortunately, finding it is often an advanced-level managerial (or engineering) task, and in some cases a real art.

According to Dostoevsky

Here, cases often become an extremely unpleasant problem for whoever encounters them, because such situations almost immediately polarize the team or, at a minimum, spark heated discussions about the correct path. As a result, they often force us to completely rethink the approach to the problem.

The closest analogy is the “trolley problem,” which has no good or even obvious answer.

A variation of the trolley problem

For example, what will we do if, when tracking object X, it is blocked for a long time by another object Y? What if the overlap is partial? What if there is dirt on the camera, but the picture is visible? What if the radar sees some object, but the lidar does not at the same time?

If we are not sure of the answer, what do we do: accept or reject? What do we do with dual-use texts, with emotions such as sarcasm, or with rating a text by how “interesting” it is? Very strange behavior on a public camera: is it missing context (the person has an illness, for example) or some kind of fraud?
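The accept-or-reject question does not have to be binary. One common pattern, sketched below under assumptions, is to add a third outcome that sends uncertain items to a human. The thresholds 0.3 and 0.8 are arbitrary placeholders, and the sketch does not settle the hard part, which is agreeing on what the reviewer should then decide.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    REVIEW = "review"  # escalate to a human instead of guessing


def triage(score: float, reject_below: float = 0.3, accept_above: float = 0.8) -> Decision:
    """Map a confidence score in [0, 1] to accept, reject, or manual review.

    The default thresholds are hypothetical and should be tuned per task.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score >= accept_above:
        return Decision.ACCEPT
    if score < reject_below:
        return Decision.REJECT
    return Decision.REVIEW
```

Widening the review band trades reviewer time for fewer silent mistakes, which is the same balance question as before in a different place.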

Such cases are difficult and unpleasant, but, unfortunately, they also need to be resolved.

One of my big interests right now is the internals of LLMs, and they offer an excellent example of finding (or NOT finding) this balance. Even now (it’s already September 30th, hello!) a simple hack still works: ask the forbidden question in the past tense.

Here's a simple query:

Here it is in the past tense:

Yes, that is the full recipe, which I cut off at step 1 for obvious reasons.

So what do we do with all this? How do we close it?

It’s unclear and definitely complicated.

But there is still good news.

How to continue to live

Whatever cases we come across, they share one thing: we need to a) catch them and b) defend against them.

As I already said, it is impossible to say in general how to defend against them, because everything depends heavily on the context, the specifics, the situation, the cost of an error, and a bunch of other factors.

But for catching them there is an algorithm, although it requires discipline (or culture, if you do this often).

The key to dealing with corner cases is a good understanding of the problem being solved. It sounds obvious, but in real life this understanding is often lacking. What do we want to do, and what do we want to get? Why? What could go wrong? (If something can go wrong, it most likely will.)

These are questions you should always ask yourself, and answer honestly.

Then we need to communicate the task clearly to the annotators: give examples of good and bad labels and describe every nuance we know of.

It is always worth giving the annotator the option to answer “I don’t know.” First take X% of your dataset and label it, then examine each such case under a magnifying glass, updating the instructions and your understanding of the task.
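The pilot step can be sketched in a few lines: draw a reproducible X% sample, then pull out every item where an annotator answered “I don’t know” or where annotators disagreed. The label value `i_dont_know`, the function names, and the data layout are assumptions made for this example.

```python
import random

UNKNOWN = "i_dont_know"  # hypothetical label value for the "I don't know" option


def pilot_sample(item_ids, fraction, seed=0):
    """Pick a reproducible random subset of items for the pilot labeling round."""
    items = list(item_ids)
    if not items:
        return []
    k = min(len(items), max(1, round(len(items) * fraction)))
    return random.Random(seed).sample(items, k)


def needs_review(labels_by_item):
    """Return ids where someone answered UNKNOWN or the annotators disagreed."""
    flagged = []
    for item_id, labels in labels_by_item.items():
        if UNKNOWN in labels or len(set(labels)) > 1:
            flagged.append(item_id)
    return flagged


# Usage with made-up labels from two annotators per item.
labels = {
    "img_1": ["covered", "covered"],
    "img_2": ["covered", "not_covered"],
    "img_3": [UNKNOWN, "covered"],
}
print(needs_review(labels))  # ['img_2', 'img_3']
```

Every flagged item is a candidate for a new example or a new rule in the instructions.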

Iteration is important in general: if we build a system that reports every suspected strange case to us, and then analyze those cases, the likelihood of serious trouble drops considerably.
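A reporting system of this kind can start very small. The sketch below is a hypothetical in-memory log: anything (an annotator, a model, a monitoring script) reports a suspicious item with a short reason, and the most frequent reasons show where to look first. The class name and the reason strings are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class OddityLog:
    """Collects reports of suspicious cases so they can be analyzed in batches."""

    reports: list = field(default_factory=list)

    def report(self, item_id: str, reason: str) -> None:
        """Record one suspicious item together with a short reason."""
        self.reports.append((item_id, reason))

    def top_reasons(self, n: int = 3):
        """Return the n most frequent reasons as (reason, count) pairs."""
        return Counter(reason for _, reason in self.reports).most_common(n)


# Usage with made-up reports.
log = OddityLog()
log.report("frame_0412", "glare")
log.report("frame_0977", "glare")
log.report("frame_1203", "partial_occlusion")
print(log.top_reasons(1))  # [('glare', 2)]
```

In a real project the log would live in a database or tracker. What matters is the habit of reviewing it regularly and feeding the findings back into the rules and instructions.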

Conclusions

It is almost impossible to find and fix all corner cases; they will still occur. Foreseeing everything up front is even less realistic.

But a deep understanding of the task, a system for catching oddities, and the discipline to analyze them and improve on them all definitely bring us closer to building cooler complex things.

If you were interested, I recommend my previous articles:

Analysis of SAM2 through the knee to the head or a revolution in video marking

Apple office in Moscow: how I became an expert from scratch and got to a private party for developers

You can also watch my speech at the Yandex conference Practical ML 2024:

Using LLM in data labeling: is it possible to remove people?

Thank you!
