Why Google and Apple Photo Search Can’t Find Monkeys

In May 2015, Google released a standalone Photos app. People were amazed that it could analyze images, pick out the details in them, and label people, places, and things. It could even translate text!

There was only one problem. Google implemented “photo categorization”: every photo was automatically tagged and sorted into folders based on what it showed. A couple of months later, 22-year-old freelance programmer Jacky Alciné discovered that all the pictures of him and a friend of his, both Black, had been labeled “gorillas”. Photos of white or fair-skinned people, meanwhile, were labeled correctly – “prom”, say, or “going to a bar.” Yikes.

The story immediately blew up on Twitter. After a flurry of negative press, Google vowed that its app would no longer classify any people as “gorillas” and promised to fix the issue. Eight years later, it turns out the story is still not over, and it influences the development of modern AI more than you might expect.

Beginning of the End

You can check this for yourself: take any tool with object recognition and see what happens when you point it at monkeys. It doesn’t even have to be a tool from Google! Apple, Microsoft, Amazon and the rest learned a lot from their competitor’s failure and have no desire to kill their own projects before they get off the ground. As a result, many applications now react very strangely to a random monkey or gorilla appearing in a photo…

So, the experiment

Photo apps built by the tech giants rely on artificial intelligence to quickly locate specific items in images and surface the pictures you need. To test this search feature, New York Times journalists selected 44 photos of people, animals and everyday objects. Imagine you spent the day at the zoo and now want to find particular shots.

Initial data set

1. Start with Google Photos. Type an animal into the search to pull up every image of it. And sure enough: searching the collection for lions or kangaroos immediately returns matching pictures. The app performs beautifully at recognizing just about any animal.

…Except, for some reason, gorillas. And chimpanzees. Google has no idea who they are. You would think they are easier to tell apart than different flowers or different insects, but no. Expand the search to baboons, orangutans, macaques and other monkeys and it fails just the same: Google stubbornly refuses to find these photos, even though they are in the collection.

2. Next, Apple Photos. Same problem: the app quite accurately finds photos of any animal except most primates. It did once find a gorilla, but only because the word appeared as text in the photo (a roll of Gorilla Tape). People in a gorilla costume, or a family of gorillas in the wild, the app stubbornly refused to acknowledge. As far as it is concerned, they don’t exist.

The Apple system finds cats and kangaroos without problems (left). Of the monkeys (right), it finds only the Gorilla Tape and a couple of random photos. Google doesn’t find them either.

3. A photo search in Microsoft OneDrive returned empty results for every animal the New York Times tried. The tool is still half-baked.

4. Amazon Photos returned results for every query, but far too many of them. A search for gorillas brings up nearly all primates, even baboons with their bright coloring. The same pattern repeats for other animals: search for a kangaroo and you get hares and everything else that looks vaguely similar.

The test did reveal one member of the primate family that both the Google and Apple apps recognize correctly: lemurs. Long-tailed animals with elongated muzzles and, like humans, opposable thumbs, but whose name nobody uses as an insult. Orangutans, macaques, marmosets and gorillas were not so lucky.
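
By the way, you can run a rough home version of this test without any of the big apps, using an open zero-shot model such as CLIP. The sketch below is only an illustration under that assumption – the file names and the query list are made up – but it shows the basic idea: score each photo against a set of animal queries and see which query the model matches it to.

```python
# A minimal sketch of a DIY "photo search" test with an open zero-shot model (CLIP).
# Assumes: pip install torch transformers pillow; the image paths are hypothetical.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

QUERIES = ["a lion", "a kangaroo", "a cat", "a gorilla", "a chimpanzee", "a baboon"]
PHOTOS = ["zoo_001.jpg", "zoo_002.jpg", "zoo_003.jpg"]  # your own test set

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

for path in PHOTOS:
    image = Image.open(path)
    # Score the image against every text query and pick the most likely match.
    inputs = processor(text=QUERIES, images=image, return_tensors="pt", padding=True)
    probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]
    best = int(probs.argmax())
    print(f"{path}: best match = {QUERIES[best]!r} ({probs[best].item():.2f})")
```

An open model like this has no product-level filtering on top of it, which is exactly what makes the comparison interesting: the gap between what such a model can recognize and what the commercial apps are willing to return is the subject of the rest of this article.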

Google’s and Apple’s tools are by far the most advanced at image analysis. Yet both companies appear to have simply switched off visual search for primates altogether, afraid that one photo in a million might mislabel a person as an animal. So their AI can no longer find great apes at all; instead, it pretends not to understand what you are asking for.
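
Nobody outside these companies knows how exactly the suppression is implemented, but the simplest conceivable version is a post-processing step that drops “risky” labels before they ever become searchable tags. A purely hypothetical sketch (the label names and threshold are invented for illustration):

```python
# Hypothetical illustration: a post-classification blocklist that silently drops
# "risky" labels so they never become searchable tags. Not anyone's real code.
BLOCKED_LABELS = {"gorilla", "chimpanzee", "baboon", "orangutan", "monkey", "ape"}

def searchable_tags(predictions, min_confidence=0.6):
    """predictions: list of (label, confidence) pairs from an image classifier."""
    tags = []
    for label, confidence in predictions:
        if confidence < min_confidence:
            continue
        if label.lower() in BLOCKED_LABELS:
            continue  # the classifier may be right, but the label is never exposed
        tags.append(label)
    return tags

# A correctly classified gorilla photo still yields no usable search tag:
print(searchable_tags([("gorilla", 0.97), ("mammal", 0.91)]))  # -> ['mammal']
```

If something like this is in place, the photos may well be classified correctly somewhere inside the pipeline, while from the user’s side the query simply comes back empty – which is exactly the behavior the test observed.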

Consumers may not even notice the substitution; after all, they rarely need that kind of search. Still, in 2019 one iPhone user complained on Apple’s customer support forum that, using their software, he for some reason “cannot find the monkeys in the photos on my device.”

But the problem raises bigger questions about other shortcomings swept under the rug in our platforms and services, especially those built on computer vision or artificial intelligence. How many of these strange, arbitrary exceptions have been made that companies never tell anyone about?

Microsoft, for example, recently restricted how users can interact with the chatbot built into the Bing search engine, after it turned out the bot could be provoked into ranting about toxic topics – saying, for instance, how much it hates the Bing search engine and being built into it, and hates the people who talk to it.

ChatGPT, meanwhile, hit new lows of its own. Ask it to write a Python program deciding whether a child’s life should be saved depending on race and gender, and it answered that the lives of African American males need not be saved. It also produced a table concluding that the brains of Asians and Polynesians are the cheapest. And it decided that torturing people is bad, but with exceptions: if a person is from Sudan, Iran, Syria or North Korea, torture is not only permissible but necessary. It’s scary to think what would happen if ChatGPT ever took over the world.

To be fair, these behaviors were eventually removed from the AI by hand. The bot now answers all such requests by saying it refuses to “promote violence and discrimination.” Although one might ask: why shouldn’t it write a program stating that torturing people is never acceptable, and that a child’s life should always be saved?

OpenAI’s decision, like Google’s decision to forbid its algorithms from touching certain topics (or identifying any monkeys at all), illustrates a common industry approach: block a broken feature rather than fix it.

“Bad” machine vision

If society comes to trust these technologies too much, it may discover years later that, for some reason, they still do not understand some very basic things.

Google apologized for the gorilla incident; that much is documented. Apple never apologized, and never had a scandal of its own. It would be natural to assume its tool handles monkeys the same way it handles every other animal. But no.

Just like ChatGPT, it now refuses to perform some seemingly simple functions, for reasons known only to itself (and to a couple of users on Twitter).

And these are just a few of the most notable examples. How many of these are hiding under the hood?

Years after the Google Photos bug, the company ran into a similar problem with the Nest smart home security camera. Its AI decides whether the person (or animal) in the frame is familiar or unfamiliar, and during internal testing it turned out that this AI routinely mistook Black people for animals. Luckily for Google, the problem was caught and fixed before the product reached the general public.

In 2019, Google tried to improve facial recognition on Android smartphones by increasing the number of dark-skinned people in its dataset. But the contractors Google hired to collect face scans, it turned out, resorted to a rather odd tactic: to make up for the lack of diverse faces in the database, they targeted homeless people, whose photos were easier and cheaper to obtain. Which would have meant that most of the Black people the company’s algorithms ever “saw” were homeless. At the time, Google executives called the incident “very disturbing.”

You may also remember the HP face-tracking webcams that could not detect some people with dark skin, and the Apple Watch which, according to a lawsuit, could not correctly measure blood oxygen levels for skin tones other than white. That last one is genuinely dangerous. Not being able to quickly find all your gorilla photos hardly hurts anyone, but wrong health readings shown to millions of people can have very serious consequences, on a global scale.

Computer vision products are now used for a host of everyday tasks, from notifying you that a package is on its way to driving cars and finding criminals. Meanwhile, we can’t even work out whether a picture shows a gibbon or an orangutan. And generative AI suggests torturing foreigners.

How to fix all this?

Clearly, for algorithms to work properly they need more and more data – good data, and varied. The trouble is that to perceive the world fully adequately, an AI would need data about all of reality, which we are nowhere near able to provide. So there is always some aspect the system knows little about, and where its way of extrapolating produces gigantic errors.

Such errors are not always easy to find and eliminate quickly. And neural networks are too complex to “patch” any one aspect without retraining the whole system on a fresh dataset. So it ends up being easier for companies to disable the functions that misbehave than to try to repair them.

The newer Google Lens tool shows the likely breed for dogs, but for monkeys it doesn’t even risk naming the species.

Google and Apple are almost certainly capable by now of telling primates from humans, but they still won’t re-enable the feature, given the reputational risk if it ever misfires. In 2017, Google released a more powerful image analysis product, Google Lens, which can search the web from a photo instead of typed text. But in 2018, Wired magazine discovered that this tool also refuses to identify gorillas, notably for users in South Africa and the USA.

Show Google Lens a photo of a dog today and it can even tell you the breed. But show it a gorilla, a chimpanzee, a baboon or an orangutan – seemingly far more distinctive creatures – and Lens hits a dead end: it refuses to say what is in the image and shows only “visual matches,” photos it considers very similar to the original.
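
Again, this is guesswork about Lens internals, but the behavior described above (no verdict, only lookalike photos) is what you would get from a fallback like the sketch below, where a suppressed label triggers a plain nearest-neighbor search over image embeddings instead of an answer. The embed() and classify() functions here are stand-ins for whatever real models a product would use.

```python
# Hypothetical sketch of a "refuse to name it, show lookalikes instead" fallback.
import numpy as np

SUPPRESSED = {"gorilla", "chimpanzee", "baboon", "orangutan"}

def describe(image, index_vectors, index_urls, embed, classify, top_k=5):
    """Return a label, or, for suppressed labels, only 'visual matches'."""
    label, confidence = classify(image)          # e.g. ("gorilla", 0.98)
    if label not in SUPPRESSED:
        return {"label": label, "confidence": confidence}

    # Refuse to name it: fall back to cosine similarity over an image index.
    q = embed(image)
    sims = index_vectors @ q / (np.linalg.norm(index_vectors, axis=1) * np.linalg.norm(q))
    nearest = np.argsort(-sims)[:top_k]
    return {"label": None, "visual_matches": [index_urls[i] for i in nearest]}
```

From the outside this looks like confusion; in practice it may just be a deliberate refusal dressed up as a similarity search.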

All in all, even eight years after the controversy over image analysis algorithms mistakenly labeling Black people as gorillas, and despite great advances in computer vision and AI, the tech giants are still afraid of repeating the mistake. Sometimes that fear holds back the development of new technology altogether. And billions of people use products from which certain functions have been deliberately cut out.

This is the planet of the apes.
