Stanford Shows Deep Learning by Darwin

Evolutionary deep reinforcement learning can help overcome the limitations of other approaches, and the results of this work could have a profound impact on AI and robotics.

Agents created in a complex virtual environment evolve not only the ability to learn but also their physical structure.


Despite the frequent analogies with evolution and nature, AI research places most of its emphasis on building separate elements of intelligence and then combining them. This approach has produced excellent results, but it has limited the flexibility of AI agents in skills that even the simplest forms of life possess.

In animals, the body and the brain develop together. Species went through countless mutations before the limbs, organs, and nervous systems they need in their environments appeared.

Moreover, all species on Earth descend from the first life form, which appeared several billion years ago. The selection pressure of the environment steered the development of its descendants in different directions.

Studying the evolution of life and intelligence is interesting, but reproducing it is very difficult. To recreate intelligent life the way evolution did, an AI system would have to search a very large space of possible morphologies, which is computationally very expensive. It takes a lot of trial and error.

Solutions to the problems of studying evolution

Researchers work around some of these problems in different ways. For example, they fix the architecture or physical structure of a system and focus on optimizing the learnable parameters. There are other approaches as well:

  • AI agents pass their learned parameters on to their descendants, as in Lamarck’s theory of evolution.

  • Visual, motor, and speech systems are trained separately from each other and then combined in the final system.

These approaches speed up the process and reduce the cost of training and developing AI agents, but they limit the flexibility and variety of outcomes.

Evolutionary deep reinforcement learning

In the new work, scientists at Stanford University seek to bring AI research closer to the real evolutionary process while keeping costs to a minimum.

“Our goal is to explore the principles governing the links between environmental complexity, evolved morphology, and the learnability of intelligent control,” the researchers write.

Their approach is called evolutionary deep reinforcement learning. Each agent in the system uses deep reinforcement learning to acquire skills and maximize its rewards over its lifetime.

Darwinian evolution is used to search the space of morphologies for optimal solutions. In other words, each new generation of agents inherits only the physical and architectural features of its ancestors, with minor mutations. No learned parameters are passed on to the next generation.
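The difference between this Darwinian inheritance and the Lamarckian alternative mentioned earlier can be made concrete with a toy sketch. Everything here is a hypothetical illustration, not the researchers' code: the `Agent` class, the list of limb lengths standing in for a morphology, and the placeholder `learn` function are all assumptions.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    # Hypothetical stand-in for a morphology: just a list of limb lengths.
    morphology: list
    # Learned controller parameters; empty at birth unless inherited.
    weights: list = field(default_factory=list)


def learn(agent, steps=10):
    """Placeholder for lifetime deep RL: fills in some 'learned' weights."""
    agent.weights = [sum(agent.morphology) * 0.1 * s for s in range(steps)]


def mutate_morphology(morphology, rng):
    """Small random change to one limb length (an assumed mutation operator)."""
    child = list(morphology)
    i = rng.randrange(len(child))
    child[i] = max(0.1, child[i] + rng.uniform(-0.1, 0.1))
    return child


def darwinian_child(parent, rng):
    """Child inherits the mutated body only; learned weights are discarded."""
    return Agent(morphology=mutate_morphology(parent.morphology, rng))


def lamarckian_child(parent, rng):
    """Contrast case: the child also inherits what the parent learned."""
    return Agent(
        morphology=mutate_morphology(parent.morphology, rng),
        weights=list(parent.weights),
    )


rng = random.Random(0)
parent = Agent(morphology=[1.0, 0.5, 0.5])
learn(parent)
child = darwinian_child(parent, rng)
print(child.weights)  # [] : nothing learned is passed on
```

In the Darwinian case every child has to learn from scratch, so the only way a lineage can improve is through a body that makes learning easier.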

“The foundation of evolutionary deep reinforcement learning is giving rise to large-scale computer modeling experiments to understand how the combined application of learning and evolution creates complex relationships between environmental complexity, morphological intelligence, and agent learning,” the researchers write.

Evolution modeling

The researchers built on MuJoCo, a virtual environment with high-accuracy rigid-body physics simulation. The goal is to evolve UNIversal aniMAL (UNIMAL) morphologies in this design space that learn locomotion and object-manipulation tasks across a variety of terrains.

Each agent in the environment is defined by a genotype, which determines its limbs and joints. A direct descendant inherits the parent's genotype with mutations: limbs are added or removed, or their size and degrees of freedom are changed.
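A minimal sketch of such a genotype and its mutation operators is shown below. It is an illustration under simplifying assumptions: the genotype is a flat tuple of limbs rather than whatever richer structure the researchers use, and the `Limb` fields, value ranges, and `mutate` function are hypothetical names chosen for this example.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Limb:
    length: float  # limb size (arbitrary units)
    dof: int       # degrees of freedom of the joint that attaches the limb


def mutate(genotype, rng):
    """Return a mutated copy of a genotype (a tuple of Limb objects)."""
    limbs = list(genotype)
    op = rng.choice(["add", "remove", "resize", "change_dof"])
    if op == "add":
        limbs.append(Limb(length=rng.uniform(0.2, 1.0), dof=rng.randint(1, 2)))
    elif op == "remove" and len(limbs) > 1:
        limbs.pop(rng.randrange(len(limbs)))
    elif op == "resize":
        i = rng.randrange(len(limbs))
        new_length = max(0.1, limbs[i].length * rng.uniform(0.8, 1.2))
        limbs[i] = replace(limbs[i], length=new_length)
    else:
        # "change_dof", or a "remove" that would have left no limbs at all.
        i = rng.randrange(len(limbs))
        limbs[i] = replace(limbs[i], dof=rng.randint(1, 2))
    return tuple(limbs)


rng = random.Random(42)
parent = (Limb(length=0.5, dof=1), Limb(length=0.7, dof=2))
child = mutate(parent, rng)
print(len(parent), len(child))
```

The parent genotype is never modified in place, so a single parent can produce several differently mutated children.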

To maximize its reward in a given environment, each agent is trained with reinforcement learning. The main task is locomotion: the agent is rewarded for the distance it covers during an episode. Agents whose physical structure is better suited to traversing the terrain are quicker to learn how to move.
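A distance-based reward of this kind can be sketched in a few lines. This assumes the reward at each step is simply the forward displacement along one axis; the actual reward used in the study may contain additional terms, and `episode_return` is a hypothetical helper.

```python
def episode_return(x_positions):
    """Sum of per-step rewards, where each reward is the forward displacement."""
    rewards = [b - a for a, b in zip(x_positions, x_positions[1:])]
    return sum(rewards)


# Agent's x coordinate recorded at each step of a short episode.
positions = [0.0, 0.25, 0.75, 1.5]
print(episode_return(positions))  # 1.5
```

Because the per-step rewards telescope, the episode return equals the net distance covered, which is what selection later acts on.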

To test the results, the scientists generated agents in three types of terrain:

  • On the plains, the selection pressure on the morphology of agents is minimal.

  • Rough terrain forces agents to develop a versatile physical structure so they can climb slopes and avoid obstacles.

  • On rough terrain with manipulable objects, there is an additional difficulty: to complete the task, agents must manipulate the objects.

Benefits of Evolutionary Deep Reinforcement Learning

Evolutionary deep reinforcement learning generates a variety of morphologies in different environments

One of the interesting findings of the study is the variety of results. Other approaches to evolutionary AI tend to converge on a single solution, since new agents directly inherit both the physique and the learned knowledge of their ancestors. In evolutionary deep reinforcement learning, only morphological data is passed on to descendants, so the system produces a diverse set of morphologies, including two-, three-, and four-legged agents with and without arms.

The system also exhibits the Baldwin effect: agents that learn faster are more likely to reproduce and pass on their genes to the next generation.

Evolutionary deep reinforcement learning shows that evolution, as the Stanford researchers put it, “chooses faster agents without any direct selection pressure.”

“Interestingly, the presence of this morphological Baldwin effect could be used in future research to create embodied agents with less sampling complexity and greater generalizability,” the researchers write.
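A toy model helps show how faster learners can be favored without anyone selecting for learning speed. The sketch below is an assumption-laden illustration, not the study's setup: each agent is reduced to a single hypothetical learning-speed number, its learning curve is an invented saturating exponential, and parents are chosen purely by the fitness reached after a fixed training budget.

```python
import math
import random


def fitness_after_budget(learning_speed, budget=5.0):
    """Toy learning curve: reward approaches 1.0 at a rate set by learning_speed."""
    return 1.0 - math.exp(-learning_speed * budget)


def select_parents(population, n_parents):
    """Keep the agents with the highest fitness at the end of the budget."""
    ranked = sorted(population, key=fitness_after_budget, reverse=True)
    return ranked[:n_parents]


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)
population = [rng.uniform(0.05, 1.0) for _ in range(20)]  # learning speeds
parents = select_parents(population, n_parents=5)

# Selection only ever looked at fitness, yet the parents learn faster on average.
print(mean(parents) > mean(population))  # True
```

With a limited lifetime, final performance depends on how quickly an agent learns, so selecting on performance indirectly selects on learning speed.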

Agents undergoing deep evolutionary reinforcement learning are assessed on various tasks

Evolutionary deep reinforcement learning supports the hypothesis that more complex environments give rise to more intelligent agents.

The researchers tested the evolved agents on eight different tasks, including patrolling, escaping, manipulating objects, and exploration.

The results showed that, in general, agents that evolved on rough terrain learn faster and perform better than agents that encountered only flat terrain.

These findings are consistent with another hypothesis, put forward by DeepMind researchers: a challenging environment, a suitable reward structure, and reinforcement learning can lead to all sorts of intelligent behavior.

AI and robotics research

The evolutionary deep reinforcement learning environment has only a small fraction of the complexity of the real world.

“While evolutionary deep reinforcement learning is making significant headway in scaling the complexity of evolutionary environments, an important direction for future work will be the creation of more open, physically realistic and multi-agent evolutionary environments,” the researchers write.

In the future, the scientists plan to expand the range of tasks to better understand how agents can improve their ability to learn behaviors that are relevant to humans. This work may prompt other researchers to apply methods that are much closer to natural evolution.

“We hope that our work will help further large-scale research using learning and evolution in other contexts that will lead to new scientific results, and these approaches contribute to the emergence of rapidly learning intelligent behaviors and new opportunities for their instantiation in machines,” the researchers write.
