How do you recreate a person using AI?

In addition to neural networks and perceptron-based machine learning models, there are also so-called cognitive architectures: systems that aim to simulate human intelligence as cognitive science imagines it, through the prism of heterogeneous cognitive theories and hypotheses.

For psychology, this is where humanity hides, and with it the strong artificial intelligence, AGI, that would simulate all human abilities. How do architectures like ACT-R and SOAR work, and are they suitable for advancing general intelligence? We take a look in this article.

Cognitive technologies, and cognitive science in general, are about our mental abilities. Psychology, neuroscience, artificial intelligence, linguistics, philosophy, anthropology: cognitive science is not a single discipline but a whole framework for understanding the human “psyche”. If we survey other psychologies, they often contain additional agencies that we do not particularly need.

Freud, for example, has the structure of the Ego, which can only “hint” at itself… Cognitive science avoids such constructs and thereby becomes an ambassador of scientific rigor within psychology.

Perception is about how people interpret sensory information from the world around them. Attention examines how we select and focus on specific information while ignoring other stimuli. Memory shows how information is stored in and retrieved from the mind.

Language examines how we understand and produce speech, and how language structures our thoughts. Thinking involves the processes of logical reasoning, problem solving, and decision making. Imagination has to do with our ability to imagine things that are not immediately present in front of us.

Importantly, cognitive science is concerned with sensing, interpreting, and processing information: it is, first of all, about the mind. Unlike conventional psychoanalysis with its unconscious, cognitive science strives to produce scientific knowledge that meets the criteria of verifiability and transparency.

Since the 1970s, scientists have gone beyond developing theories of cognition from the behavior of test subjects: they began building architectures that simulate human cognitive abilities.

Simple experiments preserve the core problem of psychology—the black box.

Knowing the results, we cannot know their causes. You can't get into people's heads.

We now have many architectures, and they all solve different problems: error modeling, language, the formation of visual representations. Today we will talk about memory simulation in the theory of Adaptive Control of Thought-Rational and about SOAR's implementation of the UPS concept (Universal Problem Space).

Figure 0. A simple classification of cognitive neural networks.

How did researchers break down human intelligence into modules?

Adaptive Control of Thought-Rational, or ACT-R. As the name suggests, it is about the control of rational thought. The advantage of rational thinking, from a developer's standpoint, is that it unfolds in sequential logical chains and submits to some kind of structure.

The developers of ACT-R and other such architectures reduce all of the model's knowledge and percepts to an explicit, tractable informational form.

There will be no talk here of classical language games or the indeterminacy of meaning. The task of the architecture is research. Whatever the model is, it remains a simplification, but one that allows us to compare research results.

Developed by John R. Anderson and his colleagues at Carnegie Mellon University, ACT-R seeks to formalize the understanding of how the mind organizes and uses knowledge to perform various tasks (e.g., the Tower of Hanoi, learning a list of words, understanding language, communicating, flying an airplane… ).

Researchers create models that, in addition to incorporating the ACT-R perspective on cognition, add their own assumptions about a particular task.

These assumptions can be tested by comparing the model's results with the results of people performing the same tasks. In this way, researchers try to fit the model to actual empirical results. The fit is assessed by task completion time, accuracy, and neuroimaging data from fMRI.

Numerical estimates are needed to objectify the results.
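As a rough illustration of such numerical objectification, here is a minimal Python sketch that scores a model's predicted completion times against human data; the arrays and the choice of RMSE as the metric are our illustrative assumptions, not a fixed ACT-R procedure:

```python
import math

# Invented example data: observed human task times vs. times
# predicted by a hypothetical cognitive model (seconds).
human_rt = [1.21, 0.98, 1.45, 1.10]
model_rt = [1.15, 1.02, 1.38, 1.20]

# Root-mean-square error: how far the model's predictions sit
# from the human measurements, in the units of the data.
rmse = math.sqrt(
    sum((h - m) ** 2 for h, m in zip(human_rt, model_rt)) / len(human_rt)
)
print(f"RMSE = {rmse:.3f} s")
```

The lower the error, the more closely the model's behavior tracks the empirical results.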

Researchers can vary the amount of factual and procedural knowledge in the model to estimate how many operations and facts a person needs in their head in order to drive a car adequately and not lose their license on the very first day…

But ACT-R is not just a program for simulating experiments: it is a very specific theory of memory, and the system is implemented according to that theory.

The key goal of such systems is not an exact representation of how the human mind is organized, but to help interpret research and “get closer” to understanding human intelligence… ACT-R, like SOAR, does not strive to recreate the “mind” in full.

ACT-R implements one of the theories of cognition, which is not necessarily true and fully transferable to humans.

ACT-R began as a model of human memory and only then evolved into a unified theory of cognition.

Figure 1. Visualization of the ACT-R architecture

The structure of the architecture is simple: it consists of several modules, buffers, and a “pattern matcher”.

As their name suggests, buffers serve as an interface for interacting with the modules. The contents of a buffer show researchers how the architecture behaves in a given model: it is temporarily captured information from the memory modules and from visual perception.

A buffer is a slice of cognitive processing in real time.

Figure 1 clearly shows that we receive data from the environment through the sensory modules, for example. The information sits in a buffer before being transferred to other modules, and we can inspect it there.

These buffers have their own characteristics, such as how quickly memory fades, what is transferred from vision to cognition, and what actions can be performed between buffers.

Although buffers act as an interface and as a slice of data drawn from the imitation of the human visual system, they also recode signals, allowing, say, the memory module to be addressed in its own language.

Declarative memory. This module stores factual knowledge and memories in the form of chunks (memory cells). Declarative memory contains information that can be consciously recalled and described, such as facts or events.

Examples of declarative knowledge: “Paris is the capital of France,” or memories of events such as “Yesterday I went to the cinema.” Chunks have certain characteristics that determine how easily the information they hold can be retrieved from memory.

The more often a chunk is used, and the more relevant the contexts in which it appears, the higher its activation, and therefore the more readily it is available for use in cognitive processes.
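In the ACT-R literature this is summarized by the base-level learning equation, where a chunk's activation grows with frequent and recent retrievals and decays over time. A minimal Python sketch, with the decay parameter and the example timings chosen purely for illustration:

```python
import math

def base_level_activation(times_since_use, d=0.5):
    """ACT-R-style base-level learning: activation is the log of
    summed decaying traces, one per past retrieval of the chunk.
    `times_since_use` holds seconds elapsed since each retrieval;
    d is the decay parameter (0.5 is the textbook default)."""
    return math.log(sum(t ** -d for t in times_since_use))

# A chunk used often and recently is easier to retrieve...
print(base_level_activation([2, 30, 120]))    # higher activation
# ...than one last touched hours ago.
print(base_level_activation([3600, 7200]))    # much lower activation
```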

Procedural memory. This module contains production rules that define how to use declarative knowledge to perform specific actions and solve problems. We give our machine, for example, rules for printing the letter Q or adding numbers.

But in order for this knowledge to be applicable at all, there must be an imitation of the sensory system, because it is this that provides communication with the world. Accordingly, the model has two modules: motor and visual.

Visual module. This component is responsible for processing visual information coming from the environment. For example, when reading text, the visual module identifies letters and words, and working memory uses this information to understand what is read.

Motor skills module. This component coordinates the execution of physical actions such as movements of the arms, legs, eyes and other parts of the body.

The motor skills module receives commands from the central production system and converts them into specific motor actions, ensuring the precise execution of physical tasks – this is where we talk about robotics.

It is important to note that ACT-R is not limited to one modality: in addition to the visual module, auditory modules, for example, can be added to the sensory set.

Figure 2: A parallel between brain regions and ACT-R modules

But should there also be a center for making conditional decisions?

Data from the external world passes through a series of perception and memory calls: it arrives from the senses or from the system's own library (in the case of memory) and runs through the procedural module, also known as the Pattern Matcher.

The subsymbolic level is a set of massively parallel processes that can be summarized by a series of mathematical equations. And it is an important component of the final element of the architecture: the Pattern Matcher.

Figure 3. Relationship between symbolic parallel computing and template connections.

Let's imagine that we have a problem: we need to determine how a person solves an arithmetic problem, such as adding the numbers 3 and 4.

According to ACT-R theory, a person's memory stores many productions: simple rules that determine what to do in various situations.

A production might look something like this: “If I see numbers A and B and I need to add them, then I will perform the addition operation.”

The Pattern Matcher is responsible for selecting the appropriate production from the many possible ones stored in memory and activating the corresponding rule.

At the first stage, the Pattern Matcher receives the current information from working memory, which includes the numbers 3 and 4 and the goal of adding them. It then compares this information with the conditions of the various productions.

It is important to understand that each production's condition is a template that must fit the current situation. A template can include various elements: concrete values, data types, or more abstract characteristics. In our case, for example, the template may include the numbers and an addition operation.

The matching process is based on activation: the most suitable productions receive the highest degree of activation.

Activation is determined by several factors: how frequently the production is used, how recently it was used, and how well it suits the context.

Productions that have been used recently or are used frequently have a higher chance of being selected.

After selecting a suitable production, the Pattern Matcher initiates the execution of the action described in the rule. In our example, this action is the addition of the numbers 3 and 4, the result of which is 7. This result can then be stored in working memory for later use or verification.
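Putting the whole example together, here is a toy production system in the spirit of the Pattern Matcher. The dictionary-style working memory and the numeric utility standing in for activation are our simplifications, not ACT-R's actual machinery:

```python
# Working memory: the current goal and the numbers to operate on.
working_memory = {"goal": "add", "a": 3, "b": 4}

productions = [
    {   # "If I see numbers A and B and must add them, perform addition."
        "condition": lambda wm: wm.get("goal") == "add",
        "action":    lambda wm: wm.update(result=wm["a"] + wm["b"]),
        "utility":   0.9,  # stands in for frequency/recency-based activation
    },
    {   # A competing rule whose condition does not fit this situation.
        "condition": lambda wm: wm.get("goal") == "subtract",
        "action":    lambda wm: wm.update(result=wm["a"] - wm["b"]),
        "utility":   0.4,
    },
]

# Match conditions against working memory, pick the most active
# production, and fire it.
matching = [p for p in productions if p["condition"](working_memory)]
best = max(matching, key=lambda p: p["utility"])
best["action"](working_memory)

print(working_memory["result"])  # -> 7, stored back in working memory
```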

Figure 4. Expanded diagram of ACT-R operation

However, an important aspect of how the Pattern Matcher works is its ability to learn and adapt. With time and experience, the frequency and context of production use may change, affecting the matching and selection process.

For example, if a person frequently encounters addition problems, the productions associated with this operation become more active, and their selection happens faster and with greater accuracy.

Subsymbolic mechanisms are also responsible for most of the learning processes in ACT-R. Thus, in ACT-R, cognition unfolds as a sequence of production firings.

These firings change the contents of the modules, and the buffers become transcoders between the organs of cognition, metering out information on demand to the executive organ (the Pattern Matcher).

The main difference between something like ACT-R and a regular perceptron is that we train the model to match patterns rather than to look for hidden patterns in the data.

That said, no one rules out using ACT-R in the future in an ensemble with generative neural networks and transformers, though such a synthesis is still far away.

But one could consider the path of synthesis of cognitive architectures, robotics and classical AI as a path to artificial general intelligence. At least in the future.

Thus, sensory information plus declarative and procedural memory, routed through buffers under the control of the executive organ, lets us perform a specific cognitive task.

Figure 5. Practical Applications of ACT-R: Computer Science, Cognitive Science, Neuroscience, and Education

ACT-R, by the way, has been used well beyond the cognitive sciences: UI testing, education, neuropsychology, robotics. An obvious advantage of such models is that they can help create marketing products; perhaps someday we will see this implemented in some Colgate advertisement…

But, as we have already said, a cognitive theory is only a theory, and ACT-R is not the only decision-making architecture. From its point of view, the main source of solutions to cognitive operations is the application of templates to certain patterns of information.

As soon as we see a plate of soup, we immediately run for a spoon; as soon as we see a quadratic equation, we immediately reach for Vieta's theorem.

But we decided not to stop at just one cognitive system.

Operational Cognitive System – SOAR?

SOAR Cognitive Architecture, developed by Allen Newell, John Laird, and Paul Rosenbloom, is a feature-rich platform that also emulates the principles of cognitive operations.

Like ACT-R, the system implements a particular concept from cognitive psychology. But where the previous architecture worked mainly by finding and connecting patterns appropriate to the situation, that is, with ready-made templates and solutions,

here we work according to the Universal Problem Space (UPS) system. It says simple things: a person has goals, looks for ways to achieve them, and breaks goals down into subgoals. UPS is based on the concept of production systems.

There is a problem – there is a solution. There are a number of operations between a problem and a solution. That's all.

Figure 6. Production rules: condition, then action.

Knowledge is presented in the form of production rules: “If this and that, then I will do this and that.”

Mathematically, the processes in SOAR can be described with graph theory and predicate logic: they are formalized as logical expressions, where the condition is represented as a logical formula and the action as a state-change operator.

Predicate logic is the same Aristotelian logic with modifications. A predicate is something that can be said about an object.

For example, a rule might look like this: “If I see a traffic light and it's green, then I go.” These production rules are stored in long-term memory and are activated depending on the situation in which the system is located.

These rules allow the system to make decisions based on the current state and the presence of certain conditions. But we will return to decision-making later.
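To make the “if this, then that” format concrete, here is a hedged sketch of such condition-operator rules in Python; the state keys and the rule set are invented for illustration, and real SOAR uses its own rule language rather than lambdas:

```python
# Current situation as seen by the system.
state = {"see": "traffic_light", "light": "green"}

# Production rules: (condition over the state, proposed operator).
rules = [
    # "If I see a traffic light and it is green, then I go."
    (lambda s: s.get("see") == "traffic_light" and s.get("light") == "green",
     "go"),
    # "If I see a traffic light and it is red, then I wait."
    (lambda s: s.get("see") == "traffic_light" and s.get("light") == "red",
     "wait"),
]

# Activate the first rule whose condition holds in the current state.
for condition, operator in rules:
    if condition(state):
        print(f"proposed operator: {operator}")  # -> proposed operator: go
        break
```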

But before we can make decisions and build our incredible logical chains, we need to get our knowledge from somewhere.

How does long-term memory work in SOAR?

Figure 7. Diagram of long-term memory.

Long-term memory is storage. Unlike ACT-R, where the store of information is divided into skills and factual knowledge, SOAR builds a somewhat more complex memory system, split into working memory and long-term memory.

Although, as we will see, the difference in systems is much subtler than it seems.

Long-term memory can be compared to a library where books (knowledge and experience) are stored and can be borrowed and returned as needed: production rules, episodic memories, semantic connections, and even evaluations of the retrieval process itself.

Those same “if-then” rules (if the condition is met, then do something) are also stored in long-term memory, but are not limited to it.

Semantic memory stores general knowledge and facts about the world. It can be thought of as a dictionary or encyclopedia that records concepts, definitions, and relationships between them. We would call this terminological memory.

To know what to do with a spoon, it would be nice for you to understand “what it is…”

For example, semantic memory will contain the knowledge that an apple is a fruit, that it grows on a tree, and that it is edible.
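One way to picture this encyclopedia is as a store of subject-relation-object facts. A minimal sketch, with an encoding we invented for illustration rather than SOAR's actual format:

```python
# Semantic memory as general world knowledge: (subject, relation) -> value.
semantic_memory = {
    ("apple", "is_a"): "fruit",
    ("apple", "grows_on"): "tree",
    ("apple", "edible"): True,
}

def ask(subject, relation):
    """Consult the 'encyclopedia' of general facts about the world."""
    return semantic_memory.get((subject, relation), "unknown")

print(ask("apple", "is_a"))    # -> fruit
print(ask("apple", "edible"))  # -> True
```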

Episodic memory, on the other hand, stores information about specific events and experiences that occurred in the past.

It is like a diary or photo album where all the important moments and events of your life are recorded. For example, an episodic memory will contain the memory of your last birthday: who was there, what you did, what gifts you received.

When the system encounters a new task or problem, it “goes” to long-term memory and “searches” for relevant production rules or knowledge that can help solve this problem.

For example, if the system needs to solve a math problem, it retrieves rules and knowledge related to arithmetic from long-term memory and applies them to the current problem.

When the system gains new experience or knowledge, it is also added to long-term memory.

For example, if you learn a new phone number or learn how to cook a new dish, this knowledge will be stored in long-term memory and can be retrieved when needed.

Long-term memory here thus becomes a dynamic element of the system: new rules are created in it, new ways of solving problems appear, and even terminological amendments are made.

This process is called chunking. For example, if you successfully remember a phone number several times, the system can create a new rule that makes this process easier in the future.
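A toy way to see chunking is as caching: once a subgoal has been solved the slow way, its result is compiled into a new rule that fires immediately the next time. A sketch in Python (the cache-as-chunk framing is our simplification of the real mechanism):

```python
learned_chunks = {}  # long-term memory of compiled results

def recall_phone_number(name, slow_lookup):
    """Return a phone number, falling back to slow, deliberate
    problem solving (`slow_lookup`) only when no chunk exists yet."""
    if name in learned_chunks:           # a chunk fires: no search needed
        return learned_chunks[name]
    number = slow_lookup(name)           # the effortful subgoal
    learned_chunks[name] = number        # chunking: compile the result
    return number

# The first call searches; later calls hit the learned chunk directly.
print(recall_phone_number("Alice", lambda n: "555-0134"))  # slow path
print(recall_phone_number("Alice", lambda n: "555-9999"))  # still 555-0134
```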

Working memory in SOAR?

Figure 8. Abstract diagram of working memory.

Let's start with the fact that working memory cannot operate without long-term memory: it acts as the central link connecting current sensory information with long-term knowledge.

The core elements of working memory in Soar include states, goals, and temporary data structures, which are used to guide problem-solving.

States in working memory represent a description of the current situation or context of the task the system is working on.

For example, if the task is to make tea, the state might include information about the availability of water, a kettle, tea bags, and electricity. In short, states are the state of affairs at a given moment in time.

Naturally, the AI stores semantic and procedural knowledge in its long-term memory, which, as it were, answers the question:

“What is water, and what do tea bags have to do with it?!”

Goals are the desired results or end states that the system strives to achieve. In the case of making tea, the goal can be formulated as “to prepare a cup of tea.”

Figure 9. Carrying out operations from the initial state and gradually approaching the final goal by moving blocks.

In general, for cognitive psychology, the nuance of goal setting is important: any action has goals and subgoals. But we are not yet talking about the robot’s global desires to become an artist and draw a million paintings for this…

Goals help the system focus on specific objectives and guide the sequence of actions to achieve them.

The working memory process in Soar begins with loading an initial state and setting a goal.

The system then uses production rules stored in long-term memory to determine what actions need to be taken to achieve the goal.

For example, one of the rules could be: “If there is water in the kettle, and the kettle is connected to the mains, then turn on the kettle.”

When a rule cannot be applied directly, SOAR creates subgoals. These subgoals are managed hierarchically and can be nested, allowing the system to solve more complex problems by sequentially breaking them down into simpler steps.

Working Memory is constantly updated. For example, when the water in the kettle boils, the state is updated from “water in the kettle” to “water is boiling”. This update allows the system to adjust its actions based on the current situation.
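That whole cycle, match a rule against working memory, apply it, update the state, repeat until the goal holds, fits in a few lines of Python. The states and rules below are illustrative stand-ins, not SOAR syntax:

```python
# Working memory: the current state of the tea-making task.
state = {"water_in_kettle": True, "kettle_on": False,
         "water_boiling": False, "tea_ready": False}

# Production rules: (condition, action that updates working memory).
rules = [
    (lambda s: s["water_in_kettle"] and not s["kettle_on"],
     lambda s: s.update(kettle_on=True)),       # turn on the kettle
    (lambda s: s["kettle_on"] and not s["water_boiling"],
     lambda s: s.update(water_boiling=True)),   # wait for the boil
    (lambda s: s["water_boiling"] and not s["tea_ready"],
     lambda s: s.update(tea_ready=True)),       # brew the tea
]

# The decision cycle: fire one applicable rule per pass until the
# goal "tea_ready" is present in working memory.
while not state["tea_ready"]:
    for condition, action in rules:
        if condition(state):
            action(state)  # working memory is updated in place
            break

print(state)  # every flag is True: the goal state has been reached
```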

Figure 10. General SOAR diagram

By the way, the SOAR architecture also has a kind of buffer where audiovisual and perceptual information is stored – Perceptual Short-Term Memory, Perceptual STM.

It is collected from GPS, sensors and locators using detectors and classifiers.

As a result, we have a simple and human-like system:

A goal; an assessment of the state of affairs through “perception”; the choice of a procedural action for our particular case; the action; an update of working memory; the choice of the next procedural action. And so on until the goal is achieved. We then commit the result to long-term memory.

At the code and mathematical implementation level, SOAR uses many algorithms and data structures. The main working memory is represented as a graph, where nodes contain information about current states, and edges contain information about possible actions.

Search algorithms are formalized as graph traversal procedures using various path cost estimation strategies.
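As an illustration of states-as-nodes and actions-as-edges, here is a generic Dijkstra-style traversal over such a graph. SOAR's actual algorithms and cost strategies are richer, so treat this purely as a sketch with invented states and costs:

```python
import heapq

# Nodes are states; each edge is (action, cost, next_state).
graph = {
    "start":     [("fill kettle", 1, "has_water")],
    "has_water": [("switch on kettle", 1, "boiling")],
    "boiling":   [("pour over tea bag", 1, "tea")],
    "tea":       [],
}

def cheapest_path(source, goal):
    """Return (total_cost, actions) for the cheapest action sequence."""
    frontier = [(0, source, [])]   # priority queue ordered by path cost
    visited = set()
    while frontier:
        cost, node, actions = heapq.heappop(frontier)
        if node == goal:
            return cost, actions
        if node in visited:
            continue
        visited.add(node)
        for action, step_cost, nxt in graph[node]:
            heapq.heappush(frontier, (cost + step_cost, nxt, actions + [action]))
    return None  # goal unreachable from source

print(cheapest_path("start", "tea"))
# -> (3, ['fill kettle', 'switch on kettle', 'pour over tea bag'])
```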

Will cognitive architectures help AGI?

Both ACT-R and SOAR are simulations that reflect two different approaches to human cognition, but both start from memory. The UPS approach, decision making through goal setting, is a broadly accepted concept across cognitive science, so the difference between the two lies more in implementation and in how they treat subgoals. In ACT-R we see a straight-ahead robot that solves problems on the fly; in SOAR, a planning robot.

Naturally, the perception modules, the stand-ins for simple human perception, use standard AI components, from detectors up to classification and object-recognition models.

Of course, such models in an ensemble with modern classical AI could show interesting results and flexibility. But, unfortunately, such systems are so far suitable only for teaching simple robotic actions.

There is no talk of meaning-making here, whatever that means. The emergent properties of consciousness are not taken into account, and there is no hint of self-reflection: only algorithmic, operational actions, stimulus, think, react. And that very “think” is still in the status of a black box.

Some readers may point to Leabra or BECCA with their emergent approach, but even they apply it only to explicitly represented knowledge, built on external observation and modeling without “immersion” in human neural systems. In the context of cognitive science, we look at phenomena, the manifestations of unique human properties, from the outside.

Such problems stem from the very methodology of cognitive psychology: it observes behavior, but cannot see the internal processes that cause it.

The problem is even more serious from a functional point of view: all the proposed cognitive architectures are narrowly focused on solving specific problems. We can draft the motor system of an android, but there is no way to fuse it with an architecture for the emergence of consciousness.

ACT-R and SOAR have proven themselves in robotics, studying user interfaces, identifying learning problems in students, and validating research. It seems to us that this is already quite good. AGI faces too many challenges to limit itself to the use of simple cognitive architectures.

Moreover, we are seeing progress in the search for the mechanisms of human thinking through experiments and an endogenous (random) approach applied to the networks themselves.

Last year, scientists published two studies. The first showed that physical brain activity coincides with the workings of a neural network trained under a “self-supervised learning” regime.

Figure 11. Grid cell visualization

The second study showed that one of the AIs, tasked with developing navigation, evolved something very similar to grid cells: a type of neuron discovered in 2005 and believed to be responsible for the navigation abilities of some animals, humans included.

It is possible that the path to AGI, and to human consciousness in general, lies not in designing the mind but in generating it from suitable conditions; here everything comes down to sufficient computing power and the difficulties of optimization.
