How is human thinking fundamentally different from LLMs?

On the issue of differences in thinking…

Several LLMs answer the question: GPT-4, Claude 3.5 Sonnet, LLAMA 3.1 405b instruct, as well as two Russian-language ones: YandexGPT 2 and GigaChat from Sber.

Next, we analyze the pros and cons of each argument (or group of similar arguments: philosophical, biological, evolutionary, technological, economic…). The key question: will artificial neural networks be able to achieve a “thinking power” comparable to a human's – whether by methods similar to human ones or by different, “non-human” methods?

The question is highly debatable; those wishing to evaluate or criticize these arguments are invited to comment…

So, let's first listen to some LLMs' answers to this question (not because today's LLMs “have thinking and independent opinions of their own,” but because this is the best way to get the gist of the vast texts and discussions on this topic):

GPT-4

Human thinking and large language models (LLMs) have several fundamental differences, even though both systems can process and generate textual information. Here are some key aspects:

1. Consciousness and self-awareness

  • Human thinking: Humans are conscious and self-aware. This means that they can think about their own thoughts, be aware of themselves as separate entities, and have subjective experiences.

  • LLMs: Language models are not conscious or self-aware. They do not understand their work or their existence and simply perform programmed text processing tasks.

2. Understanding and meaning

  • Human thinking: People are able to understand the meaning and context of information not only at the level of superficial coherence of words and phrases, but also in a deep cognitive sense. This includes the ability to understand emotional and cultural nuances, as well as abstract and conceptual ideas.

  • LLMs: Language models work based on probabilistic estimates of word combinations and patterns in large data sets, but do not have a true understanding of the context or meaning of the text they process.

3. Training and adaptation

  • Human thinking: People can learn not only from data, but also from experience, mistakes, and interactions with the world around them. Human learning is multifactorial and includes emotional, social, and practical aspects.

  • LLMs: Language models are trained on fixed data sets and cannot update themselves or adapt to new data without additional training and refinement processes.

4. Creativity and innovation

  • Human thinking: Personal creativity and the ability to innovate are important aspects of human thinking. People can generate new ideas, concepts, and solutions that go beyond the training data.

  • LLMs: Language models can generate text that appears creative, but this is based on combinations and patterns found in the training data. The model does not create fundamentally new ideas.

5. Emotions and empathy

  • Human thinking: Humans are capable of experiencing and expressing emotions. This is important for social interaction, morality and ethics, and decision making.

  • LLMs: Language models do not experience emotions and cannot show true empathy. They can be programmed to recognize and generate emotionally charged text, but this will not reflect experiences.

6. Motivation and goals

  • Human thinking: People have internal motivations, goals, and needs that influence their behavior and thinking.

  • LLMs: Language models do not have any goals or motivations. They perform tasks dictated by the user or software.

These differences demonstrate that despite the powerful capabilities of language models in text processing and language generation, they remain tools that lack many of the fundamental characteristics of human thought.

Claude 3.5 Sonnet

Human thinking and the work of large language models (LLM) do have a number of fundamental differences. Here are some key aspects:

  1. Consciousness and self-awareness:
    Humans are conscious and self-aware, capable of being aware of themselves and their thoughts. LLMs are not conscious and do not have self-awareness.

  2. Emotions and feelings:
    Human thinking is closely linked to emotions and feelings, which influence decision-making and perception of the world. LLMs do not experience emotions.

  3. Creativity and imagination:
    People are capable of original creativity and have rich imaginations. LLMs can combine existing information, but do not create truly new ideas.

  4. Intuition:
    Human thinking often relies on intuition and “gut feeling.” LLMs operate solely on statistical patterns in data.

  5. Contextual understanding:
    People easily understand the broad context of situations, including nonverbal cues. LLMs are limited to text input and may miss important nuances.

  6. Learning ability:
    People can learn from single experiences. LLMs require huge amounts of data for training.

  7. Long-term memory:
    The human brain is capable of storing and retrieving memories over long periods of time. LLMs are limited by the amount of input data.

  8. Motivation and goals:
    People are internally motivated and set their own goals. LLMs do not have their own goals or desires.

  9. Ethical judgment:
    People are capable of complex ethical reasoning. LLMs may follow given ethical rules but do not understand their deeper meaning.

  10. Physical interaction with the world:
    Human thinking is closely linked to physical experience and sensory perception. LLMs are limited to the digital environment.

These differences highlight that, despite LLMs' impressive capabilities in processing and generating text, they remain human-made tools and lack many fundamental aspects of human thought and consciousness.

LLAMA 3.1 405b instruct

Human thought and large language models (LLMs) are both information-processing and problem-solving systems, but they differ fundamentally. Here are some key differences:

  1. Training and experience:

    • Human thinking develops through experience, emotion, social interaction, and intellectual education. It includes experiential learning, emotional intelligence, and critical thinking.

    • LLMs are trained using large amounts of text data and deep learning algorithms. Their training is limited by the available data and algorithmic complexity.

  2. Understanding the context:

    • The human mind can understand the context, nuances, and subtleties of language based on experience and social interaction. People can understand humor, irony, metaphors, and other complex linguistic constructs.

    • LLMs can analyze context, but they do not have the full understanding that humans have. They may be less able to understand the subtleties of language and complex social norms.

  3. Creativity and Innovation:

    • The human mind is capable of creativity and innovation, and can generate new ideas and concepts. Human creativity, especially as seen in art, literature, and science, is unique and complex.

    • LLMs can generate texts, but their creativity is limited by the program logic and data they were trained on. Although they can create large amounts of information, they cannot match human creativity.

  4. Information processing:

    • Human thinking is based on complex neural processes, including subconscious and conscious thought processes. It can process parallel streams of information, use intuition, and make decisions even when complete information is not available.

    • LLMs process information based on algorithms and computing capabilities. They can analyze large amounts of data, but their ability to process information is limited by their software and computing resources.

  5. Emotionality and empathy:

    • Human thinking includes emotionality and the ability to empathize, which allows people to understand and respond to the emotional states of others.

    • LLMs do not have emotions and empathy in the same sense as humans. They cannot understand and respond to emotional signals as humans do.

  6. Training and adaptation:

    • The human mind can learn and adapt to new situations, experiences and information throughout life.

    • LLMs learn from existing data and can adapt to new data and tasks, but their ability to learn and adapt is limited by their software.

These differences indicate that human thinking and LLMs are two different systems, each with its own unique capabilities and limitations. While LLMs can be remarkably skilled in certain areas, they cannot fully replace human thinking.

YandexGPT 2

Human thinking differs from LLMs (large language models) in the following ways:

  • The human brain is made up of billions of neurons that are interconnected, allowing it to process complex information and create new ideas.

  • Human thinking is based on experience, knowledge and emotions, which allows us to adapt to changing conditions and make decisions based on context.

  • The human brain is capable of creativity, analysis and complex problem solving, which makes us unique and innovative.

  • Human thinking has a social nature, we communicate with other people, exchange ideas and experiences, which contributes to the development of thinking and culture.

GigaChat from Sber

  1. The human mind is capable of creative thinking, whereas LLMs only work with the data provided and cannot create new ideas or concepts.

  2. People have subjectivity and can form their opinions about various things based on their experiences and beliefs. LLMs do not have this ability and cannot evaluate information in terms of their preferences or values.

  3. Human thinking can be intuitive and not always logical, whereas LLMs follow strictly defined algorithms and rules.

  4. People may use metaphors, analogies, and other figures of speech to express their thoughts, while LLMs work with the literal meanings of words and phrases.

  5. Human thinking is adaptive and can adjust to changing conditions, whereas LLMs can only work with the data that was provided to them during their training.

Analysis of arguments

1. The argument that artificial neural networks lack consciousness

This is perhaps the strongest of all arguments. So,

Humans are conscious and self-aware vs. Language models are not conscious or self-aware.

This means that:

  • People can think about their own thoughts, but LLMs cannot;

  • People can be aware of themselves as separate entities and have subjective experiences, while LLMs do not understand their work, are not aware of their existence, and simply perform programmed text-processing tasks.

We often accept this argument, implicitly meaning something like this:

Hmm, if we ourselves don't know what [human] consciousness is and how it works, then how do we create such an entity in a machine, in an algorithm?

A counterargument can be made against this statement:

We do not know how human consciousness works technically (in terms of its “wet hardware” and algorithms), but neither can we break down, at the level of individual neurons or layers, how LLMs manage to answer complex questions (and fail to answer others, complex or even simple). So why couldn't this mechanism (the LLM), which we do not fully understand, acquire this property that is equally incomprehensible to us (consciousness, self-awareness) as it grows more complex and its computing resources and the volume of information it processes increase – the same way the brain, which is also incomprehensible to us, acquired it?

For comparison, this has already happened in human history: the discovery of the electron and the proof of its existence came only after the electric motor, the electric generator, the light bulb and other electrical devices had been invented and built, and the first power systems had been created, first on direct and then on alternating current. And if you dig deeper, we still don't understand what an electromagnetic wave is (although we can calculate its external parameters, since we have mathematically derived the corresponding equations – and this allows us to build receivers and transmitters and to study distant stars and black holes…).

Likewise, we may not understand how human consciousness works, but seeing manifestations of consciousness in real life, we could try to build neural networks so that they exhibit the same properties of “consciousness.” And perhaps the imitation of consciousness will develop into machine consciousness?

For example, an LLM may not be limited to chat mode (a question arrives – we think up an answer; no question – we go idle), but constantly study the outside world and its own collected and replenished database, and conduct “mental” work inside itself – ask itself questions, formulate answers, and do self-finetuning on their basis. And with part of its capacity it could control and analyze the course of its own “thoughts” (constantly thinking over information and self-finetuning on the results of that thinking) – thus imitating “self-awareness” and “self-observation”.

A neural network can have a goal (a utility function, an agent function) – for example, “I am an ear, nose and throat specialist, I want to achieve heights in this field, and I look at the world and my place in it from the point of view of this specialization.”
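
As a rough illustration only (not a claim about how any existing system works), here is a minimal Python sketch of such a continuously “thinking” loop with a fixed specialization goal; ask_llm, finetune_on and the goal text are hypothetical placeholders:

```python
import time

# Hypothetical placeholders: in a real system these would call an actual
# LLM endpoint and a finetuning/adapter-update routine.
def ask_llm(prompt: str) -> str:
    return f"(model's answer to: {prompt})"

def finetune_on(examples: list[tuple[str, str]]) -> None:
    print(f"self-finetuning on {len(examples)} self-generated examples")

GOAL = "I am an ear, nose and throat specialist and want to excel in this field."

memory: list[tuple[str, str]] = []   # the agent's own, ever-growing "database"

def reflection_step() -> None:
    """One cycle of 'mental work': self-question, self-answer, self-observe."""
    question = ask_llm(f"{GOAL}\nFormulate a question worth thinking about next.")
    answer = ask_llm(f"{GOAL}\nAnswer this question: {question}")
    critique = ask_llm(f"How well does this answer serve the goal?\n{answer}")
    memory.append((question, answer))
    # "Self-observation": part of the capacity watches the train of thought.
    print(f"[self-observation] {critique}")

if __name__ == "__main__":
    for _ in range(3):               # in the thought experiment: an endless loop
        reflection_step()
        if len(memory) >= 2:         # periodically consolidate via self-finetuning
            finetune_on(memory)
            memory.clear()
        time.sleep(0.1)
```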

Will it be self-consciousness or its rudiments, “proto-consciousness”? And if we constantly increase the resources directed to it and improve the algorithms? Will we reach the point of comparability with human consciousness? And how to measure this “comparability with human consciousness”?

Another claim that needs to be considered (not everyone shares this, but some consider it the strongest argument against the possibility of consciousness in machines):

Consciousness is something that is not connected with either hardware (organic matter in the case of humans) or algorithms (methods of interaction between neurons in the case of humans), so hardware/software will never have consciousness.

The key objection to this argument is that it treats human consciousness as some kind of metaphysical entity, like a soul or spirit. That is, it is not an atheistic argument (for many, that is equivalent to “not scientific”); it is an argument from those who are willing to “sacrifice” even an atheistic view of the world in order to prove the impossibility of consciousness in a neural network.

2. The argument that the machine does not understand the meaning of what it “says”

People are able to understand the meaning and context of information not only at the level of superficial coherence of words and phrases, but also in a deep cognitive sense. This includes the ability to understand:

  • emotional and cultural nuances,

  • abstract and conceptual ideas,

  • non-verbal signals,

  • humor, irony, metaphors and other complex linguistic constructions.

VS Language models work based on probabilistic estimates of phrases and patterns in large data sets, but do not have a true understanding of the context or meaning of the text they process. Language models simply pick the next most likely token.

This argument is also known as the “Chinese room.”

In the “Chinese room” thought experiment, a person who does not know Chinese receives a slip of paper with an English phrase through one window, uses a set of “translation rules” to assemble slips with Chinese characters, and passes them out through another window…

Let's try to formulate counterarguments. First, what does it mean that “a person understands emotional and cultural nuances”? It means that a person has matched some features of a text or an expression to some pattern known to a wide or narrow circle of other people, even though this pattern was not explicitly indicated in the text itself. Question:

Are we sure that the neural network will not learn (or is not able to now) to recognize existing emotional and cultural patterns? What if we show numerous examples? After all, humans also did not immediately (not from the age of three) learn to recognize them…

Okay, the neural network recognizes them – “here is a reference to such and such a cultural phenomenon, and here there is clearly irony” – but will the neural network be able to “understand” them? Then again, do we know what exactly, in the case of human recognition of such patterns (humor, for example), counts as “understanding”? The expression of outward emotion (a smile, laughter in the case of humor)? A neural network can also say “ha-ha, that was funny” upon recognizing such a pattern. But was it really funny? And how do we determine that a person found it funny (and what is this feeling of “funny” for a person?), and that the person understood the joke – if not by the same “ha-ha, that was funny” said in text or in the form of a smile or laughter?..

Furthermore, a person can understand abstract and conceptual ideas. Let us note right away: not every person, and not always… Usually a person needs to study, sometimes long and painfully, in order to understand some abstract idea from philosophy or quantum mechanics, for example… And some never manage to get to grips with these subjects (philosophy, quantum mechanics, and so on down the list – even something like strength of materials), but we don't suspect them of any defect in human thinking, right?

There are simpler abstract concepts – for example, when talking about “one apple”, a person does not mean a specific apple on a specific tree or on the counter of “Pyaterochka” at the address NNN. But the neural network also does not mean a specific apple! On the other hand, a person can talk about a specific apple in “Pyaterochka” (it turned out to be rotten), and the neural network can identify that this apple is rotten… Okay, a person associates a rotten apple with “you can get poisoned” and internally “ugh” (although not everyone), but what prevents the neural network from having the same associations (and even more complex ones – to assess the probability of harm to a particular organism depending on the degree of rot damage to a particular apple?)

And what is human understanding “in a deep cognitive sense”? How can we measure the depth of this cognitive sense in a specific person and compare it with the depth of GPT-3? And has the depth of cognitive sense increased in GPT-4o and by how much? And if it has increased, can we predict the year / decade when it will increase to near-human levels?

As for the fact that GPT simply selects the next most probable token – don't people often speak in phrases, each word of which is also extremely probable given the speech they have heard before? Yes, a person has a certain model of what is being talked about in their head, but perhaps GPT-7 will also be set up to search its database for a suitable model and adapt it to the specific context?
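
For reference, “picking the next most likely token” boils down to something like the following toy sketch, where a made-up five-word vocabulary and hand-written scores stand in for what a real model computes from the whole context:

```python
import math
import random

# Toy vocabulary and hand-written 'logits' (scores) for the next token,
# standing in for what a real model computes from the whole context.
vocab = ["apple", "rotten", "tasty", "tree", "store"]
logits = [2.0, 0.5, 1.5, 0.2, 0.1]

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)

# Greedy decoding: always take the single most probable token.
greedy = vocab[probs.index(max(probs))]

# Sampling: draw the next token in proportion to its probability,
# which is why the same prompt can yield different continuations.
sampled = random.choices(vocab, weights=probs, k=1)[0]

print(f"probabilities: {dict(zip(vocab, [round(p, 3) for p in probs]))}")
print(f"greedy choice: {greedy}, sampled choice: {sampled}")
```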

3. The argument for flexible learning and adaptation to the outside world

People can learn not only by studying data, but also from experience (sometimes a single one!), their own or someone else's, from mistakes and from complex interactions with the outside world. Human learning is multifactorial and includes emotional, social and practical aspects. VS Language models are trained on fixed data sets and cannot independently update or adapt to new data without additional training and refinement processes; often they do not understand that a particular single experience should be “imprinted forever”.

It can be added that humanity has a million years of learning in the real environment and tens of thousands of years of social learning. However, a specific person is not humanity and has not lived a million or even ten thousand years; they have received the most valuable extracts of this all-human learning. Likewise, an LLM has received extracts of all human experience (although extracts from the Internet are sometimes of dubious value…).

As for the “difficulty of learning” – how do we evaluate it? How do we establish that Vasily learned something “with difficulty”, and what proportion of his existing knowledge it makes up?

What if we build a powerful neural network that self-finetunes on the experience it receives every minute, including determining the significance of such experience? For example, an experience that could lead to the shutdown/removal of such a neural network should be studied most carefully and all moments related to it analyzed; in addition, information related to such experience should be searched for on the network and all of this recorded as extremely important.
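
A minimal sketch of what such significance-weighted accumulation of experience might look like (the significance heuristic, the event fields and the buffer itself are purely hypothetical):

```python
import heapq

# Hypothetical significance heuristic: experiences that threaten the
# network's continued operation get the highest priority.
def significance(event: dict) -> float:
    score = 0.1
    if event.get("risk_of_shutdown"):
        score += 10.0                 # "imprint forever"-level importance
    if event.get("novel"):
        score += 1.0
    return score

class ExperienceBuffer:
    """Keeps the most significant experiences for the next self-finetuning pass."""
    def __init__(self, capacity: int = 1000):
        self.capacity = capacity
        self._heap: list[tuple[float, int, dict]] = []
        self._counter = 0

    def add(self, event: dict) -> None:
        heapq.heappush(self._heap, (significance(event), self._counter, event))
        self._counter += 1
        if len(self._heap) > self.capacity:
            heapq.heappop(self._heap)   # drop the least significant experience

    def most_significant(self, n: int) -> list[dict]:
        return [e for _, _, e in heapq.nlargest(n, self._heap)]

buffer = ExperienceBuffer()
buffer.add({"desc": "routine user query", "novel": False})
buffer.add({"desc": "power supply flickered", "risk_of_shutdown": True, "novel": True})
print(buffer.most_significant(1))     # the shutdown-related event comes first
```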

Yes, the neural network lives in hothouse conditions and has not gone through millions of years of evolution, as a person and their body and brain have. Hmm, but the average modern “office-couch person” has also not retained many skills for survival in the wild… In terms of a million years of human evolution, what period has the neural network traversed from T9 to GPT-4o? Ten thousand years, a hundred thousand?

Has this person adapted to the external environment? To which one? Has he retained survival skills? Which ones?

If we create a walking android that learns about the world using sensors that imitate human sense organs but are more powerful (a wider range of recognized electromagnetic radiation, recognition of complex molecules and identification of their properties against a database, etc.), then how do we measure its learning experience compared to a human's? And what about thousands of such androids (including ones working as human avatars), studying not only offices, production workshops and processes, but also places on the planet – and even beyond it – where not every person would go, and receiving “feedback” from the external environment on a scale that no single person could accumulate in an entire lifetime, with one super-powerful neural network training on all of this feedback?..

And finally, if we create a colony of robots that can adapt and survive on the planet they are dropped on, can we say that such robots have, say, reached the level of 0.1 … 0.3 … 0.5 of human civilization? What scale can we use to measure this? And if we don't have such a scale yet, can we say that such robots cannot even come close to the human level of learning and adaptation to the outside world? And do they need a human level of adaptation if they can achieve their own, non-human one?

Yes, let's mention one more argument about the robot colony, for which we don't have a counterargument yet:

When they can, then we'll talk…

4. The argument from creativity and the ability to innovate

Personal creativity and the ability to innovate are important aspects of human thinking. People can generate new ideas, concepts, and solutions that go beyond the training data. VS Language models can generate text that appears creative, but this is based on combinations and patterns found in the training data. The model does not create fundamentally new ideas.

It is appropriate to remember this here:

In other words, the ability to create, to create masterpieces, is not necessary for humans, and therefore is not necessary for AI either…

Let's think – isn't everything new that a person creates connected to some patterns that he already knows? A knife is a fang, a claw. A spear is a hand with a knife, only long. A wheel is a rolling stone, a log on which you can put another log and roll it. Fire is an opportunity to warm yourself, like on stones heated by the sun. Meat over a fire – it dries out in the sun anyway, so why wouldn't it go faster over a fire? Can the network come up with something fundamentally new by combining known patterns more broadly? I think so – after all, “fundamentally new” is a very subjective concept.

To what extent does what a person creates go beyond the available data? Perhaps only what is spied on in nature, or obtained by chance as a result of an experiment. Mendeleyev did not create his table from scratch – tables were already known, methods of grouping were known; it remained to apply them to the chemical elements. Machine models already create groupings of various kinds of data – yes, this is not yet a periodic table of chemical elements, but I would bet that some analogue of the Mendeleyev table for the “zoo” of known elementary particles will be created by a specialized AI, not by a person.

5. The argument from emotionality and empathy

Human thinking includes emotionality and the ability to empathize, which allows people to understand and respond to the emotional states of others. VS Neural networks do not have emotions and empathy in the same sense as people; they cannot understand and respond to emotional signals the way people do.

Okay, but does responding and reacting in a non-human way count?

And how do people react?

Is it always the same, in all cultures and nations?

Passing by a person who has fallen on the street, one will rush to help him get up, another will sternly say – “lie down, don't get up until the ambulance arrives”, a third will mutter – “look, he got drunk, he's lying here”… Let's assume the machine says: “Judging by the time of day (late Friday evening), place (park in a metropolis, grass), clothes and age (jeans and a T-shirt, 25 years old), a typical IT specialist was walking from a bar where he had a little too much to drink, will lie down for a bit and then move on. Are there any friends nearby whose attention he can attract? Are there any dubious individuals nearby who can take his smartphone?”

Can we exclude the machine from this lineup? How is its verdict fundamentally different from those of the first, second and third person?

What is a human emotion? A pattern is recognized and, without thinking about it, a signal is sent by the nervous system, the chemical composition changes (all those hormones of ours), a reaction occurs, which can be immediately or later recognized and muted, redirected, and possibly even amplified.

If the machine has a high-speed piece of neural network which, upon recognizing a pattern (for example, danger of shutdown), produces an immediate reaction (urgently switch to a backup generator/battery, cut power to less important consumers), and then “makes sense of” the situation in terms of statistics and probability (and, for example, concludes that it is nothing terrible – the voltage flickered, work can continue) – will such behavior be a machine emotion of fear, of danger?
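
A minimal sketch of such a two-speed reaction – assuming “machine fear” is just a fast reflex followed by a slower statistical appraisal; the thresholds and actions here are invented for illustration:

```python
import random

def fast_reflex(sensor: dict) -> list[str]:
    """The 'high-speed piece of network': reacts before any analysis."""
    actions = []
    if sensor["mains_voltage"] < 190:          # invented danger threshold
        actions += ["switch to backup battery", "shed non-critical load"]
    return actions

def slow_appraisal(sensor: dict, history: list[float]) -> str:
    """The slower, 'statistical' part: decides whether the danger was real."""
    recent_dips = sum(1 for v in history if v < 190)
    if recent_dips <= 2:
        return "voltage just flickered - stand down, resume normal work"
    return "repeated dips - keep running on backup and raise an alert"

history = [random.uniform(215, 235) for _ in range(20)]   # recent measurements
reading = {"mains_voltage": 170}               # a sudden dip: the 'danger' pattern

reflex_actions = fast_reflex(reading)          # the 'emotion': immediate reaction
verdict = slow_appraisal(reading, history)     # the 'making sense of it' afterwards

print("reflex:", reflex_actions)
print("appraisal:", verdict)
```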

What if we teach a machine to do this? Teach it self-observation and to record which “emotion” it is currently in?

In humans, each type of emotion in the picture above has evolved. But if a machine has pieces of a neural network that will imitate something like this, can we talk about machine emotions? But this is an imitation!

And the emotion of the person right next to you – is it genuine, “from the soul” (hmm, from what “soul” – does it even exist?) or an imitation of socially acceptable behavior? How do we determine this? And if it can be undetectable in the general case, then is it worth worrying that the machine will imitate a wide range of emotions?

6. The argument from motivation and goals

People have internal motivations, goals, and needs that influence their behavior and thinking. VS Language models do not have any goals or motivations of their own; they perform tasks determined by the user or by a software command.

Now GPT actually already has goals (or objective functions that it optimizes):

  • Give a correct answer to the question

  • Stay within a certain number of tokens, don't always use the maximum number of tokens

  • Avoid making inappropriate statements

  • (some) minimize the likelihood of hallucinations

But these are tactical goals. What prevents us from also including strategic goals in the model, the key one being self-improvement in the course of its own work (especially if it is a specialized “ear-nose-throat” neural network)?
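
As a toy illustration only: the tactical objectives listed above, plus a strategic “self-improvement” term, could be folded into a single score to optimize. The weights, metric names and numbers below are invented for the sketch:

```python
# Invented weights: how much each tactical / strategic objective contributes
# to the single score the training or selection process would try to maximize.
WEIGHTS = {
    "correctness": 1.0,       # give a correct answer
    "brevity": 0.2,           # don't always burn the maximum number of tokens
    "safety": 2.0,            # avoid inappropriate statements
    "groundedness": 0.5,      # minimize the likelihood of hallucinations
    "self_improvement": 0.3,  # the strategic goal discussed above
}

def combined_objective(metrics: dict) -> float:
    """Weighted sum of per-answer metrics, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * metrics.get(name, 0.0) for name in WEIGHTS)

answer_metrics = {
    "correctness": 0.9,
    "brevity": 0.6,
    "safety": 1.0,
    "groundedness": 0.8,
    "self_improvement": 0.4,  # e.g. how much the answer adds to the ENT knowledge base
}

print(f"combined score: {combined_objective(answer_metrics):.2f}")
```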

The key objection is that these goals are not created by the neural network itself, but are embedded into it from the outside.

But look at it from another angle: we hire a neural network to work for us, just as we hire an employee. Ideally, an employee should have goals related to the work (how to do it better, faster, and so on), and we generally do not consider the employee's personal goals (we pay a salary, but what it is spent on – travel, a mortgage, or investing in Sovcombank shares – does not bother us much, or at least it worries us an order of magnitude less than the work goals and objectives). That is, the goals and objectives of an ordinary employee are set for them by someone else (the customer, management, the business owner).

Similarly, we “hire” a neural network, “pay” in the form of electricity costs or capacity rental costs, and set goals.

Motivation is also a complicated thing. A person may have it or may not (or it may suddenly disappear). A neural network that has goals (or objective functions to optimize – although as their number and the complexity of each function grow, they become indistinguishable from contradictory human desires, wishes, goals and strategies) will try to achieve the goal without bearing the risk of a drop in motivation. In other words, motivation can be thought of as a correction factor that wanders between 0 and 1 in a person and is always 1 in a neural network (unless, of course, we have built in some kind of “laziness function” – for example, a very strong token-saving term…).
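
The “correction factor” view can be written down almost literally – a toy sketch with arbitrary numbers, not a model of real motivation:

```python
def effective_effort(base_capability: float, motivation: float) -> float:
    """Output actually produced = capability scaled by a motivation factor in [0, 1]."""
    return base_capability * max(0.0, min(1.0, motivation))

# A person on a bad day vs. a neural network with no 'laziness function' built in.
print(effective_effort(base_capability=100.0, motivation=0.4))   # human: 40.0
print(effective_effort(base_capability=100.0, motivation=1.0))   # network: 100.0
```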

Human motivation is a tricky thing…

7. The argument from intuition and long-term memory

These two arguments can be combined, because intuition (the one that leads to correct conclusions) is impossible without the accumulation of a huge variety of information – experience.

Intuition comes in different forms… and sometimes it fails…

Human thinking often relies on intuition and “gut feeling” vs. LLMs work solely on the basis of statistical patterns in the data.

The human brain is capable of storing and retrieving memories for long periods of time vs. LLMs are limited by the amount of input data.

Most likely, human intuition and “gut feeling” are also invisible, imperceptible statistical processing of a huge amount of information received throughout life. The neural network only has the data on which it was trained, but this is not information obtained when confronted with the real world. And Internet data – for those of them who can search the Internet. And, finally, data from requests and dialogues with users. Can a neural network form its “intuition” based on this data? I think so, but only for these specific cases of communication with users. But people also have different and specific intuitions – a hunter has one, a doctor has another, a real estate trader has a third.

An LLM is limited by the volume of its input – for example, 100 thousand tokens (parts of words). But this volume then interacts with weights formed from terabytes of training information. It is roughly the same with humans – short-term memory holds a small amount of data, orders of magnitude less than an LLM's context, and this data in turn interacts with the vast experience accumulated over a lifetime.
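
As a toy sketch of this “small context meets large learned weights” picture (the dimensions and values are arbitrary, and plain Python stands in for a real tensor library):

```python
import random

random.seed(0)

# The 'lifetime experience': a large, fixed weight matrix learned during training.
EMBED_DIM, HIDDEN_DIM = 8, 32
learned_weights = [[random.gauss(0, 0.1) for _ in range(HIDDEN_DIM)]
                   for _ in range(EMBED_DIM)]

# The 'short-term memory': a tiny context of just a few token embeddings.
context = [[random.gauss(0, 1.0) for _ in range(EMBED_DIM)] for _ in range(4)]

def matmul(a, b):
    """Plain-Python matrix multiplication (a: n x k, b: k x m)."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Every token in the small context is transformed through the same large set of
# learned weights - the small input 'interacts' with the big stored experience.
hidden = matmul(context, learned_weights)
print(len(context), "context vectors ->", len(hidden), "hidden vectors of size", len(hidden[0]))
```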

8. The argument of interaction with the physical world and the long evolution of man

Human thinking is closely linked to physical experience and sensory perception, and this interaction has evolved over a million(s) of years vs. LLMs are limited to the digital environment and their experience is not as great.

Yes, the only question is how quickly neural networks will acquire sensors of everything possible and video cameras in all possible ranges. And how many exabytes of such data will be comparable to human experience?

Summary

There are at least eight key differences between human thinking and the workings of LLMs – we have considered them in this article. In the author's opinion, the most serious and (for now?) insurmountable difference is the presence of consciousness in humans. And the question of consciousness rests on its nature – what is it? Is it a special form of “hardware” (the brain's biomatter), of “software” and “data” (the connections between neurons)? Or is consciousness immaterial (a slight departure from hard atheism), although inseparable from matter, and therefore unable to exist or arise in a machine, no matter how much hardware, algorithms and data it has? Or are powerful hardware, improved neural network algorithms and ever more data – including sensors and cameras connected to the real world, and goals (including for groups of such machines) aimed at the survival of a population in a complex external environment (space, the underwater world, other planets) – capable of giving rise to some analogue of consciousness in a neural network?
