A Brief History of NLP – Natural Language Processing

The history of natural language processing systems spans only about 50 years, yet we use NLP models every day: in search queries, translators, and chatbots. NLP originated as a fusion of artificial intelligence and linguistics. Linguistics is the science that studies languages: semantics studies the meaning of words and larger units, phonetics studies the sound composition of words, and syntax studies how words combine into phrases and sentences.

Noam Chomsky was a linguist who revolutionized the field of linguistics and changed our understanding of syntax. He created a system of grammatical description known as generative grammar (the corresponding current of linguistic thought is often called generativism), whose foundations he formulated in the mid-1950s. Chomsky’s work was the beginning of the rationalist trend in computational linguistics. The starting point of rationalism is language-independent computational models, and such models are considered best when they are as simple as possible. Here one can draw a parallel with Saussure’s idea of separating language from the real world.

Avram Noam Chomsky is an American public intellectual: linguist, philosopher, cognitive scientist, historian, social critic, and political activist. Sometimes called the “father of modern linguistics”, Chomsky is also a major figure in analytic philosophy and one of the founders of the field of cognitive science.

This approach did not give good results at first, but as work in this direction continued, the results became somewhat better than those of systems that followed a “bottom-up” approach. Chomsky’s theory of universal grammar provided a schema independent of the individual characteristics of any particular language. Syntax was best suited to such language-independent models, which considered only the structure of language itself.
Early machine translation researchers realized that a machine would not be able to translate input text without additional help. Given the paucity of linguistic theories, especially before 1957, some suggested pre-editing texts so as to mark the difficulties in them, for example to eliminate homonymy. And since machine translation systems could not produce correct output on their own, the text in the target language had to be post-edited to make it understandable.

Natural language processing systems expand our knowledge of human language. Some of the NLP tasks studied include automatic summarization (creating a meaningful summary of a set of texts and providing concise or detailed information about a text of a known type), coreference resolution (determining, for a sentence or a larger chunk of text, which words refer to the same object), discourse analysis (determining the discourse structure of connected text), and machine translation.
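To make the summarization task a bit more concrete, here is a minimal sketch of frequency-based extractive summarization in plain Python. It is a toy illustration of the general idea only, not any particular system mentioned in this history, and the example sentences are invented.

```python
import re
from collections import Counter

def extractive_summary(text, num_sentences=2):
    """Pick the sentences whose words occur most often in the text overall."""
    # Naive sentence and word splitting; real systems use proper tokenizers.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    words = re.findall(r'\w+', text.lower())
    freq = Counter(words)

    # Score each sentence by the summed frequency of its (lowercased) words.
    def score(sentence):
        return sum(freq[w] for w in re.findall(r'\w+', sentence.lower()))

    chosen = set(sorted(sentences, key=score, reverse=True)[:num_sentences])
    # Keep the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in chosen)

text = ("Machine translation was the first NLP application. "
        "Early systems relied on hand-written rules. "
        "Rules were written by linguists and rules were expensive to maintain.")
print(extractive_summary(text, num_sentences=1))  # prints the highest-scoring sentence
```

Real summarizers weight sentences far more carefully (position, redundancy, learned models), but the extract-and-rank structure is the same.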

As mentioned above, NLP can be divided into two parts: NLU, natural language understanding, and NLG, natural language generation. In the context of our problem we are interested in the first, natural language understanding: our task is to teach the machine to understand text and draw conclusions from the material we give it. NLU allows machines to understand and parse natural language and to extract concepts, entities, emotions, keywords, and so on. It is used in customer service applications to understand the problems that customers report verbally or in writing. Linguistics is the science that studies the meaning of language, language context, and the various forms of language.
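As a small illustration of the kind of extraction NLU performs, the sketch below uses the open-source spaCy library, assuming spaCy v3 and its small English model en_core_web_sm are installed; the customer message is invented for the example, and the entities and keywords found will vary with the model.

```python
import spacy

# Assumes: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

# A made-up customer service message, purely for illustration.
message = ("My order from London has not arrived, and the tracking page "
           "has shown no update since Monday.")

doc = nlp(message)

# Named entities: concrete things the model recognizes (places, dates, ...).
for ent in doc.ents:
    print("entity:", ent.text, "->", ent.label_)

# Noun chunks serve as rough keywords describing what the complaint is about.
print("keywords:", [chunk.text for chunk in doc.noun_chunks])
```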

Understanding natural language is a task of linguistics that involves components such as phonology, morphology, syntax, and semantics. These are components of any sentence in any language, and studying them is important for a general understanding of how natural language processing is built.

The first application of natural language processing was machine translation. The goal was to create a machine capable of translating human speech or text. The first steps in this area were taken by Georgetown University and IBM in 1954, with a program that was able to translate 60 Russian sentences into English. As IBM later reported, “This program included logic algorithms that made grammatical and semantic ‘decisions’ that mimic the work of a bilingual person.” This breakthrough provided insight into how future technologies and data processing capabilities would evolve.

On January 7, 1954, IBM demonstrated an experimental program that allowed the IBM 701 computer to translate from Russian into English. In 1959, the Mark 1 Translating Device, developed for the US Air Force, produced the first automated translation from Russian into English. The Mark 1 was shown to the public at the IBM Pavilion at the 1964 New York World’s Fair.

In the late 1960s, Terry Winograd of the Massachusetts Institute of Technology developed SHRDLU, a natural language processing program. It was able to answer questions and take into account new facts about its world. SHRDLU combined complex parsing with a fairly general deductive system, operating in a “world” with visible counterparts of perception and action. The machine could answer simple questions, and it seemed that if enough effort were put into conveying meaning, and the domain were kept restricted, SHRDLU could achieve natural communication. But even this early approach had pitfalls that prevented it from being developed further: the machine still did not understand text well, and it was rarely possible to make sense of the text SHRDLU composed.

The user asks the program what action to take.
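To give a flavour of how a restricted blocks world makes question answering tractable, here is a toy sketch in the spirit of SHRDLU: a tiny world model plus a few hand-written sentence patterns. It is an illustration only, not Winograd’s program, and the object names and commands are invented.

```python
# A toy blocks world in the spirit of SHRDLU: a tiny world model plus a few
# hand-written sentence patterns. Purely illustrative, not Winograd's system.
world = {"red block": "table", "green block": "red block", "blue pyramid": "table"}

def normalize(phrase: str) -> str:
    # Drop trailing punctuation and a leading article so
    # "the green block?" matches the key "green block".
    phrase = phrase.strip().rstrip(".?")
    return phrase[4:] if phrase.startswith("the ") else phrase

def answer(question: str) -> str:
    q = question.lower()
    if q.startswith("where is "):
        thing = normalize(q[len("where is "):])
        return f"on the {world[thing]}" if thing in world else "i do not know"
    if q.startswith("what is on "):
        support = normalize(q[len("what is on "):])
        on_top = [obj for obj, s in world.items() if s == support]
        return ", ".join(on_top) if on_top else "nothing"
    return "i do not understand"

def command(sentence: str) -> str:
    s = sentence.lower()
    if s.startswith("put ") and " on " in s:
        obj, support = s[len("put "):].split(" on ", 1)
        world[normalize(obj)] = normalize(support)  # update the world model
        return "ok"
    return "i cannot do that"

print(answer("Where is the green block?"))                   # on the red block
print(command("Put the blue pyramid on the green block."))   # ok
print(answer("What is on the green block?"))                 # blue pyramid
```

The point of the restriction is visible even here: because the world is small and fully known, “understanding” reduces to matching a handful of patterns against it, which is why SHRDLU looked convincing inside its domain and brittle outside it.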

Then, in 1969, Roger Schank developed a conceptual dependency system, which he described as “a stratified linguistic system to provide a computational theory of simulated performance.” The key idea was the construction of lexical tokens, which made it possible to extract more meaning from the text. These tokens could refer to various objects of the real world, and combining them in different ways was intended to capture the whole of human linguistic activity at the conceptual level. If the user says that a construct is correct, it is added to memory; otherwise the construct is searched for in a list of metaphors or discarded. In this way the system uses a record of what it heard before to analyze what it hears now.
In this work, Schank seized on the idea that before computers can understand natural language, they must learn to make decisions about what they are told. His parser focused on the semantics of the language, and he was able to teach the computer to discern important conceptual relationships.

Roger Carl Schank – American artificial intelligence theorist, cognitive psychologist, learning scientist, educational reformer, and entrepreneur
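As a rough illustration of the conceptual dependency idea, the sketch below represents a sentence as a primitive act plus the real-world objects filling its roles, and stores accepted constructs in memory. The primitive ATRANS (transfer of possession) follows common textbook presentations of Schank’s primitives; the parser pattern and sentence are invented, and this is a simplification, not Schank’s original system.

```python
from dataclasses import dataclass
from typing import Optional

# Conceptual dependency describes meaning with a small set of primitive acts
# (e.g. ATRANS = transfer of possession, PTRANS = physical transfer,
#  MTRANS = transfer of information) plus role fillers. Textbook-style sketch.

@dataclass
class Concept:
    act: str                       # primitive act, e.g. "ATRANS"
    actor: str
    obj: str
    source: Optional[str] = None
    recipient: Optional[str] = None

def parse_give(sentence: str) -> Optional[Concept]:
    """Very naive pattern: '<actor> gives/gave <recipient> <object>'."""
    words = sentence.lower().rstrip(".").split()
    if len(words) >= 4 and words[1] in ("gives", "gave"):
        return Concept(act="ATRANS", actor=words[0], obj=" ".join(words[3:]),
                       source=words[0], recipient=words[2])
    return None

memory: list[Concept] = []         # constructs accepted so far

cd = parse_give("John gave Mary a book.")
if cd is not None:
    memory.append(cd)              # an accepted construct is added to memory
    print(cd)
```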

Then, in the 1970s, William Woods developed his system for recognizing and processing text, introducing the Augmented Transition Network (ATN). On this basis the LAS program was subsequently developed, which could build classes of words in a language and learn the rules for forming sentences. ATNs made it possible not only to form new sentences but also to understand natural language. The program created connections between sentence structure and surface structure, and formed word classes.

The program promised to be as adaptive as a person: studying new material, a person becomes acquainted with new vocabulary and new syntactic constructions in order to share their thoughts about the domain they are studying. LAS was written in Michigan LISP and accepted multiple input lines, which it described as scenes encoded as association networks. In the same way, the program could obey commands to understand, write, and learn. The core of the entire system was the grammar of the Augmented Transition Network (ATN).
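The following sketch is a heavily simplified illustration of the transition-network idea behind ATNs: states connected by arcs that consume word categories, with registers recording what was found along the way. Real ATNs add recursive calls to sub-networks and much richer tests and actions; the tiny lexicon and network here are invented for the example.

```python
# A tiny transition network for sentences of the form DET NOUN VERB (DET NOUN).
# States are linked by arcs that consume one word category; registers record
# what each arc found. Simplified illustration, not Woods' actual system.

LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "ball": "NOUN", "parser": "NOUN",
    "chased": "VERB", "found": "VERB",
}

# network: state -> list of (category to consume, next state, register to set)
NETWORK = {
    "S0": [("DET", "S1", None)],
    "S1": [("NOUN", "S2", "subject")],
    "S2": [("VERB", "S3", "verb")],
    "S3": [("DET", "S4", None)],
    "S4": [("NOUN", "S5", "object")],
}
FINAL_STATES = {"S3", "S5"}        # accept intransitive or transitive sentences

def parse(sentence: str):
    words = sentence.lower().rstrip(".").split()
    state, registers = "S0", {}
    for word in words:
        category = LEXICON.get(word)
        for arc_category, next_state, register in NETWORK.get(state, []):
            if category == arc_category:
                if register:
                    registers[register] = word   # store what the arc found
                state = next_state
                break
        else:
            return None                          # no arc accepts this word
    return registers if state in FINAL_STATES else None

print(parse("The dog chased a ball."))  # {'subject': 'dog', 'verb': 'chased', 'object': 'ball'}
print(parse("Chased the dog."))         # None: no arc from the start state accepts a verb
```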

Dependency tree example: “What is a parser?”.
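To produce a dependency analysis like the one in the caption above, one possible sketch again uses spaCy, assuming spaCy v3 and the en_core_web_sm model are installed; the exact dependency labels depend on the model version.

```python
import spacy

# Assumes: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("What is a parser?")

# Print each token with its dependency label and the head it attaches to.
for token in doc:
    print(f"{token.text:8} --{token.dep_:6}--> {token.head.text}")

# displacy can render the same tree graphically, e.g. in a notebook or browser:
# from spacy import displacy
# displacy.serve(doc, style="dep")
```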

Until the advent of machine learning algorithms in the 1980s, natural language processing relied entirely on hand-written rules. However, even before that time, the first ideas appeared about creating machines that could work like the human brain, through neural connections. These ideas became the prototype of the artificial intelligence later built on neural networks.

From the beginning of the 21st century to the present day, machine learning has been gaining momentum: having survived two AI winters, it has become fashionable again, thanks in part to Big Data and deep learning. New methods and approaches to text processing are still being created, which keeps the study of NLP relevant.
