Agent 007: How Ghost Words Protect English Dictionaries

Even the most famous dictionaries contain errors. Sometimes words appear on their pages that do not actually exist in English.

Oxford, Cambridge, Webster and dozens of lesser-known dictionaries all contain such lexemes, which are called ghost words. They are in the dictionary, but not in the language.

Over time, ghost words have evolved from annoying mistakes into a tool that helps dictionary compilers protect their copyrights. Here is how that happened.

The first ghost word: abacot

Compiling a dictionary is a colossal job that involves many people, so mistakes are not uncommon. Philologists check the entries many times, of course, but slips still make it into print.

In 1886, Professor Walter Skeat, President of the Philological Society of London, wrote:

Dr. Murray, as you may recall, wrote a very good article to justify excluding from the Dictionary the word “abacot”, which Webster defines as “the state headdress used by the kings of England, made in the shape of two crowns.” It was rightly and reasonably rejected by the editor, because no such word exists, and the alleged form is the result of a complete mistake … an error of printers or scribes, or the ardent imagination or ignorance of editors …

Therefore, I bring to your attention a few more words similar to “abacot”; words that our editor will eventually review, and I have no doubt that he will reject them. And to make it convenient to briefly name such words, I propose to call them “ghost words.” And I recognize the name “ghost word” only for such words and word forms that are meaningless in principle.

There turned out to be quite a few such ghost words, which is hardly surprising. The Webster and Oxford dictionaries contain about 470,000 entries today. Twenty or thirty words that do not actually exist in English amount to a tiny error rate for such a huge philological undertaking.

In the pre-Internet era, it was extremely difficult to catch such entries. Even a brilliant editor cannot know all 470,000 terms. Every more or less rare word, and these make up more than 80% of a large dictionary, needed a multi-stage recheck. Sometimes even this well-established system failed.

Some words that don’t seem to exist (but that’s not certain)

Over the centuries of the English language's development, plenty of ghost words have accumulated.

Interestingly, they all sound quite plausible and follow the rules of English. Moreover, some of these erroneous words eventually entered everyday use.

We have collected several examples that began as errors, then became widespread, and are now considered entirely legitimate.

Gravy

Today this word is in every dictionary and means a sauce made from meat juices.

But it entered English in the 14th century through the mistake of a translator who was adapting a cookbook from French. The original had the word “grané”, which referred to spices, but an error turned it into “gravé” and then into “gravy”, completely changing its meaning along the way.

Syllabus

This is a common word now, but it was created in the 15th century through a scribe's carelessness. In one edition of Cicero’s Letters to Atticus, the Greek word sittybas (a label or title slip on a manuscript) was rendered as syllabus.

Over the next hundred years the mistake was reproduced many times, so that by the 16th century the word was already quite common. Today it is a fully accepted entry in every dictionary and means an outline or summary.

Tweed

Yes, the familiar word “tweed”, which denotes a type of fabric, is also a ghost word that arose from an error.

Scots has the word “tweel”, meaning twill, which is the name of the weave used for this kind of cloth. Scotland also has the River Tweed. Most likely, during trade between the Scots and the English, the latter latched onto the familiar name and did not bother to check how the word was actually spelled. So “tweel” became “tweed”, and by the 1800s it was already a completely legitimate word.

Dord

This is one of the cases in which an erroneous word was eventually removed from the Webster dictionary. It got there in 1934 through a misunderstanding.

The editor left a note for the compiler: “D or d, cont./density”, meaning that the entry for density should mention that it can be abbreviated as D or d. But the compiler read “D or d” as a single word, “dord”, with the meaning “density”, and added it to the dictionary.

The mistake was noticed only after the dictionary had been printed, and it took 13 years to finally settle the matter and remove the word.

How errors became a tool for copyright protection

In 1886, the Berne Convention for the Protection of Literary and Artistic Works was adopted, which became one of the first international documents in the field of copyright protection.

Dictionaries, as products of intellectual work, also came under protection. But compilers faced an interesting question: how do you prove that copyright has been violated when English words are common property and are repeated in every dictionary?

The English language is a national asset, and nothing stops dishonest authors from taking the text of another dictionary and passing it off as their own scholarly work.

The solution turned out to be elegant. Compilers of dictionaries and encyclopedias began to deliberately add erroneous words and definitions that do not exist in the real language. After all, if you find a term that you invented yourself in someone else’s dictionary, that is strong evidence that its compilers simply copied it from you.
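The check described above is easy to sketch in code. The following is only a minimal illustration, not anything dictionary publishers are known to use: the function name find_copied_traps and the planted words “florbit” and “snarvel” are invented for this example, and the comparison assumes that both word lists are plain strings.

```python
def find_copied_traps(trap_entries, suspect_headwords):
    """Return the planted (fictitious) entries that also appear in a suspect word list.

    Both arguments are iterables of strings. Matching ignores case and
    surrounding whitespace, which is an assumption made for this sketch.
    """
    suspect = {word.strip().lower() for word in suspect_headwords}
    return sorted(
        trap.strip().lower()
        for trap in trap_entries
        if trap.strip().lower() in suspect
    )


# Hypothetical data: two invented trap words and a rival's list of headwords.
our_traps = {"florbit", "snarvel"}
rival_headwords = ["apple", "Florbit ", "zebra"]

copied = find_copied_traps(our_traps, rival_headwords)
print(copied)
```

Any match is a reason to look closer rather than automatic proof, but since the planted words exist nowhere else, even one match is hard to explain by coincidence.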

Such intentional ghost words came to be called fictitious entries. In the era of print, they became one of the most effective ways to protect copyright.

Articles about nonexistent sports appeared in reference books and encyclopedias, and nonexistent cities and towns, the so-called “paper towns”, appeared on maps. When such “ghosts” were copied, it became quite feasible to prove the theft of intellectual property and copyright infringement.

One example is the pair of nonexistent towns Beatosu and Goblu, which the mapmakers placed on a map of Michigan.

Suddenly, errors in dictionaries began to work in the compilers' favor. Even a dozen nonexistent words in a dictionary of 100,000 lexemes is a minuscule share. They did not bother ordinary users: no cross-reference led to them, so they could only be found by someone looking for them. But for would-be plagiarists they were a real hazard: anyone copying entries wholesale risked duplicating a fictitious word and, with it, getting into trouble with the law.

Dictionaries still tried to get rid of random errors, but at the same time they added intentional ones.

An interesting situation arose with the New Oxford American Dictionary: in 2005 it was found to contain the nonexistent word “esquivalience”, defined as “the willful avoidance of one’s official responsibilities.”

It was added in 2001 as a copyright protection tool. After all, the dictionary is available online, and its database is much easier to copy than any printed edition. It is very likely far from the only fake ghost word in it.

But “esquivalience” became very popular once the story came out. People began linking to it and writing about it in the media. Today the word is no longer in the Oxford dictionary: once it was exposed as a fake, there was no further need for it. Such words are secret agents, and an exposed agent is of no use.

Dictionary compilers were not the only ones to do this. So did the authors of encyclopedias, charts, lists, reference books and other large bodies of text.

The encyclopedia Der Neue Pauly (The New Pauly), published in 1996, contained an article about the fictional sport of apopoudobalia, which was allegedly played in ancient Rome.

The 1975 New Columbia Encyclopedia had an article about Lillian Mountweazel, a completely fictional person. She was eventually found out, and her name became a generic term for such fictitious entries: in the English-speaking world they are often called mountweazels.

The record holder for the number of fictitious entries is probably Appletons' Cyclopædia of American Biography: in its six volumes, researchers have found about 200 of them. And the authors approached them with great imagination.

For example, Hue de Navarre was a real person, but the compilers added fictional details to his life story. Rafael Ferrer, Joseph Cantillion and William Tenner, on the other hand, are completely invented people with rather detailed biographies. There are about 45 such fictional people in total.

Why so much had to be invented is unclear.

Over time, the very nature of ghost words has changed. A hundred years ago they got into dictionaries through carelessness or error; now they are deliberate measures that provide a higher level of copyright protection.

