Quality is the First Occupation

Legend:   Early attempts at computer translation of text from one human language to another produced hilarious results.


[Budiansky, 1998]

In the early 1960s, an apocryphal tale went around about a computer that the CIA had built to translate between English and Russian: to test the machine, the programmers decided to have it translate a phrase into Russian and then translate the result back into English, to see if they'd get the same words they started with. The director of the CIA was invited to do the honors; the programmers all gathered expectantly around the console to watch as the director typed in the test words: "Out of sight, out of mind." The computer silently ground through its calculations. Hours passed. Then, suddenly, magnetic tapes whirred, lights blinked, and a printer clattered out the result: "Invisible insanity."

[Collected on the Internet, 1999]

Some scientists were testing a program that could translate from English to Chinese and back again. They fed into their computer the English phrase "Out of sight, out of mind," and out came some Chinese ideograms. Since none of the scientists in the room at that moment knew Chinese well enough to determine whether the computer's Chinese translation had captured the spirit of the English phrase, they fed the ideograms back into the computer. The translation back into English read "Invisible idiot."

[Collected on the Internet, 1999]

Rumors have it that early modules for English to Russian have mistranslated some idioms with amusing results. Translating the phrase "The spirit was willing, but the flesh was weak" to Russian and back to English resulted in: "The vodka was good, but the meat was rotten." Likewise "out of sight, out of mind" reportedly yielded the phrase "blind and insane."

[Tan, 1979]

A firm experimenting with an electronic brain designed to translate English into Russian fed it the words: "The spirit is willing, but the flesh is weak." The machine responded with a sentence in Russian which meant, a linguist reported, "The whisky is agreeable, but the meat has gone bad."

Origins:   Since the earliest days of computers, one of the holy grails of computing has been to create a machine that can translate text from one human language to another. Fifty years or so on, we're still on the quest for that grail.

Certainly modern translation programs are more sophisticated, run much faster, and produce far better results than their crude, slow, and clumsy forebears. Some of today's translation software packages do a very fine job of reading a piece of text written in one language and producing a readable equivalent of it in a second language. Still, we've a ways to go yet. Anyone who's tried SYSTRAN's translation software knows that although it's usually up to the task of turning a web page written in a foreign language into something understandable by English speakers, the quality of the translated text can range from the amusingly stilted to the outright incomprehensible.

Considering the enormity of the task of producing a machine that can render text from one language into another, it's no surprise that the results are still far from perfect. The depth and complexity of the syntactic and grammatical rules that we humans unconsciously master can (and do) fill innumerable linguistics texts. Consider some of the simpler problems language translation programs have to deal with:
  • In many languages nouns are accompanied by articles (both definite and indefinite), but the use of articles can vary quite widely, even in related languages. English employs a single definite article ("the") with all nouns, but other languages (such as Spanish and French) categorize nouns as being either masculine or feminine and therefore have two definite articles. German adds neuter nouns and thus has three different definite articles. In English the definite article remains the same whether the accompanying noun is singular or plural ("the cat" vs. "the cats"), but in other languages different forms of definite articles are used with plural nouns ("el gato" vs. "los gatos" in Spanish, for example). Even within a single language dialectal differences in article use exist (compare the British "in hospital" to the American "in the hospital"). Of course, exceptions exist, such as nouns used with definite articles but not with indefinite articles (e.g., in English we speak of "the money" but never "a money"). Other languages, such as Japanese, don't use articles at all.
  • Some languages form plurals by inflecting nouns. In English we typically do this by adding -s or -es to the end of the noun, but we have numerous exceptions, including older irregular forms and words borrowed from other languages (e.g., "ox" vs. "oxen" or "medium" vs. "media"). We also have non-count nouns (such as "furniture") with no distinct plural forms. Other languages do not inflect nouns to indicate plural usage, so information about the number of nouns must be inferred from the context in which they are used.
  • One word can have multiple meanings: in English a "head" can be the part of the body above the neck, the leader of an organization or business, or a slang term for bathroom. The word "wind" can be either a noun or a verb. Translation programs must be able to judge from context which meaning is the intended one (or which meaning is at least a sensible one).
  • Word order can vary significantly from language to language. Adjectives appear in front of the nouns they modify in some languages (such as English), and following the nouns they modify in other languages (such as Spanish). The object of a verb typically follows the verb in English; in other languages it precedes the verb. In English (and the Romance languages) verbs are usually found near the middle of sentences; in many other languages (such as Japanese) verbs regularly fall at the ends of sentences.
As we said, these are just a few examples of the simpler challenges to overcome. Other more formidable obstacles (such as conjugations of irregular verbs, multiple phrases and clauses, unusual word order, interrogatives, and idioms whose understood meanings are quite different from their literal meanings) have to be surmounted as well. The ubiquitous "language translation follies" examples cited above are still plausible even with today's advanced translation programs, so it's not hard to understand why they have circulated as true anecdotes for so long now.
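To make the multiple-meanings problem from the list above a little more concrete, here is a minimal sketch of how a program might guess which sense of "head" is intended by counting cue words in the surrounding sentence. The sense inventory, the cue words, and the function name pick_sense are all invented for illustration; this is a toy under those assumptions, not a description of how any real translation package works.

```python
# Hypothetical sense inventory: each sense of "head" is paired with a few
# invented cue words. A real system would need far richer data than this.
SENSES = {
    "head": {
        "body part": {"neck", "hair", "hat", "shoulders"},
        "leader": {"department", "company", "organization", "appointed"},
        "bathroom (slang)": {"ship", "sailor", "aboard"},
    }
}


def pick_sense(word, sentence, senses=SENSES):
    """Return the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears, i.e. the context gives no hint.
    """
    context = {w.strip(".,!?").lower() for w in sentence.split()}
    candidates = senses[word]
    # Score each sense by how many of its cue words occur in the sentence.
    scores = {sense: len(cues & context) for sense, cues in candidates.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(pick_sense("head", "She was appointed head of the department."))
print(pick_sense("head", "The sailor went to the head aboard the ship."))
print(pick_sense("head", "Mind your head."))
```

The last call returns None because nothing in the sentence matches any cue word, which is exactly the situation in which a translation program has to fall back on a guess.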

How these particular tales got started is unknown, but they've been circulating for thirty or forty years now, in part because their details do accurately reflect the era in which they originated. The Cold War made the U.S. government highly desirous of a machine that could perform Russian-English translations — not just for security reasons, but also to cope with the reams of important technical articles being produced by Soviet scientists. And early efforts in this arena often used the "brute force" approach to translation that sought to find a direct lexical equivalent of every word or phrase in a text, heedless of linguistic nuances. One can well imagine that these literal translations would have produced results such as "out of sight" being rendered as "invisible" or "blind," and "out of mind" being rendered as "idiot" or "insane," creating some rather interesting output if certain idioms or aphorisms were used as input. (You'd think that the testers would favor more straightforward test data such as "We will bomb Washington D.C. at 6:00 PM on Thursday" over pithy English sayings that one rarely encounters in Russian scientific articles, but that doesn't make for amusing anecdotes.) These examples probably aren't true, but they could be, and in the world of urban legends, that's good enough for most people.
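The "brute force" approach described above can be sketched in a few lines. The example below is a toy, not a reconstruction of any actual Cold War system: the phrase tables are invented, the intermediate "pivot" tokens are placeholders instead of real Russian words, and the function names are hypothetical. It simply looks up the longest matching phrase in one table, then maps each result back through a second table, with no regard for idiom.

```python
# Toy phrase tables. Every entry is invented for illustration; the pivot
# tokens stand in for foreign-language words and are not real Russian.
FORWARD = {
    ("out", "of", "sight"): "PIVOT_UNSEEN",
    ("out", "of", "mind"): "PIVOT_MINDLESS",
}
BACKWARD = {
    "PIVOT_UNSEEN": "invisible",
    "PIVOT_MINDLESS": "insanity",
}


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    stripped = (w.strip(".,!?\"'").lower() for w in text.split())
    return [w for w in stripped if w]


def translate_forward(text, table=FORWARD):
    """Replace the longest matching phrase at each position; pass unknowns through."""
    words = tokenize(text)
    longest = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(words):
        for n in range(min(longest, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if chunk in table:
                out.append(table[chunk])
                i += n
                break
        else:
            # No phrase starts here, so the word is copied unchanged.
            out.append(words[i])
            i += 1
    return out


def translate_backward(tokens, table=BACKWARD):
    """Map each pivot token back to a single English word."""
    return " ".join(table.get(token, token) for token in tokens)


def round_trip(text):
    return translate_backward(translate_forward(text))


print(round_trip("Out of sight, out of mind."))  # prints "invisible insanity"
```

Because each table entry is individually defensible, the nonsense only appears once the two lookups are chained, which is why the round-trip test makes such a good story.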

Last updated:   12 July 2007

  Sources:
    Budiansky, Stephen.   "Lost in Translation."
    The Atlantic Monthly.   December 1998.

    Pinker, Steven.   The Language Instinct.
    New York: William Morrow, 1994.   ISBN 0-688-12141-1   (pp. 209-210).

    Tan, Paul Lee.   Encyclopedia of 7700 Illustrations.
    Rockville, Maryland: Assurance Publishers, 1979.   ISBN 0-88469-100-4   (p. 717).