We start with a linguistic discussion of language, its properties, and the study of language in philosophy and linguistics. We then investigate natural languages, controlled languages, and artificial languages to emphasise the human ability to control and construct languages. At the end, we arrive at the notion of software languages as a means to communicate software between people.
Language
Since language is such a fundamental phenomenon in human life, it has always been
a subject of scientific studies. Many people have tried to come up with a definition
of language, its concepts, and its characteristics. But we can investigate language
from many perspectives and with different professional backgrounds. Philosophers,
psychologists, linguists, mathematicians, and computer scientists all have their own
concepts of language. Thus, all definitions of language are restricted to a particular
viewpoint and none of them is sufficient to describe all concepts and characteristics
of language.
This course takes a computer science view of those languages which are employed
in software engineering. Since the study of language in computer science is based
on many linguistic concepts, we pursue this influence of linguistics in this chapter.
We start with a linguistic view of language in this section. First, we discuss some
fundamental properties of language without giving a precise definition of language.
Second, we investigate how these properties enable humans to construct entirely new
languages from scratch. Next, we introduce metalanguage as a means of describing
or prescribing languages, their concepts, and their characteristics. Finally, we char-
acterise linguistics as a science of language as well as its subdisciplines.
Properties of Language
Unique properties separate language from other human communication means as
well as from animal communication [1]: First, language is arbitrary. There is no
inherent or logical relation between words and their meaning. As a consequence,
language is symbolic. Words are symbols for objects, concepts, ideas, notions, etc.
Second, language is systematic. Sounds are organised into words. We use only certain
sounds and combine them only in certain ways. Furthermore, words are organised
into sentences. Again, we use only certain systematic combinations. Third, language
is productive. There is an infinite number of possible sentences in a language. We can
construct a sentence that has never been constructed before. Even better, language enables
us to understand novel sentences. Furthermore, language is non-instinctive but con-
ventional. The relation between words and their meaning is given by convention in a
society of language users. In the same way, the systematic organisation of sounds
into words and words into sentences is conventional. These conventions are culturally
transmitted to new language users. Thus, language is modifiable. We can add new
words to our vocabulary, slightly change the meaning of a word, or add another meaning. In the
same way, words might no longer be used or lose a former meaning. Languages
arise, evolve, and die.
Constructed Languages
Since language is conventional, we can capture some of these conventions explicitly.
For example, a dictionary describes the pronunciation and meaning of words in a
specific language. If new words are significantly used in the language or a word from
the dictionary changes its meaning in the language, a new edition of the dictionary
will reflect such changes. But a dictionary is not only descriptive. Additionally, it
prescribes the conventional spelling of words.
In some cases, it has proved valuable to prescribe larger parts of a language.
For example, in the 1980s the European Association of Aerospace Manufacturers created
a standard for aerospace industry maintenance manuals. This standard prescribes
a subset of English in order to reduce ambiguity, to improve comprehension
for non-native English speakers, and to ease translation. It consists of a lexicon of
approved words and restrictions on how these words can be used. In effect, this standard
defines a language of its own, which is nowadays called Simplified Technical
English [2]. In general, such languages are called controlled languages.
We can not only control existing natural languages but also construct new
languages. Such languages are called constructed languages or artificial languages.
Well-known examples are international auxiliary languages like Esperanto, Ido, or
Interlingua. Other examples are artistic languages like the Elvish languages spoken
in J. R. R. Tolkien’s high fantasy works or the Klingon language spoken in the fictional
Star Trek universe.
Another example of a constructed language is the International Algebraic Language
(IAL). Nowadays, it is better known under the name ALGOL (ALGOrithmic Language).
This language was proposed in order “to provide a means of communicating
numerical methods and other procedures between people” [3]. The second motive
was to provide a machine-independent programming language. Nevertheless, ALGOL is a
human language: It was constructed for people who wanted to describe algorithms,
particularly in publications. It was constructed for people who wanted to abstract from
the concrete machine a program runs on. It was constructed for people who wanted to
design, implement, maintain, and understand algorithms.
ALGOL meets the properties of language as discussed earlier in this section. It is
arbitrary: We can easily exchange symbols in the language without losing the expres-
siveness of the language. It is symbolic: For example, it includes textual symbols for
conditional execution or iterative procedures. It is systematic: An ALGOL program is
constructed in two systematic steps. First, we organise letters and digits into vari-
able names, integer numbers, strings, and keywords. Second, we organise variable
names, integer numbers, strings, and keywords into statements, procedures, and
programs. It is productive: We can express an algorithm that was never expressed
before in ALGOL or even in any other language. When we communicate this algorithm
in a publication, readers can understand it. It is completely conventional: Its syntax
and semantics are prescribed by a formal specification. It is modifiable: Eventually, a
whole family of ALGOL languages evolved from the original IAL proposal, including
ALGOL 60 [4–6] and ALGOL 68 [7, 8]. Furthermore, ALGOL influenced the construction
of many other programming languages.
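The two systematic steps described above can be sketched in Python. This is only an illustrative sketch, not an ALGOL implementation: the keyword set, the token classes, and the function names tokenize and split_statements are assumptions made for this example, and the second step is reduced to grouping tokens into statements at semicolons.

```python
import re

# Hypothetical, tiny ALGOL-like keyword set chosen for illustration only.
KEYWORDS = {"begin", "end", "if", "then", "else", "for", "do"}

# One named group per token class; optional leading whitespace is skipped.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<integer>\d+)"
    r"|(?P<name>[A-Za-z][A-Za-z0-9]*)"
    r"|(?P<symbol>:=|[;+\-*/<>=()]))"
)


def tokenize(text):
    """Step 1: organise letters and digits into classified tokens."""
    tokens = []
    position = 0
    text = text.rstrip()
    while position < len(text):
        match = TOKEN_PATTERN.match(text, position)
        if match is None:
            raise ValueError(f"unexpected character at position {position}")
        kind = match.lastgroup
        value = match.group(kind)
        if kind == "name" and value in KEYWORDS:
            kind = "keyword"  # reserved words are names with a fixed role
        tokens.append((kind, value))
        position = match.end()
    return tokens


def split_statements(tokens):
    """Step 2 (greatly simplified): group tokens into statements at ';'."""
    statements, current = [], []
    for token in tokens:
        if token == ("symbol", ";"):
            if current:
                statements.append(current)
            current = []
        else:
            current.append(token)
    if current:
        statements.append(current)
    return statements


if __name__ == "__main__":
    for statement in split_statements(tokenize("x := 1; y := x + 2")):
        print(statement)
```

Only certain character combinations form tokens, and only certain token combinations form statements. A full implementation would replace the second step with a parser for the language's grammar.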
Metalanguages
When we communicate ideas about language, its concepts, or characteristics, we
employ a language. The same holds when we describe or prescribe a particular lan-
guage. Typically, natural languages include a metalanguage facility. For example, we
employ English in this section to characterise language in general. In a similar way,
we can employ English to describe a particular language, even English itself, as is done
in a dictionary, for example.
Alternatively, we can employ a constructed metalanguage. The oldest surviving
constructed metalanguage was applied by Pānini, an Ancient Indian Sanskrit grammarian,
in the 4th century BCE to describe the morphology of Sanskrit [9, 10]. A
very similar metalanguage, widely known as Backus-Naur Form, was constructed
over two millennia later to prescribe the syntax of ALGOL. Because of the similarity to
the metalanguage used by Pānini, American computer scientist Ingerman [11] proposed
to call it Pānini-Backus Form. Like natural languages, constructed languages
applied in software engineering often include a metalanguage facility. For example,
Prolog [12] is a logic programming language which can be employed to describe
natural languages [13, 14] and to prescribe constructed languages [15].
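To make the idea of a constructed metalanguage more concrete, the following Python sketch writes a small grammar in the spirit of Backus-Naur Form as plain data and uses it to decide whether a text is a sentence of the described language. The grammar rules, the names GRAMMAR, derive, and is_sentence, and the recognition strategy are assumptions made for this illustration. The rules are a toy example and are not taken from the ALGOL report.

```python
# The toy grammar below corresponds to these BNF-style rules:
#   <sum>    ::= <number> + <sum> | <number>
#   <number> ::= <digit> <number> | <digit>
#   <digit>  ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
# Each nonterminal maps to a list of alternatives; each alternative is a
# sequence of nonterminals (keys of GRAMMAR) and terminals (other strings).
# The rules are right-recursive on purpose: this naive recogniser would not
# terminate on left-recursive rules.
GRAMMAR = {
    "<sum>": [["<number>", "+", "<sum>"], ["<number>"]],
    "<number>": [["<digit>", "<number>"], ["<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}


def derive(symbol, text, start):
    """Return every position where `symbol` can end if it starts at `start`."""
    if symbol not in GRAMMAR:  # terminal: must match the text literally
        if text.startswith(symbol, start):
            return {start + len(symbol)}
        return set()
    ends = set()
    for alternative in GRAMMAR[symbol]:
        positions = {start}
        for part in alternative:
            positions = {end for p in positions for end in derive(part, text, p)}
        ends |= positions
    return ends


def is_sentence(symbol, text):
    """A text is a sentence if `symbol` can derive it completely."""
    return len(text) in derive(symbol, text, 0)


if __name__ == "__main__":
    for candidate in ["12+3", "7", "12+", "+3"]:
        print(candidate, is_sentence("<sum>", candidate))
```

The grammar is data that prescribes a language, and the recogniser is a generic tool that works for any grammar written in this form. This mirrors the separation between a metalanguage and the languages it describes.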
Study of Language
Language has been one of the earliest subjects of scientific studies. Today, linguistics
is the scientific study of language. Important topics of linguistics are the study of
language structure, that is grammar, and the study of meaning, that is semantics.
Grammar encompasses morphology, that is the formation and composition of words,
syntax, that is the combination of words into phrases and sentences, and phonology,
that is the study of sound systems.
Additionally, linguistics is involved in many interdisciplinary fields of study. We give
some examples. Applied linguistics studies language-related issues in everyday
life, including language planning, language assessment, language pedagogy, second
language acquisition, and translation. Computational linguistics studies computational
implementations of linguistic structures. Historical linguistics studies language
change over time. Neurolinguistics studies the neural mechanisms in the human brain
that control the comprehension, production, and acquisition of language. Psycholin-
guistics studies the psychological factors that enable comprehension, production, and
acquisition of language. Sociolinguistics studies the effect of social aspects like cultural
norms, expectations, and social context on language usage.
Because language is so fundamental, philosophy has studied it since its early days. This
includes inquiry into the nature of meaning as well as the relation of
language and meaning to truth and the world. Other interesting topics are language
creation and translation.
In computer science, we study languages involved in software engineering. Par-
ticularly, we are concerned with the construction of such languages, their properties,
the design and implementation of software tools for these languages, their evolution,
and their impact on software engineering processes. Thereby, the study of language
in computer science is based on many linguistic concepts. Like ordinary linguistics,
computer science distinguishes the study of language structure, that is grammar, and
the study of meaning, that is semantics. But the relation of computer science to lin-
guistics and its interdisciplinary fields goes beyond the reuse of concepts. We illustrate
this with two examples.
The first example concerns linguistic universals [16, 17]. A linguistic universal is
a statement that holds for all natural languages. Linguists distinguish absolute universals,
which apply to every language, and tendencies, which are common to many
languages. Furthermore, implicational and non-implicational universals are distin-
guished. Implicational universals apply to languages with a particular feature that is
always accompanied by another feature. Non-implicational universals simply state the
existence or non-existence of a particular feature. Linguistic universals deal with all
kinds of language aspects including phonology, morphology, syntax, and semantics.
To ease language acquisition, production, and comprehension, constructed lan-
guages employed in software engineering should follow linguistic universals. This is
required even more if we assume a universal grammar which is substantially the
same in all languages. The idea goes back to the 13th century and English philoso-
pher Roger Bacon. In modern linguistics, American linguist Noam Chomsky [18, 19]
postulates a universal grammar which is innate to all humans. Neurolinguistics and
historical linguistics provide some evidence for this, but there is also much criticism.
Nevertheless, the assumption of a universal grammar implies that constructed languages
should conform to this grammar.
The second example concerns linguistic relativity. Here, the hypothesis is that lan-
guage has a significant impact on thought and perception. It goes back to ideas of
German philosophers Johann Georg Hamann and Johann Gottfried Herder [20] that
language anchors thought, a view that was defended by Wilhelm von Humboldt [21].
Later, we can find the systematic relationship between language and thought as an underlying
axiom in the works of American linguist Edward Sapir [22] and his student
Benjamin Whorf [23]. This includes the idea that different language patterns yield
different thought patterns. This idea has a deep impact on controlled and constructed
languages: English author George Orwell first advocated the controlled language Ba-
sic English as an international auxiliary language. Later, the language inspired his use
of Newspeak in his novel Nineteen Eighty-Four [24] to illustrate controlled language
as a way to control thought.
Linguistic relativity is highly debated in linguistics. But constructed languages employed
in software engineering provide some evidence for it. In computer science, it is
widely accepted that a programming language influences the way we think about
problems and algorithmic solutions [25, 26]. This assumption is one of the reasons
for the construction of domain-specific languages. These languages are tailored to a
particular domain in order to meet the way of thinking and problem solving of experts
in this domain.
Bibliography
[1] E. Sapir, Language. An Introduction to the Study of Speech (Harcourt Brace, New
York, NY, USA, 1921).
[2] ASD Simplified Technical English Maintenance Group, ASD Simplified Technical
English, Specification ASD-STE100, AeroSpace and Defence Industries Associa-
tion of Europe (2005-2010).
[3] J. W. Backus, The syntax and semantics of the proposed international algebraic
language of the Zurich ACM-GAMM conference, in IFIP Congress (Unesco, Paris,
1959) pp. 125–131.
[4] J. W. Backus, F. L. Bauer, J. Green, C. Katz, J. McCarthy, A. J. Perlis,
H. Rutishauser, K. Samelson, B. Vauquois, J. H. Wegstein, A. van Wijngaarden,
and M. Woodger, Report on the algorithmic language ALGOL 60, Communica-
tions of the ACM 3, 299 (1960).
[5] J. W. Backus, F. L. Bauer, J. Green, C. Katz, J. McCarthy, A. J. Perlis,
H. Rutishauser, K. Samelson, B. Vauquois, J. H. Wegstein, A. van Wijngaarden,
M. Woodger, and P. Naur, Revised report on the algorithmic language ALGOL 60,
Communications of the ACM 6, 1 (1963).
[6] M. Woodger, Supplement to the ALGOL 60 report, Communications of the ACM
6, 18 (1963).
[7] A. van Wijngaarden, B. J. Mailloux, J. E. Peck, and C. H. Koster, Report on the
algorithmic language ALGOL 68, Numerische Mathematik 14, 79 (1969).
[8] A. van Wijngaarden, B. J. Mailloux, J. E. L. Peck, C. H. A. Koster, M. Sintzoff, C. H.
Lindsey, L. G. L. T. Meertens, and R. G. Fisker, Revised report on the algorithmic
language ALGOL 68, Acta Informatica 5, 1 (1975).
[9] O. Böhtlingk, ed., Pānini’s acht Bücher grammatischer Regeln (H. B. König, Bonn,
Germany, 1839-1840).
[10] Pānini, The Astādhyāyi of Pānini, edited by Śrīśa Chandra Vasu (Benares, 1897).
[11] P. Z. Ingerman, Panini-Backus form suggested, Communications of the ACM 10,
137 (1967).
[12] International Organization for Standardization, ISO/IEC 13211: Information
technology – Programming languages – Prolog, International Organization for
Standardization, Geneva (1995).
[13] G. Gazdar and C. Mellish, Natural Language Processing in PROLOG (Addison-
Wesley, Reading, MA, USA, 1989).
[14] C. Bitter, D. A. Elizondo, and Y. Yang, Natural language processing: A Prolog
perspective, Artificial Intelligence Review 33, 151 (2010).
[15] R. Lämmel and G. Riedewald, Prological language processing, Electronic Notes
in Theoretical Computer Science 44 (2001).
[16] J. H. Greenberg, Language Universals: With Special Reference to Feature Hier-
archies (Mouton, The Hague, The Netherlands, 1966).
[17] J. H. Greenberg, ed., Universals of Human Language (Stanford University Press,
Stanford, CA, USA, 1978).
[18] N. Chomsky, Syntactic Structures (Mouton, The Hague, The Netherlands, 1957).
[19] N. Chomsky, Approaching UG from below, in Interfaces + Recursion = Language?
Chomsky’s Minimalism and the View from Syntax-Semantics, Studies in
Generative Grammar, Vol. 89, edited by U. Sauerland and H.-M. Gärtner (Mouton
de Gruyter, Berlin, Germany, 2007) pp. 1–29.
[20] J. G. Herder, Briefe zur Beförderung der Humanität. Erste Sammlung., edited by
J. F. Hartknoch (Riga, 1793).
[21] W. von Humboldt, Gesammelte Schriften, edited by A. Leitzmann (Preußische
Akademie der Wissenschaften, Berlin, Germany, 1903-1936).
[22] E. Sapir, Selected Writings of Edward Sapir in Language, Culture, and Personality,
edited by D. G. Mandelbaum (University of California Press, 1983).
[23] B. L. Whorf, Language, Thought, and Reality: Selected Writings of Benjamin Lee
Whorf, edited by J. B. Carroll (MIT Press, 1964).
[24] G. Orwell, Nineteen Eighty-Four (Secker and Warburg, London, UK, 1949).
[25] K. E. Iverson, Notation as a tool of thought, Communications of the ACM 23,
444 (1980).
[26] P. Graham, Hackers & Painters: Big Ideas from the Computer Age (O’Reilly, 2004).