-What is Computational Linguistics
-Approaches to the Study of Computational Linguistics
-What is Internet Linguistics
-Internet Linguistics Perspectives
-Linguistic Future Of The Internet
Computational linguistics is an interdisciplinary field
concerned with the statistical or rule-based modeling of natural
language from a computational perspective.
Computational linguistics brings together language experts and
computer scientists, and it draws upon the involvement of: 1-Linguists, 2-Mathematicians, 3-Computer scientists, 4-Experts
in artificial intelligence, 5-Logicians, 6-Cognitive scientists, 7-Cognitive psychologists, 8-Psycholinguists.
It has theoretical components, which take up issues in
theoretical linguistics and cognitive science, and
applied components, which focus on the practical
outcome of modeling human language use.
Computational linguistics originated with efforts in the
United States in the 1950s.
Computational linguistics is a relatively new field of study
devoted to developing algorithms and software for
intelligently processing language data.
Artificial intelligence emerged as a field in the mid-1950s and expanded through the 1960s.
Morphology: the grammar of word forms.
Syntax: the grammar of sentence structure.
Semantics: the study of meaning.
Lexicon: the vocabulary of a language, as recorded in the dictionary.
Pragmatics: the usage of language.
Research within the scope of computational linguistics is
done at computational linguistics departments. Some
research aims to create working speech or text
processing systems; other research aims to create systems
allowing human-machine interaction.
Conversational agents: programs meant for human-machine communication.
Examines language acquisition and development.
1-Takes a long time to learn.
2-Only correct evidence is provided, and this is insufficient.
Language can be learned more efficiently with a
combination of simple input at first presented
-Contributions of the developmental approach are:
1-Neural networks
2-Robotic systems (used to test linguistic theories):
these robots are able to acquire functioning word-to-meaning mappings without needing grammatical structure
3-Prediction of future changes in language and
insight into the evolutionary history of modern-day languages.
One of the most important prerequisites for studying
linguistic structure is the availability of large linguistic corpora.
Penn Treebank: one of the most cited linguistic corpora,
containing over 4.5 million words of American English. This
corpus has been annotated for part-of-speech information.
-Contributions of the structural approach are:
1-Gives computational linguists a framework within which to
work out hypotheses that will further the understanding of
language in several ways
2-Allows for the discovery and implementation of similarity
recognition between pairs of text utterances.
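Similarity recognition between two text utterances can be illustrated with a very simple measure. The sketch below uses Jaccard similarity over word sets; this measure and the function names `tokenize` and `jaccard_similarity` are assumptions chosen for illustration, not a method named in these slides.

```python
def tokenize(utterance: str) -> set[str]:
    """Lowercase the utterance and split it into a set of words."""
    return set(utterance.lower().split())


def jaccard_similarity(a: str, b: str) -> float:
    """Share of distinct words that the two utterances have in common."""
    words_a, words_b = tokenize(a), tokenize(b)
    if not words_a and not words_b:
        return 1.0  # two empty utterances are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


print(jaccard_similarity("the cat sat", "the cat ran"))  # 0.5
```

Real systems usually rely on richer representations (syntactic structure, weighted terms), but the idea of scoring a pair of utterances is the same.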
Structural data is available not only for English but also
for other languages such as Japanese.
Computational linguistics allows scientists to parse large
amounts of data reliably and efficiently, creating the possibility of
discoveries unlike those seen in most other approaches.
A very complex approach, as it deals with all the skills that a person needs to
speak a language fluently.
Comprehension is only half the battle of communication; the other half is
how a system produces language.
Alan Turing proposed the possibility that machines might one day be able
to think. He proposed an "imitation game" in which a human subject has two
text-only conversations, one with a human and another with a machine
attempting to respond as a human. If the subject cannot tell the difference
between the machine and the human, it may be concluded that the machine is
capable of thinking.
Today, this test is called the "Turing test".
ELIZA is one of the earliest and best-known examples of a computer
program designed to converse naturally with humans. It was developed by
Joseph Weizenbaum at MIT in 1966.
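A minimal sketch of the pattern-and-template idea behind programs like ELIZA. The rules, the reply templates, and the `respond` function are hypothetical illustrations, not Weizenbaum's original script.

```python
import re

# Hypothetical rules: a pattern to look for and a reply template to fill in.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."


def respond(utterance: str) -> str:
    """Return the reply of the first rule whose pattern matches the input."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return DEFAULT_REPLY


print(respond("I am sad."))         # How long have you been sad?
print(respond("I need a holiday"))  # Why do you need a holiday?
print(respond("Hello"))             # Please go on.
```

The program does not understand the input; it only reuses part of what the user typed, which is why such conversations can feel natural for a short while.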
In an effort to improve computer translation, several methods have been
compared, including: 1-Hidden Markov models, 2-Smoothing techniques,
and the specific refinements of those to apply them to verb translation.
Work in the production approach has also been done on making computers produce language
in a more naturalistic manner, making human-computer interaction much
Much of the focus of modern computational linguistics is on
Bayesian statistics were applied to the task of character
recognition, as illustrated by Bledsoe and Browning in 1959, and
also to language analysis, including the work of
Mosteller and Wallace in 1963.
LUNAR is a project developed for NASA to answer written
questions about the geological analysis of lunar rocks by the
Signal modeling of spoken language was achieved with the use of
hidden Markov models, as detailed by Rabiner in 1989.
Applications of the comprehension approach: 1-Topic
identification, 2-Improved search engines, 3-Automated
customer service, 4-Online education.
It is a subdomain of linguistics advocated by David
Crystal. It studies the new language styles and
forms that have arisen under the influence of the
internet and other new media, such as: SMS, HCI,
Contribution of Internet linguistics: studying the
emerging language of the internet will help
improve conceptual organization,
translation, and web usability, and that will benefit
both linguists and web users.
The four main perspectives of Internet linguistics are:
sociolinguistic, educational, stylistic, and applied.
Deals with how society views the impact of internet
development on language.
The internet has changed the way people communicate and created new
platforms with far-reaching social impact.
Ways of social communication: SMS, e-mail, chat groups, virtual
worlds, and the Web.
Influence of internet language personally: computer-mediated communication (CMC) such as SMS text
messaging and e-mailing, on devices such as the BlackBerry and iPhone, has greatly enhanced instantaneous
communication.
Influence of internet language on education: in schools, it is common
for students and educators to be given personalized e-mail
accounts for communication and interaction purposes; classroom
discussions are increasingly brought onto the internet in the form of
Influence of internet language professionally: it is a common sight
for companies to have their computers and laptops hooked up to
the internet, which facilitates internal and external communication.
Mobile communication devices such as smartphones are increasingly
making their way into the corporate world.
Multilingualism: looks at the status of the various languages on
Language change: explores linguistic change over time,
with emphasis on internet lingo.
Conversational discourse: explores the changes in patterns of
social interaction and communicative practice on the internet.
Stylistic diffusion: studies the spread of
internet jargon and related linguistic forms into common usage.
Meta-language and folk linguistics: looks at the way
these linguistic forms and changes on the internet are being
labeled and discussed.
Examines the internet's impact on formal language use.
The rapid spread of internet use has brought about new
features such as:
An increase in the usage of informal written language.
Inconsistency of written styles and the use of new
abbreviations in internet chats and SMS.
Technological constraints on message length contributed to the rise of
new abbreviations such as acronyms; examples are
"LOL (laughing out loud), GTG (got to go), OMG (oh my God)".
Disadvantages of internet use:
Informal language and incorrect word use in academic and formal
situations, such as the use of the casual word "guy" and the choice of the
word "preclude" instead of "precede".
Use of abbreviations in academic work, such as "u" for "you" and "2" for
Advantages of internet use:
The internet offers potential benefits to language learners
through its communication features (e-mail, discussion forums,
chat messengers, blogs...).
Internet-mediated communication (IMC) allows for greater interaction between language learners and
native speakers of the language, providing better error correction
and more opportunities to learn the standard language, as well as the chance to pick
up some special skills such as negotiation and persuasion.
Examines how the internet and its related technologies have encouraged new and
different forms of creativity in language.
This new mode of language is interesting to study because it is a mixture of both
spoken and written language. Traditional writing is static compared to the dynamic
nature of the new language on the internet, where words can appear in different colors and
font sizes on the computer screen.
This new mode of language also contains elements not found in natural
languages; an example is the concept of framing found in e-mails and discussion forums.
Mobile phones (cell phones): have expressive potential beyond their basic
communicative functions. The 160-character limit imposed on cell phone text messages has
motivated users to exercise their linguistic creativity to overcome it. Cell phones
have also created a new literary genre (cell phone novels).
Blogs: blogging has brought about new ways of writing diaries and, from a linguistic
perspective, the language used in blogs is published for the world to see without
undergoing a formal editing process. Blogs have become so popular that they have
expanded beyond written blogs, with the emergence of the photoblog, videoblog, audioblog,
Virtual worlds: provide insight into how users are adapting the usage of their natural
language for communication within these new media. Some of the CMC strategies used
include capitalization for words such as "EMPHASIS", creative usage of punctuation
like "??!?!?!", and usage of symbols such as the asterisk to enclose words such as
"*Stress*". Virtual worlds are good tools for language learning among younger learners, as
they already see such places as a "place to learn and play".
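The three CMC strategies mentioned for virtual worlds (capitalization, creative punctuation, asterisk-enclosed words) can be spotted with simple patterns. This is a rough sketch; the regular expressions and the name `find_cmc_strategies` are assumptions made for illustration, and the capitalization pattern will also pick up ordinary acronyms.

```python
import re

# Hypothetical patterns, one per strategy named in the text.
PATTERNS = {
    "capitalization": re.compile(r"\b[A-Z]{2,}\b"),     # e.g. EMPHASIS
    "repeated_punctuation": re.compile(r"[?!]{2,}"),    # e.g. ??!?!?!
    "asterisk_emphasis": re.compile(r"\*[^*\s][^*]*\*"),  # e.g. *Stress*
}


def find_cmc_strategies(message: str) -> dict[str, list[str]]:
    """Return every match of each strategy pattern found in the message."""
    return {name: pattern.findall(message) for name, pattern in PATTERNS.items()}


result = find_cmc_strategies("That is SO wrong??!?! I am *really* upset")
print(result["capitalization"])        # ['SO']
print(result["repeated_punctuation"])  # ['??!?!']
print(result["asterisk_emphasis"])     # ['*really*']
```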
E-mail: one of the most popular internet-related technologies is e-mail,
which has expanded the stylistics of language in many ways. It is a hybrid of
speech and writing styles in terms of format, grammar, and style. E-mail is
rapidly replacing traditional letter-writing because of its convenience, speed,
Instant messaging: has developed its own acronyms and short forms. Instant
messaging is quite different from e-mail and chat groups because it allows
participants to interact in real time while conversing in private. There is also
greater stylistic variation because there can be a very wide
age gap between participants.
Views the linguistic exploitation of the internet in terms of its
communicative capabilities, both good and bad.
The internet is a platform where speakers of minority and endangered
languages seek to revive the use of their languages and to create
awareness. It provides these languages opportunities to
make progress in two important regards: 1-Language
documentation, 2-Language revitalization (
Language documentation :
The internet facilitates language documentation.
Digital archives of media help to preserve language documentation and
allow global dissemination through the internet.
Publicity about endangered languages has helped to spur worldwide
interest in linguistic documentation.
The HRELP is a project that seeks to document endangered languages and to
preserve and disseminate documentation materials, among others.
Language Newsletter provides news and articles about topics in
Language revitalization :
The internet facilitates language revitalization.
Virtual environments (emails, chats, instant messaging) have helped
to bridge the distance between communicators.
The use of e-mail facilitates language revitalization in the sense that
speakers of minority languages who have moved to a location where their
native language is not spoken can use the internet to communicate
with their family and friends, thus maintaining the use of their native
Leoki (powerful voice): a system developed in Hawaii
where the content, interface, and menus are entirely in the
Another use of the internet is having students of minority
languages write about their native cultures in their native
languages for distant audiences, in an attempt to preserve their
language and culture.
People will alter their language use to suit the dimensions of
The increase in internet users brings cultural backgrounds,
habits, and language differences onto the Web.
The internet is on its way to becoming a more diverse multilingual
The interaction between English and other languages will be
important to study.
Minority languages will be promoted.
However, minority languages will be affected
by the majority ones.
Speakers of minority languages will be encouraged to
learn the majority languages in order to access more
The future of minority languages is in danger due to the
spread of the internet.
Translation memory (TM): a database that stores segments that
have been translated previously, to aid human translation.
Source text and its corresponding translation in "translation
Words are handled by "terminology bases".
Software that uses TMs is called a (TMM) Translation Memory
TM is used in computer-assisted translation (CAT) tools, word-processing, and terminology
Many companies producing multilingual documentation
use TM systems.
-How TMs work:
1-The source text is broken into segments.
2-The TM looks for matches between these segments and previously translated source segments.
3-Matching pairs are presented as translation candidates.
4-The translator can accept a candidate, replace it with a fresh
translation, or modify it to match the source.
5-The database is saved.
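The five steps can be sketched in a few lines. This is a simplified illustration, not how any particular TM product works: the sentence splitter, the 0.75 similarity threshold, the use of `difflib.SequenceMatcher` for fuzzy matching, and the names `split_segments`, `best_match`, and `translate` are all assumptions.

```python
import re
from difflib import SequenceMatcher


def split_segments(text):
    """Step 1: break the source text into sentence-like segments."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def best_match(segment, memory, threshold=0.75):
    """Steps 2-3: return (source, target, score) for the closest stored
    source segment, or None if nothing reaches the threshold."""
    best, best_score = None, threshold
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= best_score:
            best, best_score = (source, target, score), score
    return best


def translate(text, memory, translate_manually):
    """Steps 4-5: reuse exact matches; otherwise ask the translator
    (who sees any fuzzy candidate) and save the result in the memory."""
    output = []
    for segment in split_segments(text):
        candidate = best_match(segment, memory)
        if candidate is not None and candidate[2] == 1.0:
            target = candidate[1]  # exact match: accept the candidate
        else:
            target = translate_manually(segment, candidate)
            memory[segment] = target  # save the new pair for later reuse
        output.append(target)
    return output


memory = {"Press the red button.": "Appuyez sur le bouton rouge."}
result = translate(
    "Press the red button. Press the green button.",
    memory,
    lambda segment, candidate: "[to translate] " + segment,  # stand-in for a human
)
print(result)
```

The second sentence differs from the stored one by a single word, so it reaches the translator as a fuzzy candidate to modify instead of being translated from scratch.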
Typical TMs only search for text in the source segment.
Segments where no match is found will have to be translated
manually and then saved in the database.
TMs work best on texts that are highly repetitive, such as
1-Ensuring that the document is completely
2-Ensuring consistency, including common
definitions and terminology.
3-Various formats can be translated.
4-Accelerating the overall translation process.
5-Reducing time and cost.
1-Recycled translation loses an important principle,
namely "taking the message from the text".
2-Not all file types are supported.
3-Cannot work with text that lacks repetition.
4-The quality of the translated text is not guaranteed.
5-Deals with the text sentence by sentence
instead of the whole meaning.
6-The software is expensive, and the cheaper the
software, the fewer features you will see.
Also read the rest of these obstacles in the book.
Special thanks to: Farah El-Mowaled
Created by: Abdohelal