Twitter provides a wealth of data that can be used to analyze language trends in real time. The large volume of informal tweets sent daily helps document the emergence and spread of new words and expressions. By following linguistic experts and word-related hashtags on Twitter, researchers can identify neologisms as they enter the language and track their usage over time and across locations. Several organizations make high-profile "Word of the Year" choices by analyzing which terms gained the most attention over the past twelve months in Twitter and other social media discourse.
3. Social media introduces new words
• Social media has made
written language much
more ‘visible’.
• We need to
compensate for the lack of
non-verbal
communication.
4. We write how we speak
• Recent research in Natural Language
Processing (NLP) has demonstrated that
people on social media
platforms intentionally write how they speak.
6. The users of social media produce a tremendous
amount of text each day.
7. Social media as available corpora
• The large amount of data makes it possible:
– to search, for example, for the word “the”, getting
607 million tokens in the last month alone.
– to map the emergence of new words and their
lexical diffusion.
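A minimal sketch of the two corpus operations named above: counting the tokens of a word and finding when a word first appears. The `TWEETS` sample, the tokenizer, and the function names are illustrative assumptions, not part of any Twitter API; a real study would read from a collected tweet archive.

```python
from collections import Counter
import re

# Hypothetical sample: (ISO date, tweet text) pairs standing in for a real corpus.
TWEETS = [
    ("2013-01-05", "The selfie is the new postcard"),
    ("2013-01-06", "the weather is awful today"),
    ("2013-02-10", "Took a selfie with the team"),
]

# Assumed, deliberately simple tokenizer: runs of letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


def token_counts(tweets):
    """Count every token across all tweets."""
    counts = Counter()
    for _, text in tweets:
        counts.update(tokenize(text))
    return counts


def first_seen(tweets):
    """Map each token to the earliest date on which it appears."""
    earliest = {}
    for date, text in tweets:
        for token in tokenize(text):
            # ISO dates compare correctly as plain strings.
            if token not in earliest or date < earliest[token]:
                earliest[token] = date
    return earliest


if __name__ == "__main__":
    print(token_counts(TWEETS)["the"])
    print(first_seen(TWEETS)["selfie"])
```

Sorting the `first_seen` result by date gives a rough candidate list of newly emerging words, which would still need manual checking against typos and proper names.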
8. Social media as available corpora
• Text is readily available for lexicographical
analysis.
• Easy access to very large corpora.
• Given the right tools and know-how, anyone
can search that published material.
• Corpus patterns that are very rare in
conventional-size corpora turn out to have many
occurrences in the very large corpora of social
media.
11. Real-time public language data to analyse:
340 million tweets sent every day, according to
Twitter.
Why Twitter?
12. Language in action
• Instead of relying on questionnaires and other
laborious and time-consuming methods of
data collection, social scientists can simply
take advantage of Twitter’s stream to
eavesdrop on a virtually limitless array of
language in action.
13. Why Twitter?
• Tweets tend to be rather informal.
• Tweets appear similar to spontaneous speech,
making them particularly valuable to the study
of the spread of new words and expressions.
14. • On Twitter, it is possible to see a word at the
moment of its coinage:
– Twitter limits the “Tweet” to 140 characters, thus
pushing its users to become more adept at saying
what they want or need to say with fewer words.
– 57% of neologisms on Twitter come from blends.
Twitter and neologisms
15. • Twitter has been at the forefront of recent
linguistic developments, with words such as
‘selfie’, ‘twerk’, ‘vom’, ‘buzzworthy’ and
‘squee’ all making it into the Oxford
Dictionaries Online in 2013.
17. Why Twitter?
Tweets carry location data and the time
they were sent, which makes it possible to map
how new words become popular and
spread.
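As a rough illustration of mapping spread from time and location, the sketch below finds the first date a word appears in each region and orders the regions by that date. The `RECORDS` sample and the region labels are hypothetical; real geotagged tweets would need their coordinates grouped into regions first.

```python
# Hypothetical records: (ISO date, region label, tweet text).
RECORDS = [
    ("2013-03-01", "London", "that was so buzzworthy"),
    ("2013-03-04", "New York", "buzzworthy news today"),
    ("2013-03-02", "London", "another buzzworthy clip"),
    ("2013-03-09", "Sydney", "is buzzworthy even a word"),
]


def spread_timeline(records, word):
    """Return (region, first date) pairs ordered by when the word first appeared."""
    first_by_region = {}
    for date, region, text in records:
        # Whitespace tokenization is an assumption; punctuation is not handled.
        if word in text.lower().split():
            if region not in first_by_region or date < first_by_region[region]:
                first_by_region[region] = date
    return sorted(first_by_region.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for region, date in spread_timeline(RECORDS, "buzzworthy"):
        print(date, region)
```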
18. • Geolocation information gives insight into:
– social classes,
– patterns of immigration and
– how groups are influencing each other.
• The interaction between geographic features,
historical migrations, and a 'snapshot' of
linguistic data can tell us about our language
and ourselves.
22. How new words are selected
• Ease of use/interpretation.
• Usefulness: does the word fill a lexical gap?
• Relevant in society over time?
• High degree of exposure.
23. FUDGE by Allan Metcalf
Frequency of use
Unobtrusiveness
Diversity of users and situations
Generation of other forms and meanings
Endurance of the concept
Each new word gets a score of 0, 1, or 2 on each factor.
It’s not a mathematical formula but a judgment call.
The higher the total, the more likely a word will endure
for generations.
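The scoring described above can be sketched as a simple sum of five ratings. The factor keys, the function name, and the example ratings are assumptions made for illustration; the ratings themselves remain a human judgment call, and the code only adds them up (a total between 0 and 10).

```python
# Assumed short keys for the five FUDGE factors.
FUDGE_FACTORS = (
    "frequency",        # Frequency of use
    "unobtrusiveness",  # Unobtrusiveness
    "diversity",        # Diversity of users and situations
    "generation",       # Generation of other forms and meanings
    "endurance",        # Endurance of the concept
)


def fudge_total(scores):
    """Sum the five FUDGE ratings, each of which must be 0, 1, or 2."""
    for factor in FUDGE_FACTORS:
        if factor not in scores:
            raise ValueError(f"missing rating for {factor!r}")
        if scores[factor] not in (0, 1, 2):
            raise ValueError(f"{factor!r} must be rated 0, 1, or 2")
    return sum(scores[factor] for factor in FUDGE_FACTORS)


if __name__ == "__main__":
    # Hypothetical ratings for an unnamed candidate word.
    example = {
        "frequency": 2,
        "unobtrusiveness": 1,
        "diversity": 2,
        "generation": 1,
        "endurance": 1,
    }
    print(fudge_total(example))
```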
25. English lexical blends can be identified
by means of a simple computational method
that exploits the massive amount of text available
on Twitter.
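The slide does not spell out the method, so the following is only an illustrative heuristic, not the method it refers to: an unknown token is treated as a blend candidate if it can be split into the beginning of one known word and the end of another. The lexicon, the `min_part` threshold, and the function name are assumptions.

```python
def blend_candidates(token, lexicon, min_part=2):
    """Return (source1, source2) pairs where token = start of source1 + end of source2."""
    candidates = set()
    # Try every cut that leaves at least min_part characters on each side.
    for cut in range(min_part, len(token) - min_part + 1):
        prefix, suffix = token[:cut], token[cut:]
        starts = [w for w in lexicon if w != token and w.startswith(prefix)]
        ends = [w for w in lexicon if w != token and w.endswith(suffix)]
        for first in starts:
            for second in ends:
                if first != second:
                    candidates.add((first, second))
    return sorted(candidates)


if __name__ == "__main__":
    # Tiny hypothetical lexicon; a real one would come from a dictionary word list.
    lexicon = ["breakfast", "lunch", "smoke", "fog"]
    print(blend_candidates("brunch", lexicon))
    print(blend_candidates("smog", lexicon))
```

With a full dictionary this heuristic over-generates, so candidate pairs would need ranking, for example by how often the token and its proposed source words occur together in tweets.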
Like a Pro
34. #Wordwatch round up
• This feature investigates interest in words
influenced by news and other current events.
• The graphs are based on data from
OxfordDictionaries.com over a four-week
period and explore changes in term lookups
across the entire website.
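One simple way to explore changes in lookups over four weeks is to compare the last week with the first. The sketch below uses made-up counts and placeholder term names; it does not reflect how OxfordDictionaries.com computes its graphs.

```python
# Hypothetical weekly lookup counts over a four-week period.
LOOKUPS = {
    "term_a": [100, 150, 300, 400],
    "term_b": [200, 220, 210, 190],
}


def relative_change(weekly):
    """Change from the first to the last week, as a fraction of the first week."""
    first, last = weekly[0], weekly[-1]
    if first == 0:
        raise ValueError("first-week count must be non-zero")
    return (last - first) / first


def rank_by_change(lookups):
    """Order terms from the largest relative increase to the largest decrease."""
    return sorted(lookups, key=lambda term: relative_change(lookups[term]), reverse=True)


if __name__ == "__main__":
    for term in rank_by_change(LOOKUPS):
        print(term, relative_change(LOOKUPS[term]))
```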
35. Follow the News
• Twitter is a newswire as well as a social
platform.
• New words will pop up if you follow the
social spotlights on Twitter.
41. Follow the events: WotY
"Word of the Year", abbreviated "WOTY" (or
"WotY"), refers to any of various assessments
of the most important word(s) or
expression(s) in the public sphere during a
specific year.
44. WOTY Word of the year
• The oldest WOTY is the American Dialect
Society's Word of the Year, determined at the
end of each calendar year by a vote of
independent linguists.
• US
– American Dialect Society
– Global Language Monitor
45. The Global Language Monitor
English-speaking world: 1.83 billion speakers
(January 2013 estimate).
GLM employs its NarrativeTracker technologies for
global Internet and social media analysis.
NarrativeTracker is based on global discourse,
providing a real-time, accurate picture of any
topic, at any point in time.
NarrativeTracker analyzes the Internet, the
blogosphere, the top 300,000 print and
electronic global media, as well as social media
sources as they emerge.
47. WOTY Word of the year
• US
– Merriam-Webster's Words of the Year are
ten-word lists published annually.
• UK
– Oxford University Press announces an Oxford
Dictionaries UK Word of the Year and an Oxford
Dictionaries US Word of the Year.
50. “Twictionary”
• It is no longer up to
lexicographers to select
words: users alone
decide and vote on the
inclusion of new words
in the dictionary.