Tracking the Emergence of
New Words across Time and Space
Jack Grieve
Aston University
Research conducted with
Diansheng Guo & Alice Kasakoff, University of South Carolina
Andrea Nini, Aston University
Funded as part of the Digging into Data Challenge
Approaches to Historical Linguistics
There are several different approaches to the analysis of
language change:
Reconstruction through comparison of known languages
(comparative method)
Analysis of previous linguistic research (e.g. lexicographic
research)
Analysis of historical texts (corpus-based)
Apparent time studies with interview data (sociolinguistics)
Computer simulations
Lexical Change
Research in historical linguistics and etymology has
analysed how the usage of certain words has changed
over relatively long periods of time (primarily based on
historical corpora and lexicographic research), but overall
there are large gaps in our knowledge of lexical change,
including how newly emerging words enter a language
and spread across its speakers.
Words are Rare Events
The main problem with studying lexical variation and
change is that most words are incredibly rare, thus
requiring incredibly large corpora of natural language.
This is why most research on lexical variation and
change has focused on relatively high frequency words,
primarily function words (e.g. pronouns, prepositions,
auxiliary verbs).
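The arithmetic behind this point can be sketched as follows. The rates and corpus sizes below are hypothetical, chosen only to show how few occurrences of a rare word a small corpus can be expected to contain; `expected_occurrences` is an illustrative helper, not something from the study.

```python
def expected_occurrences(rate_per_million: float, corpus_words: int) -> float:
    """Expected number of times a word appears in a corpus of a given size."""
    return rate_per_million * corpus_words / 1_000_000


# Hypothetical rates (occurrences per million words) and corpus sizes.
for rate in (1000.0, 1.0, 0.04):
    for size in (1_000_000, 1_000_000_000):
        count = expected_occurrences(rate, size)
        print(f"{rate:>7} per million in {size:>13,} words: {count:,.2f} expected")
```

A word used once per million words is expected to appear only once in a million-word corpus, which is far too little to study its variation.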
Word Frequency Distribution (Zipf 1935, 1945)
The majority of the 67,000 most
frequent words in our corpus occur
less than once per 25 million words
New Words are Incredibly Rare Events
The analysis of new words requires even more data,
because emerging words are by definition especially
rare.
In addition, to analyse the temporal and spatial spread
of new words, large corpora must be compiled for a
large number of time points and locations.
Big Data
Suitable data has recently become available with the
rise of social media and smartphones, which
provide massive amounts of time-stamped and
geocoded natural language data.
Goals of Today’s Talk
Identify emerging words from 2014 based on a multi-
billion word corpus of American tweets.
Chart their usage over time and identify common
temporal patterns of lexical spread.
Map their geographical diffusion and identify common
spatial patterns of lexical spread.
The Corpus
Since 2013, the team at USC have been compiling two
multi-billion word geocoded corpora for the US and the UK
using the Twitter API.
Twitter is a particularly rich source of geocoded data and
is also very popular, informal, and youthful, making it ideal
for tracking the emergence of new words.
Approximately 2% of tweets are geocoded.
The Corpus
The analysis today is based on an 8.9 billion word
corpus of American tweets from October 2013 to
November 2014, which totals approximately 980 million
tweets from 7 million users.
Every tweet is geocoded with the precise longitude and
latitude of the user when posting, which were then used
to identify the county where each tweet was produced.
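One way to go from a longitude and latitude to a county is a point-in-polygon test. The sketch below uses ray casting and two hypothetical rectangular "counties"; the talk does not say how the lookup was actually done, and real county boundaries are irregular polygons that would normally be loaded from a boundary file.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count the polygon edges crossed by a ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_county(point: Point, counties: Dict[str, List[Point]]) -> Optional[str]:
    """Return the first county whose polygon contains the point, else None."""
    for county_id, polygon in counties.items():
        if point_in_polygon(point, polygon):
            return county_id
    return None


# Hypothetical boundaries and labels, for illustration only.
toy_counties = {
    "county_A": [(-88.0, 42.0), (-87.5, 42.0), (-87.5, 42.2), (-88.0, 42.2)],
    "county_B": [(-88.0, 41.8), (-87.5, 41.8), (-87.5, 42.0), (-88.0, 42.0)],
}
print(find_county((-87.684555, 42.074043), toy_counties))
```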
-87.684555,42.074043
Just posted a photo @ Baha'i House of Worship
Corpus Examples
username,fips,time,tweet
-,48439,Sun Jul 27 23:59:59 EDT 2014,don't follow the right ppl lol
-,42007,Sun Jul 27 23:59:59 EDT 2014,yesss moody judy
-,36005,Sun Jul 27 23:59:59 EDT 2014,Man i was just thinking shexx be lurking but won't hmu
-,25021,Sun Jul 27 23:59:59 EDT 2014,no seeing u on tv is reel but not seeing u on twitter is real for me...so pls visit us here everyday.
-,26163,Sun Jul 27 23:59:59 EDT 2014,Hate seeing my friends sad
-,12093,Sun Jul 27 23:59:59 EDT 2014,this is the shirt i won that i got to sign btw!!:)
Graveyard/Cemetery
Graveyard/Cemetery Percent
Graveyard/Cemetery Smoothed (Getis-Ord Gi)
Identifying Rising Words
To find newly emerging words, we first measured the
degree to which the usage of each word in the corpus
had been rising over the 13-month period.
To identify these rising words, we extracted the 67,000
words that occur at least 1,000 times in the corpus and,
for each word, correlated its relative frequency per day with
the day of the year using Spearman's rank correlation coefficient.
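A minimal sketch of this measure, written with the standard library only. The function names are illustrative and the toy counts are invented; the day index simply runs 0, 1, 2, ... over the period, which is an assumption about how "day of the year" was coded.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rising_score(daily_counts: Sequence[int], daily_totals: Sequence[int]) -> float:
    """Correlate a word's daily relative frequency with the day index."""
    rel_freq = [c / t for c, t in zip(daily_counts, daily_totals)]
    days = list(range(len(rel_freq)))
    return spearman_rho(days, rel_freq)


# Invented counts for one word over six days, with 1,000 words posted per day.
print(rising_score([1, 2, 2, 5, 9, 14], [1000] * 6))
```

Because the coefficient works on ranks, it rewards any steady rise, whether linear or s-shaped, and is not dominated by a few days of very high usage.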
[Example time charts, with ρ = .116, ρ = .044 and ρ = -.028]
The Top 10 Rising Words on Twitter 2014
Word ρ Definition
fuckboy 0.947 Asshole, Jerk, Poser, Tool, etc.
rn 0.938 Right Now (Top Riser 2013)
hbd 0.928 Happy Birthday
fw 0.927 Fuck with
unbothered 0.926 Unconcerned & Disengaged
ft 0.925 Face time
gmfu 0.924 Get me fucked up
sm 0.919 So Much
squad 0.919 Squad
asf 0.918 As fuck
Identifying Emerging Words
Although measuring correlations allows for rising words
to be identified, most are far too common by 2014 to
show patterns of regional spread.
To identify emerging words we cross-referenced the list
of rising words against a list of rare words, defined as
words with low overall frequencies in the fourth quarter
of 2013 (excluding proper nouns).
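The cross-referencing step can be sketched as a simple filter. The ρ values below come from the rising-word table; the baseline frequencies and both thresholds are invented for illustration, since the talk does not state the cut-offs used. Proper-noun exclusion is not handled here.

```python
from typing import Dict, List


def emerging_words(
    rho_by_word: Dict[str, float],
    baseline_per_million: Dict[str, float],
    min_rho: float,
    max_baseline: float,
) -> List[str]:
    """Rising words (rho >= min_rho) that were rare in the baseline period.

    Words missing from the baseline table are treated as having frequency 0.
    """
    selected = [
        word
        for word, rho in rho_by_word.items()
        if rho >= min_rho and baseline_per_million.get(word, 0.0) <= max_baseline
    ]
    return sorted(selected, key=lambda w: rho_by_word[w], reverse=True)


rho = {"rn": 0.938, "hbd": 0.928, "unbothered": 0.926, "gmfu": 0.924}
# Hypothetical Q4 2013 frequencies per million words.
baseline = {"rn": 250.0, "hbd": 40.0, "unbothered": 0.3, "gmfu": 0.1}
print(emerging_words(rho, baseline, min_rho=0.8, max_baseline=1.0))
```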
Top 10 Emerging Words on Twitter 2014
Words ρ Definition
unbothered 0.926 Unconcerned & Disengaged
gmfu 0.924 Get Me Fucked Up
joggers 0.908 Jogging pants
fuckboys 0.902 Losers, wimps, posers, etc.
rekt 0.900 Wrecked
tfw 0.879 That feel when
xans 0.878 Benzodiazepine pills
baeless 0.875 To be without a bae
boolin 0.857 Hanging out, esp. young men
lordt 0.854 Lord, as exclamation
Top 11-20 Emerging Words on Twitter 2014
Words ρ Definition
celfie 0.852 selfie
slays 0.843 impresses, succeeds at, etc.
famo 0.840 family and friends
fuckboi 0.838 fuckboy
(on) fleek 0.838 on point, esp. eyebrows
faved 0.836 to favorite something
gainz 0.828 earnings
bruuh 0.817 bro
amirite 0.816 am I right
notifs 0.808 notifications, especially online
http://www.google.co.uk/trends/explore#q=unbothered
S-shaped Curves
In the time charts for many of the rising and emerging
words we see clear s-curves or what look like the start
of s-curves.
S-shaped Curves
Similar results have also been found repeatedly in
sociolinguistic apparent time studies (see Labov, 2001),
as well as in corpus-based research in historical
linguistics (e.g. Nevalainen & Raumolin-Brunberg, 2003).
Similar results have also been obtained in research on
the diffusion of innovations (see Rogers, 2003), where it
is referred to as an S-shaped Curve of Diffusion.
https://www.uni-due.de/SHE/S-Curve.JPG
Rogers (2003: 11)
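One common way to describe an s-shaped curve is the logistic function. The sketch below is an assumption for illustration only: the talk does not fit a particular functional form, and the parameter values are invented.

```python
import math


def logistic(t: float, ceiling: float, rate: float, midpoint: float) -> float:
    """S-shaped curve: slow start, rapid rise around `midpoint`, then levelling off."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


# Invented parameters: usage levels off at 10 per million, half-way at day 200.
for day in range(0, 401, 50):
    print(day, round(logistic(day, ceiling=10.0, rate=0.03, midpoint=200.0), 3))
```

A word observed only during the early part of such a curve looks like accelerating growth, which is what "the start of s-curves" refers to.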
Summary: Time Patterns
New words rise (and fall) very quickly in Modern
English, with numerous new words entering the
language and quickly rising in usage every year.
The usage of emerging words over time tends to follow
an s-shaped curve, echoing results found in
sociolinguistic apparent time studies and diffusion of
innovation research.
Mapping the Spread of New Words
An important technical problem is how to map the
spread of a new word across a region.
One approach is to map the relative frequency (e.g.
occurrences per million words) of the word across a
series of regional corpora (e.g. all the tweets from a
particular county) over a series of time points.
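This approach can be sketched as a table of rates keyed by county and time period. The county labels, months and tweets below are invented, and splitting on whitespace is an assumed, simplified tokenisation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str]  # (county, period)


def relative_frequency_table(
    tweets: Iterable[Tuple[str, str, str]], target: str
) -> Dict[Key, float]:
    """Occurrences of `target` per million words for each (county, period)."""
    hits: Dict[Key, int] = defaultdict(int)
    totals: Dict[Key, int] = defaultdict(int)
    for county, period, text in tweets:
        tokens = text.lower().split()
        totals[(county, period)] += len(tokens)
        hits[(county, period)] += tokens.count(target)
    return {
        key: 1_000_000 * hits[key] / totals[key]
        for key in totals
        if totals[key] > 0
    }


toy_tweets = [
    ("county_A", "2014-01", "so unbothered today"),
    ("county_A", "2014-01", "just posted a photo"),
    ("county_B", "2014-01", "hate seeing my friends sad"),
]
table = relative_frequency_table(toy_tweets, "unbothered")
print(table)
```

Each period's slice of the table can then be drawn as one map in a time series.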
Geographical Diffusion of Linguistic Forms
Two major theories have been proposed to explain how
new linguistic forms generally spread in language:
The Wave Model states that new forms spread out
radially from their source.
The Gravity Model states that new forms spread out
from one urban area to the next, based on distance
and population size, only later filling in less
populated areas in between.
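The contrast between the two models can be sketched with two toy scoring functions. The inverse-distance and population-product-over-squared-distance forms are common textbook simplifications and are assumptions here, as are the places, populations and distances.

```python
def wave_score(distance_km: float) -> float:
    """Wave-style score: influence depends on distance alone (closer is stronger)."""
    return 1.0 / distance_km


def gravity_score(pop_source: int, pop_target: int, distance_km: float) -> float:
    """Gravity-style score: product of populations over squared distance."""
    return pop_source * pop_target / distance_km ** 2


source_pop = 500_000  # hypothetical city where the form originates
places = {
    "nearby small town": (5_000, 40.0),     # (population, distance in km)
    "distant large city": (200_000, 150.0),
}
for name, (pop, dist) in places.items():
    print(name, wave_score(dist), gravity_score(source_pop, pop, dist))
```

Under the wave score the nearby town is reached first; under the gravity score the distant city is, which is the "city to city, then filling in" pattern described above.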
Assessing the Wave and Gravity Models
We can begin to assess the validity of the wave and
gravity models for lexical spread by examining the
spread of unbothered.
This analysis can be facilitated by focusing on one state
where the form eventually becomes relatively common,
for example Georgia.
Population Density of Georgia
[Map of Georgia; cities labelled: Atlanta, Columbus, Macon, Augusta, Savannah]
[Series of monthly maps of Georgia with the same cities labelled, 01 November 2013 to 01 November 2014]
Assessing the Wave and Gravity Models
The geographical spread of unbothered in Georgia
appears to be more complex than predicted by the
Wave or Gravity Model, although both appear to offer a
partial explanation for this pattern of spread.
The percentage of African Americans, however, also
appears to be an important predictor.
African Americans in Georgia
[Maps of Georgia; cities labelled: Atlanta, Columbus, Macon, Augusta, Savannah; panels dated 01 November 2014]
Mapping the Spread of New Words on One Map
Presenting a time series of maps is an effective way to
map lexical spread, but another technical issue is how
to map emerging words on one map:
Relative frequency
Date of first (or second...) occurrence
Number of words until first (or second...) occurrence
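The second and third single-map measures can be sketched together. The function name, county labels and tweets are invented, dates are ISO strings so that sorting by date is straightforward, and whitespace tokenisation is an assumption.

```python
from typing import Dict, Iterable, Tuple


def first_occurrence_by_county(
    tweets: Iterable[Tuple[str, str, str]], target: str
) -> Dict[str, Tuple[str, int]]:
    """For each county, return (date of first occurrence, words seen before it).

    `tweets` must be (county, date, text) tuples already sorted by date.
    Counties where the word never occurs are absent from the result.
    """
    words_seen: Dict[str, int] = {}
    result: Dict[str, Tuple[str, int]] = {}
    for county, date, text in tweets:
        if county in result:
            continue  # first occurrence already recorded for this county
        tokens = text.lower().split()
        seen = words_seen.get(county, 0)
        if target in tokens:
            result[county] = (date, seen + tokens.index(target))
        else:
            words_seen[county] = seen + len(tokens)
    return result


toy_tweets = [
    ("county_A", "2013-11-02", "just posted a photo"),
    ("county_B", "2013-11-05", "feeling unbothered rn"),
    ("county_A", "2013-12-01", "so unbothered today"),
    ("county_B", "2014-01-10", "unbothered again"),
]
print(first_occurrence_by_county(toy_tweets, "unbothered"))
```

Counting words until the first occurrence, rather than using the date alone, corrects for the fact that counties with more tweets have more chances to produce the word early.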
Summary: Regional Patterns
New words originate from across the US, including the
Southeast (e.g. unbothered, baeless, boolin), the North
(e.g. fuckboy, gainz), and the West (e.g. rekt), and
tend to spread within these regions first.
Otherwise, the spread of new words appears to be highly
complex, affected by numerous factors, including
proximity, population density, and demographic patterns.
Traditional Approaches to Historical Linguistics
The empirical analysis of language change is generally
based on historical corpora, which tend to span
centuries, or collections of linguistic interviews, which
tend to span generations (i.e. based on apparent time).
Both sources of data tend to provide a broad temporal
scope but limited temporal resolution and amounts of
data (<1 million words).
The Uniformitarian Principle
“Knowledge of processes that operated in the past can
be inferred by observing ongoing processes in the
present” (Christy, 1983: ix).
This Uniformitarian Principle is cited in Labov (2001) to
justify the use of apparent time interview data in place of
historical corpora, but it also justifies the use of
extremely large and dense contemporary corpora in
place of both of these more common approaches.
A Modern Approach to Historical Linguistics
Analysing modern language data mined from online
sources allows unprecedentedly large, rich and
dense natural language corpora to be compiled.
Although historical scope is lost, this approach allows for
language change to be analysed in far greater detail
than would otherwise be possible.
Tracking the Emergence of
New Words across Time and Space
Jack Grieve
Centre for Forensic Linguistics
Aston University
Email: j.grieve1@aston.ac.uk
Website: https://sites.google.com/site/jackgrieveaston
Twitter: @JWGrieve

Disha NEET Physics Guide for classes 11 and 12.pdf
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptx
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 

Tracking the Emergence of New Words across Time and Space

  • 1. Tracking the Emergence of New Words across Time and Space Jack Grieve Aston University Research conducted with Diansheng Guo & Alice Kasakoff, University of South Carolina Andrea Nini, Aston University Funded as part of the Digging into Data Challenge
  • 2. Approaches to Historical Linguistics There are several different approaches to the analysis of language change: reconstruction through comparison of known languages (comparative method); analysis of previous linguistic research (e.g. lexicographic research); analysis of historical texts (corpus-based); apparent time studies with interview data (sociolinguistics); and computer simulations.
  • 3. Lexical Change Research in historical linguistics and etymology has analysed how the usage of certain words has changed over relatively long periods of time (primarily based on historical corpora and lexicographic research), but overall there are large gaps in our knowledge of lexical change, including how newly emerging words enter a language and spread across its speakers.
  • 4. Words are Rare Events The main problem with studying lexical variation and change is that most words are incredibly rare, thus requiring incredibly large corpora of natural language. This is why most research on lexical variation and change has focused on relatively high frequency words, primarily function words (e.g. pronouns, prepositions, auxiliary verbs).
  • 5. Word Frequency Distribution (Zipf 1935, 1945)
  • 7. Word Frequency Distribution (Zipf 1935, 1945) The majority of the 67,000 most frequent words in our corpus occur less than once per 25 million words.
  • 8. New Words are Incredibly Rare Events The analysis of new words requires even more data, because emerging words are by definition especially rare. In addition, to analyse the temporal and spatial spread of new words, large corpora must be compiled for a large number of time points and locations.
  • 9. Big Data Suitable data has recently become available with the rise of social media and smartphones, which provide massive amounts of time-stamped and geocoded natural language data.
  • 10. Goals of Today’s Talk Identify emerging words from 2014 based on a multi-billion-word corpus of American tweets. Chart their usage over time and identify common temporal patterns of lexical spread. Map their geographical diffusion and identify common spatial patterns of lexical spread.
  • 11. The Corpus Since 2013, the team at USC have been compiling two multi-billion word geocoded corpora for the US and the UK using the Twitter API. Twitter is a particularly rich source of geocoded data and is also very popular, informal, and youthful, making it ideal for tracking the emergence of new words. Approximately 2% of tweets are geocoded.
  • 12. The Corpus The analysis today is based on an 8.9-billion-word corpus of American tweets from October 2013 to November 2014, which totals approximately 980 million tweets from 7 million users. Every tweet is geocoded with the precise longitude and latitude of the user when posting, which were then used to identify the county where each tweet was produced.
  • 13. -87.684555,42.074043 Just posted a photo @ Baha'i House of Worship
  • 21. Corpus Examples
    username,fips,time,tweet
    -,48439,Sun Jul 27 23:59:59 EDT 2014,don't follow the right ppl lol
    -,42007,Sun Jul 27 23:59:59 EDT 2014,yesss moody judy
    -,36005,Sun Jul 27 23:59:59 EDT 2014,Man i was just thinking shexx be lurking but won't hmu
    -,25021,Sun Jul 27 23:59:59 EDT 2014,no seeing u on tv is reel but not seeing u on twitter is real for me...so pls visit us here everyday.
    -,26163,Sun Jul 27 23:59:59 EDT 2014,Hate seeing my friends sad
    -,12093,Sun Jul 27 23:59:59 EDT 2014,this is the shirt i won that i got to sign btw!!:)
  • 26. Identifying Rising Words To find newly emerging words, we first measured the degree to which the usage of each word in the corpus had been rising over the 13-month period. To identify these rising words, we extracted the 67,000 words that occur at least 1,000 times in the corpus and correlated each word's relative frequency per day with the day of the year using Spearman’s rank correlation coefficient.
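The correlation step on this slide can be sketched in plain Python. This is only an illustration under stated assumptions: the function names and the toy counts are invented here, ties are given average ranks, and the talk does not say which software was actually used.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / (var_x * var_y) ** 0.5

def rising_score(daily_counts, daily_totals):
    """Correlate a word's relative frequency per day with the day index."""
    rel_freq = [c / t for c, t in zip(daily_counts, daily_totals)]
    days = list(range(len(rel_freq)))
    return spearman_rho(days, rel_freq)

# Toy data (invented): a word whose share of daily tokens grows over a week.
counts = [1, 2, 2, 4, 7, 9, 15]
totals = [1000] * 7
print(rising_score(counts, totals))
```

A score near 1 marks a word whose relative frequency rises almost monotonically over the period; a score near 0 marks no consistent trend.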
  • 29. ρ = .044; ρ = -.028
  • 30. The Top 10 Rising Words on Twitter 2014
    Word        ρ      Definition
    fuckboy     0.947  Asshole, Jerk, Poser, Tool, etc.
    rn          0.938  Right Now (Top Riser 2013)
    hbd         0.928  Happy Birthday
    fw          0.927  Fuck with
    unbothered  0.926  Unconcerned & Disengaged
    ft          0.925  Face time
    gmfu        0.924  Get me fucked up
    sm          0.919  So Much
    squad       0.919  Squad
    asf         0.918  As fuck
  • 36. Identifying Emerging Words Although measuring correlations allows for rising words to be identified, most are far too common by 2014 to show patterns of regional spread. To identify emerging words we cross-referenced the list of rising words against a list of rare words, defined as words with low overall frequencies in the fourth quarter of 2013 (excluding proper nouns).
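The cross-referencing step can be sketched as a simple filter. The thresholds (`min_rho`, `max_early_freq`), the function name, and the early-frequency figures below are hypothetical, since the slide does not state the cut-offs used; only the ρ values for the four real words come from the tables in this talk.

```python
def emerging_words(rho_by_word, early_freq_per_million, proper_nouns,
                   min_rho=0.8, max_early_freq=1.0):
    """Return rising words that were rare in the baseline period.

    min_rho and max_early_freq are hypothetical cut-offs chosen for
    illustration, not values reported in the talk.
    """
    candidates = [
        word for word, rho in rho_by_word.items()
        if rho >= min_rho                                          # rising
        and early_freq_per_million.get(word, 0.0) <= max_early_freq  # rare
        and word not in proper_nouns                               # excluded
    ]
    # Strongest risers first.
    return sorted(candidates, key=lambda w: rho_by_word[w], reverse=True)

# rho values for rn, unbothered, gmfu and squad are taken from the slides;
# "examplecity" and all baseline frequencies are invented for this sketch.
rho = {"rn": 0.938, "unbothered": 0.926, "gmfu": 0.924,
       "squad": 0.919, "examplecity": 0.95}
q4_2013_freq = {"rn": 250.0, "unbothered": 0.2, "gmfu": 0.1,
                "squad": 40.0, "examplecity": 0.0}

print(emerging_words(rho, q4_2013_freq, proper_nouns={"examplecity"}))
```

With these invented baseline figures, already-common risers such as rn and squad drop out, which mirrors the difference between the rising and emerging lists.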
  • 47. Top 10 Emerging Words on Twitter 2014
    Words       ρ      Definition
    unbothered  0.926  Unconcerned & Disengaged
    gmfu        0.924  Get Me Fucked Up
    joggers     0.908  Jogging pants
    fuckboys    0.902  Losers, wimps, posers, etc.
    rekt        0.900  Wrecked
    tfw         0.879  That feel when
    xans        0.878  Benzodiazepine pills
    baeless     0.875  To be without a bae
    boolin      0.857  Hanging out, esp. young men
    lordt       0.854  Lord, as exclamation
  • 48. Top 11-20 Emerging Words on Twitter 2014
    Words       ρ      Definition
    celfie      0.852  selfie
    slays       0.843  impresses, succeeds at, etc.
    famo        0.840  family and friends
    fuckboi     0.838  fuckboy
    (on) fleek  0.838  on point, esp. eyebrows
    faved       0.836  to favorite something
    gainz       0.828  earnings
    bruuh       0.817  bro
    amirite     0.816  am I right
    notifs      0.808  notifications, especially online
  • 61. S-shaped Curves In the time charts for many of the rising and emerging words we see clear s-curves or what look like the start of s-curves.
  • 62. S-shaped Curves Similar results have also been found repeatedly in sociolinguistic apparent time studies (see Labov, 2001), as well as in corpus-based research in historical linguistics (e.g. Nevalainen & Raumolin-Brunberg, 2003). Similar results have also been obtained in research on the diffusion of innovations (see Rogers, 2003), where it is referred to as an S-shaped Curve of Diffusion.
  • 65. Summary: Time Patterns New words rise (and fall) very quickly in Modern English, with numerous new words entering the language and quickly rising in usage every year. The usage of emerging words over time tends to follow an s-shaped curve, echoing results found in sociolinguistic apparent time studies and diffusion of innovation research.
  • 66. Goals of Today’s Talk Identify emerging words from 2014 based on a multi-billion-word corpus of American tweets. Chart their usage over time and identify common temporal patterns of lexical spread. Map their geographical diffusion and identify common spatial patterns of lexical spread.
  • 67. Mapping the Spread of New Words An important technical problem is how to map the spread of a new word across a region. One approach is to map the relative frequency (e.g. occurrences per million words) of the word across a series of regional corpora (e.g. all the tweets from a particular county) over a series of time points.
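The relative-frequency approach on this slide can be sketched as follows. The record layout (county FIPS code, time period, tweet text), the whitespace tokeniser, and the toy tweets are assumptions made for illustration, not a description of the project's actual pipeline.

```python
from collections import defaultdict

def relative_frequency(records, word):
    """Occurrences of `word` per million words for each (county, period)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for fips, period, text in records:
        tokens = text.lower().split()  # assumption: naive whitespace tokeniser
        key = (fips, period)
        totals[key] += len(tokens)
        hits[key] += tokens.count(word)
    return {key: 1_000_000 * hits[key] / totals[key]
            for key in totals if totals[key] > 0}

# Toy records (invented): (county FIPS code, month, tweet text).
records = [
    ("13121", "2014-01", "so unbothered today"),
    ("13121", "2014-01", "good morning all"),
    ("13051", "2014-01", "hate seeing rain"),
]

for key, rate in sorted(relative_frequency(records, "unbothered").items()):
    print(key, round(rate, 1))
```

Each (county, period) value could then be drawn as one cell of one map in a time series; with real data, counties with very few words would need smoothing or a minimum-size filter, which this sketch omits.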
  • 81. Geographical Diffusion of Linguistic Forms Two major theories have been proposed to explain how new linguistic forms generally spread in language: The Wave Model states that new forms spread out radially from their source. The Gravity Model states that new forms spread out from one urban area to the next, based on distance and population size, only later filling in less populated areas in between.
  • 82. Assessing the Wave and Gravity Models We can begin to assess the validity of the wave and gravity models for lexical spread by comparing the spread of unbothered. This analysis can be facilitated by focusing on one state where the form eventually becomes relatively common, for example Georgia.
  • 97. Assessing the Wave and Gravity Models The geographical spread of unbothered in Georgia appears to be more complex than predicted by either the Wave Model or the Gravity Model, although both appear to offer a partial explanation for this pattern of spread. The percentage of African Americans, however, also appears to be an important predictor.
  • 98. African Americans in Georgia (map; labelled cities: Atlanta, Columbus, Macon, Augusta, Savannah)
  • 101. Mapping the Spread of New Words on One Map Presenting a time series of maps is an effective way to map lexical spread, but another technical issue is how to map emerging words on one map. Options include: relative frequency; date of first (or second...) occurrence; number of words until first (or second...) occurrence.
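One of the single-map measures listed on this slide, the date of the first (or nth) occurrence in each county, can be sketched as follows. The record layout, function name, and toy tweets are hypothetical and only illustrate the idea.

```python
from datetime import date

def nth_occurrence_dates(records, word, n=1):
    """Map each county to the date on which `word` reached its nth use."""
    seen = {}    # running count of occurrences per county
    result = {}  # county -> date of the nth occurrence
    for fips, day, text in sorted(records, key=lambda r: r[1]):
        if fips in result:
            continue  # this county has already reached n occurrences
        seen[fips] = seen.get(fips, 0) + text.lower().split().count(word)
        if seen[fips] >= n:
            result[fips] = day
    return result

# Toy records (invented): (county FIPS code, date, tweet text).
records = [
    ("13121", date(2014, 3, 2), "feeling unbothered"),
    ("13051", date(2014, 5, 9), "unbothered and happy"),
    ("13121", date(2014, 1, 15), "just unbothered tbh"),
    ("13245", date(2014, 2, 1), "nothing to see here"),
]

print(nth_occurrence_dates(records, "unbothered"))
print(nth_occurrence_dates(records, "unbothered", n=2))
```

Counties where the word never reaches n occurrences are simply absent from the result, so a map built from it would need an explicit "not yet observed" category.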
  • 120. Summary: Regional Patterns New words originate from across the US, including the Southeast (e.g. Unbothered, Baeless, Boolin), the North (e.g. Fuckboy, Gainz), and the West (e.g. Rekt), and tend to spread within these regions first. Otherwise, the spread of new words appears to be highly complex, affected by numerous factors, including proximity, population density, and demographic patterns.
  • 121. Traditional Approaches to Historical Linguistics The empirical analysis of language change is generally based on historical corpora, which tend to span centuries, or collections of linguistic interviews, which tend to span generations (i.e. based on apparent time). Both sources of data tend to provide a broad temporal scope but limited temporal resolution and amounts of data (<1 million words).
  • 122. The Uniformitarian Principle “Knowledge of processes that operated in the past can be inferred by observing ongoing processes in the present” (Christy, 1983: ix). This Uniformitarian Principle is cited in Labov (2001) to justify the use of apparent time interview data in place of historical corpora, but it also justifies the use of extremely large and dense contemporary corpora in place of both of these more common approaches.
  • 123. A Modern Approach to Historical Linguistics Working with modern language data mined from online sources allows unprecedentedly large, rich and dense natural language corpora to be compiled. Although historical scope is lost, this approach allows language change to be analysed in far greater detail than would otherwise be possible.
  • 124. Tracking the Emergence of New Words across Time and Space Jack Grieve Centre for Forensic Linguistics Aston University Email: j.grieve1@aston.ac.uk Website: https://sites.google.com/site/jackgrieveaston Twitter: @JWGrieve