A TOPIC ANALYSIS APPROACH TO REVEALING
DISCUSSIONS ON THE AUSTRALIAN TWITTERSPHERE
Brenda Moon
Queensland University of Technology
Introduction
This paper investigates techniques to identify the topics
being discussed in one week of tweets from the
Australian Twittersphere. Tweets were extracted from a
comprehensive dataset that captures all tweets by
2.8m Australian accounts: the Tracking Infrastructure for Social
Media Analysis (TrISMA) (Bruns, Burgess & Banks et al.,
2016).
Selected week: Sunday 2 August to Saturday 8
August 2015
• Thursday 6th August 2015 was used for One Day in
the Life of a National Twittersphere (Axel Bruns and Brenda
Moon, presented at Social Media and Society, London, 13 July 2016)
• Same day used for initial development of topic
modelling approach
• Then extended to full week
Latent Dirichlet Allocation
Blei, D. M. (2011)
Data cleaning
• Remove
– retweets & multitweets (“rt”, “mt” or “via”)
– URLs
– dates, times, distances & weights
– Words shorter than 3 characters
– ellipses (‘...’)
• NLTK tokenisation using Twitter Tokenizer
– Remove all @users and urls
– Lowercase
• Convert
– HTML entities to text
– Hashtags to words (trim ‘#’ off hashtags)
• NLTK lemmatisation
• NLTK stopwords
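The cleaning steps above can be sketched with the Python standard library alone. This is a rough stand-in, not the pipeline used for the slides: a plain regular expression replaces the NLTK Twitter tokenizer, lemmatisation is omitted, and `STOPWORDS` is a tiny hypothetical list in place of NLTK's. It also assumes that a tweet containing "rt", "mt" or "via" as a standalone token is dropped entirely, and that keeping alphabetic tokens only is an acceptable proxy for removing dates, times, distances and weights.

```python
import html
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
WORD_RE = re.compile(r"[a-z]+")  # alphabetic tokens only, so numbers are dropped

RETWEET_MARKERS = {"rt", "mt", "via"}
# Hypothetical, deliberately tiny stopword list (the slides use NLTK's list).
STOPWORDS = {"the", "and", "for", "are", "was", "with", "this", "that"}


def clean_tweet(text):
    """Return cleaned tokens, or None if the tweet looks like a retweet."""
    text = html.unescape(text).lower()   # HTML entities to text, lowercase
    text = URL_RE.sub(" ", text)         # remove URLs
    text = MENTION_RE.sub(" ", text)     # remove @users
    text = text.replace("#", "")         # hashtags to words
    tokens = WORD_RE.findall(text)
    if RETWEET_MARKERS & set(tokens):    # assumption: drop the whole tweet
        return None
    # Drop words shorter than 3 characters and stopwords.
    return [t for t in tokens if len(t) >= 3 and t not in STOPWORDS]
```

Calling `clean_tweet` on each raw tweet and discarding the `None` results gives the token lists that the later steps work on.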
Hashtag pooling
• Mehrotra, Sanner, Buntine & Xie (2013) looked at
different options of ‘pooling’ tweets into documents
before LDA analysis to see if this could increase accuracy.
They found that hashtag pooling was effective (best was
hashtag pooling with clustering, but more complex to
apply)
• Group all the tweets with hashtags into documents for
each hashtag (some tweets will be added into more than
one document)
• Tweets without hashtags stay as individual documents
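A minimal sketch of the pooling rule described above, assuming hashtags are matched case-insensitively with a simple regular expression. The function name `pool_by_hashtag` and the choice to return pooled and unpooled documents separately are illustrative, not taken from the slides.

```python
import re
from collections import defaultdict

HASHTAG_RE = re.compile(r"#(\w+)")


def pool_by_hashtag(tweets):
    """Group tweets into one document per hashtag.

    Returns (pooled, singles): pooled maps each hashtag to the list of
    tweets that use it; singles holds one single-tweet document for
    every tweet without a hashtag.
    """
    pooled = defaultdict(list)
    singles = []
    for text in tweets:
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        if tags:
            # A tweet with several hashtags joins several documents.
            for tag in tags:
                pooled[tag].append(text)
        else:
            singles.append([text])
    return dict(pooled), singles
```

The documents passed on to the next step would then be the values of `pooled` followed by the entries of `singles`.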
Corpus filtering
(Thursday 6 August 2015)
• Raw tweets: 963,064
• After data cleaning: 583,528
• After hashtag pooling: 516,263
– 23% of tweets had hashtags
• Dictionary pruning – remove most frequent and least
frequent terms
– no_above=0.5 (fraction of documents), no_below=5
(documents)
– 223,157 unique tokens reduced to 49,964 unique tokens
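To make the two thresholds concrete, here is a pure-Python sketch of document-frequency pruning. It is intended to mirror the meaning of `no_below` and `no_above` as described above (keep a term only if it appears in at least `no_below` documents and in no more than a `no_above` fraction of them). The helper `prune_vocabulary` is hypothetical and is not the Gensim implementation.

```python
from collections import Counter


def prune_vocabulary(docs, no_below=5, no_above=0.5):
    """Drop terms that are too rare or too common across documents.

    docs is a list of token lists. Returns (pruned_docs, kept_terms).
    """
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))        # count each term once per document
    max_docs = no_above * len(docs)      # fraction converted to a document count
    kept = {t for t, df in doc_freq.items() if no_below <= df <= max_docs}
    pruned = [[t for t in doc if t in kept] for doc in docs]
    return pruned, kept
```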
Latent Dirichlet Allocation (LDA)
• Gensim LDA (Lau & Baldwin, 2014)
• LdaMulticore
• Identify 30 topics
• 100 passes
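A hedged sketch of how these settings might be passed to Gensim. The topic count and number of passes come from the slide. The `workers` and `random_state` values, the function name `train_lda`, and the lazy import are assumptions made for illustration.

```python
# Settings taken from the slide.
LDA_SETTINGS = {"num_topics": 30, "passes": 100}


def train_lda(docs, workers=3, random_state=0):
    """Fit Gensim's LdaMulticore on tokenised documents (lists of strings).

    workers and random_state are illustrative assumptions.
    """
    # Imported here so the settings can be inspected without Gensim installed.
    from gensim.corpora import Dictionary
    from gensim.models import LdaMulticore

    dictionary = Dictionary(docs)
    dictionary.filter_extremes(no_below=5, no_above=0.5)  # dictionary pruning
    corpus = [dictionary.doc2bow(doc) for doc in docs]    # bag-of-words vectors
    model = LdaMulticore(
        corpus=corpus,
        id2word=dictionary,
        workers=workers,
        random_state=random_state,
        **LDA_SETTINGS,
    )
    return model, dictionary, corpus
```

After training, `model.print_topics(num_topics=30, num_words=10)` lists the top terms per topic, which is the starting point for assigning labels by hand.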
Thursday 6th August 2015 – overall terms
https://github.com/bmabey/pyLDAvis
Thursday 6th August 2015
Topic 2: Politics / coal / China / Queensland
Thursday 6th August 2015
Topic 5: Cricket – The Ashes
Thursday 6th August 2015: 30 topics, with hashtag pooling (labelled topic: MH370)
Thursday 6th August 2015: 30 topics, with hashtag pooling. Comparison to other study (labelled topics: Pop?, Teen culture?, MH370)
1.1m tweets from 147k accounts, to 224k accounts
294k nodes total, including non-Australians
535k edges from 856k @mentions / RTs
Visualisation: Gephi, Force Atlas 2
Colours: Gephi, modularity resolution 1.0
Labels assigned through qualitative evaluation
Politics
Cricket
Teen Culture
Pop
From “One Day in the Life of a National Twittersphere” by Axel Bruns and
Brenda Moon, presented at Social Media and Society, London, 13 July 2016.
Further Outlook
• Confirm initial topic labelling by looking at top tweets for each
topic
• Check whether the hashtag pooling has allowed non-hashtag
tweet topics to still be visible
• Use statistical coherence of model (U_Mass Coherence, C_V
coherence) to tune LDA parameters
• Model different numbers of topics (coarse/fine grain)
• Relate topics per user back to our mention network graphs
• Extend to the full week (or longer)
• Compare to alternative approaches
– Doc2Vec / TensorFlow / dynamic LDA etc.
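As an illustration of the coherence-based tuning mentioned above, the UMass measure can be computed from document co-occurrence counts alone. The sketch below uses the commonly given form, summing log((D(w_m, w_l) + 1) / D(w_l)) over ordered pairs of a topic's top words, where D counts documents containing the words. The function name `umass_coherence` is hypothetical, every top word is assumed to occur in at least one document, and library implementations may aggregate or normalise the sum differently.

```python
import math


def umass_coherence(top_words, docs):
    """UMass coherence of one topic from its top words (most probable first)."""
    doc_sets = [set(doc) for doc in docs]

    def doc_count(*words):
        # Number of documents containing all the given words.
        return sum(all(w in s for w in words) for s in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            joint = doc_count(top_words[m], top_words[l])
            score += math.log((joint + 1) / doc_count(top_words[l]))
    return score
```

Scores closer to zero indicate top words that tend to appear in the same documents, so comparing the average score across models with different numbers of topics is one way to choose between them.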
References
• Blei, D. M. (2011). Introduction to probabilistic topic models. Communications of the ACM, 1–16.
Retrieved from http://www.cs.princeton.edu/~blei/papers/Blei2011.pdf
• Mehrotra, R., Sanner, S., Buntine, W., & Xie, L. (2013). Improving LDA Topic Models for
Microblogs via Tweet Pooling and Automatic Labeling. Proceedings of the 36th International
ACM SIGIR Conference on Research and Development in Information Retrieval, 889–892.
http://doi.org/10.1145/2484028.2484166
• Lau, J. H., & Baldwin, T. (2014). An Empirical Evaluation of doc2vec with Practical Insights into
Document Embedding Generation.
• Puschmann, C., & Scheffler, T. (2016). Topic modeling for media and communication
research: A short primer (HIIG Discussion Paper Series No. 2016–5). Retrieved from
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2836478
• Sievert, C., & Shirley, K. (2014). LDAvis: A method for visualizing and interpreting topics.
Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces,
63–70. Retrieved from http://www.aclweb.org/anthology/W/W14/W14-3110

More Related Content

What's hot

Prelights a new community platform for preprint highlights
Prelights a new community platform for preprint highlightsPrelights a new community platform for preprint highlights
Prelights a new community platform for preprint highlightsCrossref
 
BioTorrents: A File Sharing Service for Scientific Data
BioTorrents: A File Sharing Service for Scientific DataBioTorrents: A File Sharing Service for Scientific Data
BioTorrents: A File Sharing Service for Scientific DataMorgan Langille
 
Integrating research indicators for use in the repositories infrastructure
Integrating research indicators for use in the repositories infrastructure Integrating research indicators for use in the repositories infrastructure
Integrating research indicators for use in the repositories infrastructure petrknoth
 
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...Nees Jan van Eck
 
Open minted content_provision
Open minted content_provisionOpen minted content_provision
Open minted content_provisionLucas anastasiou
 
Research Data Explored: Citations versus Altmetrics
Research Data Explored: Citations versus AltmetricsResearch Data Explored: Citations versus Altmetrics
Research Data Explored: Citations versus AltmetricsElisabeth Lex
 
Multi-Domain Alias Matching Using Machine Learning
Multi-Domain Alias Matching Using Machine LearningMulti-Domain Alias Matching Using Machine Learning
Multi-Domain Alias Matching Using Machine LearningAmendra Shrestha
 

What's hot (7)

Prelights a new community platform for preprint highlights
Prelights a new community platform for preprint highlightsPrelights a new community platform for preprint highlights
Prelights a new community platform for preprint highlights
 
BioTorrents: A File Sharing Service for Scientific Data
BioTorrents: A File Sharing Service for Scientific DataBioTorrents: A File Sharing Service for Scientific Data
BioTorrents: A File Sharing Service for Scientific Data
 
Integrating research indicators for use in the repositories infrastructure
Integrating research indicators for use in the repositories infrastructure Integrating research indicators for use in the repositories infrastructure
Integrating research indicators for use in the repositories infrastructure
 
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
 
Open minted content_provision
Open minted content_provisionOpen minted content_provision
Open minted content_provision
 
Research Data Explored: Citations versus Altmetrics
Research Data Explored: Citations versus AltmetricsResearch Data Explored: Citations versus Altmetrics
Research Data Explored: Citations versus Altmetrics
 
Multi-Domain Alias Matching Using Machine Learning
Multi-Domain Alias Matching Using Machine LearningMulti-Domain Alias Matching Using Machine Learning
Multi-Domain Alias Matching Using Machine Learning
 

Viewers also liked

7 Guidelines to be Awesome Rotaractors
7 Guidelines to be Awesome Rotaractors 7 Guidelines to be Awesome Rotaractors
7 Guidelines to be Awesome Rotaractors M. Januar Fariki
 
Innovazione 1. Tipi e livelli dell’Innovazione
Innovazione 1. Tipi e livelli dell’InnovazioneInnovazione 1. Tipi e livelli dell’Innovazione
Innovazione 1. Tipi e livelli dell’InnovazioneManager.it
 
Editorial_s_Monthly_Issues_Editorial_April_2015
Editorial_s_Monthly_Issues_Editorial_April_2015Editorial_s_Monthly_Issues_Editorial_April_2015
Editorial_s_Monthly_Issues_Editorial_April_2015Velina Iankova
 
O Centro - n.º 25 – 18.04.2007
O Centro - n.º 25 – 18.04.2007O Centro - n.º 25 – 18.04.2007
O Centro - n.º 25 – 18.04.2007MANCHETE
 
Tal Rappleyea Presents: Eviction Law in New York State
Tal Rappleyea Presents: Eviction Law in New York StateTal Rappleyea Presents: Eviction Law in New York State
Tal Rappleyea Presents: Eviction Law in New York StateTal Rappleyea
 
Hey! Nielsen - A Social Media Engagement and Research Platform Case Study by ...
Hey! Nielsen - A Social Media Engagement and Research Platform Case Study by ...Hey! Nielsen - A Social Media Engagement and Research Platform Case Study by ...
Hey! Nielsen - A Social Media Engagement and Research Platform Case Study by ...Affinitive
 
El ip y sus clases
El ip y sus clasesEl ip y sus clases
El ip y sus clasesDANNY MUÑOZ
 
Media Innovation and Entrepreneurship: Building an Environment for Change
Media Innovation and Entrepreneurship: Building an Environment for ChangeMedia Innovation and Entrepreneurship: Building an Environment for Change
Media Innovation and Entrepreneurship: Building an Environment for ChangeMichelle Ferrier
 
Politics of tweeting, tweeting of politics: The uses of social media by state...
Politics of tweeting, tweeting of politics: The uses of social media by state...Politics of tweeting, tweeting of politics: The uses of social media by state...
Politics of tweeting, tweeting of politics: The uses of social media by state...Brenda Moon
 
Entretém-me e lembrar-me-ei! - O uso dos jogos digitais na estratégia publici...
Entretém-me e lembrar-me-ei! - O uso dos jogos digitais na estratégia publici...Entretém-me e lembrar-me-ei! - O uso dos jogos digitais na estratégia publici...
Entretém-me e lembrar-me-ei! - O uso dos jogos digitais na estratégia publici...Sandra Oliveira
 

Viewers also liked (11)

7 Guidelines to be Awesome Rotaractors
7 Guidelines to be Awesome Rotaractors 7 Guidelines to be Awesome Rotaractors
7 Guidelines to be Awesome Rotaractors
 
Uso acceso a la tecnología
Uso acceso a la tecnologíaUso acceso a la tecnología
Uso acceso a la tecnología
 
Innovazione 1. Tipi e livelli dell’Innovazione
Innovazione 1. Tipi e livelli dell’InnovazioneInnovazione 1. Tipi e livelli dell’Innovazione
Innovazione 1. Tipi e livelli dell’Innovazione
 
Editorial_s_Monthly_Issues_Editorial_April_2015
Editorial_s_Monthly_Issues_Editorial_April_2015Editorial_s_Monthly_Issues_Editorial_April_2015
Editorial_s_Monthly_Issues_Editorial_April_2015
 
O Centro - n.º 25 – 18.04.2007
O Centro - n.º 25 – 18.04.2007O Centro - n.º 25 – 18.04.2007
O Centro - n.º 25 – 18.04.2007
 
Tal Rappleyea Presents: Eviction Law in New York State
Tal Rappleyea Presents: Eviction Law in New York StateTal Rappleyea Presents: Eviction Law in New York State
Tal Rappleyea Presents: Eviction Law in New York State
 
Hey! Nielsen - A Social Media Engagement and Research Platform Case Study by ...
Hey! Nielsen - A Social Media Engagement and Research Platform Case Study by ...Hey! Nielsen - A Social Media Engagement and Research Platform Case Study by ...
Hey! Nielsen - A Social Media Engagement and Research Platform Case Study by ...
 
El ip y sus clases
El ip y sus clasesEl ip y sus clases
El ip y sus clases
 
Media Innovation and Entrepreneurship: Building an Environment for Change
Media Innovation and Entrepreneurship: Building an Environment for ChangeMedia Innovation and Entrepreneurship: Building an Environment for Change
Media Innovation and Entrepreneurship: Building an Environment for Change
 
Politics of tweeting, tweeting of politics: The uses of social media by state...
Politics of tweeting, tweeting of politics: The uses of social media by state...Politics of tweeting, tweeting of politics: The uses of social media by state...
Politics of tweeting, tweeting of politics: The uses of social media by state...
 
Entretém-me e lembrar-me-ei! - O uso dos jogos digitais na estratégia publici...
Entretém-me e lembrar-me-ei! - O uso dos jogos digitais na estratégia publici...Entretém-me e lembrar-me-ei! - O uso dos jogos digitais na estratégia publici...
Entretém-me e lembrar-me-ei! - O uso dos jogos digitais na estratégia publici...
 

Similar to A Topic Analysis Approach To Revealing Discussions On The Australian Twittersphere

2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emke2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emkeDr Martina Emke
 
WWW2014: Long Time No See: The Probability of Reusing Tags as a Function of F...
WWW2014: Long Time No See: The Probability of Reusing Tags as a Function of F...WWW2014: Long Time No See: The Probability of Reusing Tags as a Function of F...
WWW2014: Long Time No See: The Probability of Reusing Tags as a Function of F...Dominik Kowald
 
Social Media in Australia: The Case of Twitter
Social Media in Australia: The Case of TwitterSocial Media in Australia: The Case of Twitter
Social Media in Australia: The Case of TwitterAxel Bruns
 
Visual network analysis of Twitter data for co-organizing conferences: Case C...
Visual network analysis of Twitter data for co-organizing conferences: Case C...Visual network analysis of Twitter data for co-organizing conferences: Case C...
Visual network analysis of Twitter data for co-organizing conferences: Case C...Jari Jussila
 
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...Axel Bruns
 
HT2016: Influence of Frequency, Recency and Semantic Context on Tag Reuse
HT2016: Influence of Frequency, Recency and Semantic Context on Tag ReuseHT2016: Influence of Frequency, Recency and Semantic Context on Tag Reuse
HT2016: Influence of Frequency, Recency and Semantic Context on Tag ReuseDominik Kowald
 
Using Twitter as a data source: An overview of ethical challenges
Using Twitter as a data source: An overview of ethical challengesUsing Twitter as a data source: An overview of ethical challenges
Using Twitter as a data source: An overview of ethical challengesDr Wasim Ahmed
 
Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...John Domingue
 
Conducting Twitter Reserch
Conducting Twitter ReserchConducting Twitter Reserch
Conducting Twitter ReserchKim Holmberg
 
In Search of Australian Blogs: Determining the Extent of the Contemporary Aus...
In Search of Australian Blogs: Determining the Extent of the Contemporary Aus...In Search of Australian Blogs: Determining the Extent of the Contemporary Aus...
In Search of Australian Blogs: Determining the Extent of the Contemporary Aus...Axel Bruns
 
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...CILIP MDG
 
Pikas casci talk 11262013 final
Pikas casci talk 11262013 finalPikas casci talk 11262013 final
Pikas casci talk 11262013 finalChristina Pikas
 
Search, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataSearch, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataNattiya Kanhabua
 
The Future of Semantics on the Web
The Future of Semantics on the WebThe Future of Semantics on the Web
The Future of Semantics on the WebJohn Domingue
 
One Day in the Life of a National Twittersphere
One Day in the Life of a National TwittersphereOne Day in the Life of a National Twittersphere
One Day in the Life of a National TwittersphereAxel Bruns
 
New Methodologies for Capturing and Working with Publicly Available Twitter Data
New Methodologies for Capturing and Working with Publicly Available Twitter DataNew Methodologies for Capturing and Working with Publicly Available Twitter Data
New Methodologies for Capturing and Working with Publicly Available Twitter DataAxel Bruns
 
Disciplinary Differences in Twitter Scholarly Communication
Disciplinary Differences in Twitter Scholarly CommunicationDisciplinary Differences in Twitter Scholarly Communication
Disciplinary Differences in Twitter Scholarly CommunicationKim Holmberg
 
Temporal Effects on Hashtag Reuse in Twitter
Temporal Effects on Hashtag Reuse in TwitterTemporal Effects on Hashtag Reuse in Twitter
Temporal Effects on Hashtag Reuse in TwitterDominik Kowald
 
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...PrattSILS
 
Australia's Environmental Predictive Capability
Australia's Environmental Predictive CapabilityAustralia's Environmental Predictive Capability
Australia's Environmental Predictive CapabilityTERN Australia
 

Similar to A Topic Analysis Approach To Revealing Discussions On The Australian Twittersphere (20)

2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emke2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emke
 
WWW2014: Long Time No See: The Probability of Reusing Tags as a Function of F...
WWW2014: Long Time No See: The Probability of Reusing Tags as a Function of F...WWW2014: Long Time No See: The Probability of Reusing Tags as a Function of F...
WWW2014: Long Time No See: The Probability of Reusing Tags as a Function of F...
 
Social Media in Australia: The Case of Twitter
Social Media in Australia: The Case of TwitterSocial Media in Australia: The Case of Twitter
Social Media in Australia: The Case of Twitter
 
Visual network analysis of Twitter data for co-organizing conferences: Case C...
Visual network analysis of Twitter data for co-organizing conferences: Case C...Visual network analysis of Twitter data for co-organizing conferences: Case C...
Visual network analysis of Twitter data for co-organizing conferences: Case C...
 
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...
 
HT2016: Influence of Frequency, Recency and Semantic Context on Tag Reuse
HT2016: Influence of Frequency, Recency and Semantic Context on Tag ReuseHT2016: Influence of Frequency, Recency and Semantic Context on Tag Reuse
HT2016: Influence of Frequency, Recency and Semantic Context on Tag Reuse
 
Using Twitter as a data source: An overview of ethical challenges
Using Twitter as a data source: An overview of ethical challengesUsing Twitter as a data source: An overview of ethical challenges
Using Twitter as a data source: An overview of ethical challenges
 
Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...
 
Conducting Twitter Reserch
Conducting Twitter ReserchConducting Twitter Reserch
Conducting Twitter Reserch
 
In Search of Australian Blogs: Determining the Extent of the Contemporary Aus...
In Search of Australian Blogs: Determining the Extent of the Contemporary Aus...In Search of Australian Blogs: Determining the Extent of the Contemporary Aus...
In Search of Australian Blogs: Determining the Extent of the Contemporary Aus...
 
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
 
Pikas casci talk 11262013 final
Pikas casci talk 11262013 finalPikas casci talk 11262013 final
Pikas casci talk 11262013 final
 
Search, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataSearch, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving Data
 
The Future of Semantics on the Web
The Future of Semantics on the WebThe Future of Semantics on the Web
The Future of Semantics on the Web
 
One Day in the Life of a National Twittersphere
One Day in the Life of a National TwittersphereOne Day in the Life of a National Twittersphere
One Day in the Life of a National Twittersphere
 
New Methodologies for Capturing and Working with Publicly Available Twitter Data
New Methodologies for Capturing and Working with Publicly Available Twitter DataNew Methodologies for Capturing and Working with Publicly Available Twitter Data
New Methodologies for Capturing and Working with Publicly Available Twitter Data
 
Disciplinary Differences in Twitter Scholarly Communication
Disciplinary Differences in Twitter Scholarly CommunicationDisciplinary Differences in Twitter Scholarly Communication
Disciplinary Differences in Twitter Scholarly Communication
 
Temporal Effects on Hashtag Reuse in Twitter
Temporal Effects on Hashtag Reuse in TwitterTemporal Effects on Hashtag Reuse in Twitter
Temporal Effects on Hashtag Reuse in Twitter
 
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...
 
Australia's Environmental Predictive Capability
Australia's Environmental Predictive CapabilityAustralia's Environmental Predictive Capability
Australia's Environmental Predictive Capability
 

Recently uploaded

c Starting with 5000/- for Savita Escorts Service 👩🏽‍❤️‍💋‍👨🏿 8923113531 ♢ Boo...
c Starting with 5000/- for Savita Escorts Service 👩🏽‍❤️‍💋‍👨🏿 8923113531 ♢ Boo...c Starting with 5000/- for Savita Escorts Service 👩🏽‍❤️‍💋‍👨🏿 8923113531 ♢ Boo...
c Starting with 5000/- for Savita Escorts Service 👩🏽‍❤️‍💋‍👨🏿 8923113531 ♢ Boo...gurkirankumar98700
 
Website research Powerpoint for Bauer magazine
Website research Powerpoint for Bauer magazineWebsite research Powerpoint for Bauer magazine
Website research Powerpoint for Bauer magazinesamuelcoulson30
 
Learn About the Rise of Instagram Pro in 2024
Learn About the Rise of Instagram Pro in 2024Learn About the Rise of Instagram Pro in 2024
Learn About the Rise of Instagram Pro in 2024Islam Fit
 
Call Girls In Andheri East Call 9167673311 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9167673311 Book Hot And Sexy GirlsCall Girls In Andheri East Call 9167673311 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9167673311 Book Hot And Sexy GirlsPooja Nehwal
 
Dubai Call Girls O528786472 Diabolic Call Girls In Dubai
Dubai Call Girls O528786472 Diabolic Call Girls In DubaiDubai Call Girls O528786472 Diabolic Call Girls In Dubai
Dubai Call Girls O528786472 Diabolic Call Girls In Dubaihf8803863
 
Top Call Girls In Telibagh ( Lucknow ) 🔝 8923113531 🔝 Cash Payment
Top Call Girls In Telibagh ( Lucknow  ) 🔝 8923113531 🔝  Cash PaymentTop Call Girls In Telibagh ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment
Top Call Girls In Telibagh ( Lucknow ) 🔝 8923113531 🔝 Cash Paymentanilsa9823
 
Online Social Shopping Motivation: A Preliminary Study
Online Social Shopping Motivation: A Preliminary StudyOnline Social Shopping Motivation: A Preliminary Study
Online Social Shopping Motivation: A Preliminary StudyAJHSSR Journal
 
Your LinkedIn Makeover: Sociocosmos Presence Package
Your LinkedIn Makeover: Sociocosmos Presence PackageYour LinkedIn Makeover: Sociocosmos Presence Package
Your LinkedIn Makeover: Sociocosmos Presence PackageSocioCosmos
 
Elite Class ➥8448380779▻ Call Girls In Nizammuddin Delhi NCR
Elite Class ➥8448380779▻ Call Girls In Nizammuddin Delhi NCRElite Class ➥8448380779▻ Call Girls In Nizammuddin Delhi NCR
Elite Class ➥8448380779▻ Call Girls In Nizammuddin Delhi NCRDelhi Call girls
 
SELECTING A SOCIAL MEDIA MARKETING COMPANY
SELECTING A SOCIAL MEDIA MARKETING COMPANYSELECTING A SOCIAL MEDIA MARKETING COMPANY
SELECTING A SOCIAL MEDIA MARKETING COMPANYdizinfo
 
GREAT OPORTUNITY Russian Call Girls Kirti Nagar 9711199012 Independent Escort...
GREAT OPORTUNITY Russian Call Girls Kirti Nagar 9711199012 Independent Escort...GREAT OPORTUNITY Russian Call Girls Kirti Nagar 9711199012 Independent Escort...
GREAT OPORTUNITY Russian Call Girls Kirti Nagar 9711199012 Independent Escort...Mona Rathore
 
Night 7k Call Girls Noida Sector 121 Call Me: 8448380779
Night 7k Call Girls Noida Sector 121 Call Me: 8448380779Night 7k Call Girls Noida Sector 121 Call Me: 8448380779
Night 7k Call Girls Noida Sector 121 Call Me: 8448380779Delhi Call girls
 
Call Girls In Gurgaon Dlf pHACE 2 Women Delhi ncr
Call Girls In Gurgaon Dlf pHACE 2 Women Delhi ncrCall Girls In Gurgaon Dlf pHACE 2 Women Delhi ncr
Call Girls In Gurgaon Dlf pHACE 2 Women Delhi ncrSapana Sha
 
"Ready to elevate your Instagram? Let's go
"Ready to elevate your Instagram? Let's go"Ready to elevate your Instagram? Let's go
"Ready to elevate your Instagram? Let's goSocioCosmos
 
Impact Of Educational Resources on Students' Academic Performance in Economic...
Impact Of Educational Resources on Students' Academic Performance in Economic...Impact Of Educational Resources on Students' Academic Performance in Economic...
Impact Of Educational Resources on Students' Academic Performance in Economic...AJHSSR Journal
 
IMPACT OF FISCAL POLICY AND MONETARY POLICY ON THE ECONOMIC GROWTH OF NIGERIA...
IMPACT OF FISCAL POLICY AND MONETARY POLICY ON THE ECONOMIC GROWTH OF NIGERIA...IMPACT OF FISCAL POLICY AND MONETARY POLICY ON THE ECONOMIC GROWTH OF NIGERIA...
IMPACT OF FISCAL POLICY AND MONETARY POLICY ON THE ECONOMIC GROWTH OF NIGERIA...AJHSSR Journal
 
Top Astrologer, Kala ilam specialist in USA and Bangali Amil baba in Saudi Ar...
Top Astrologer, Kala ilam specialist in USA and Bangali Amil baba in Saudi Ar...Top Astrologer, Kala ilam specialist in USA and Bangali Amil baba in Saudi Ar...
Top Astrologer, Kala ilam specialist in USA and Bangali Amil baba in Saudi Ar...baharayali
 

Recently uploaded (20)

c Starting with 5000/- for Savita Escorts Service 👩🏽‍❤️‍💋‍👨🏿 8923113531 ♢ Boo...
c Starting with 5000/- for Savita Escorts Service 👩🏽‍❤️‍💋‍👨🏿 8923113531 ♢ Boo...c Starting with 5000/- for Savita Escorts Service 👩🏽‍❤️‍💋‍👨🏿 8923113531 ♢ Boo...
c Starting with 5000/- for Savita Escorts Service 👩🏽‍❤️‍💋‍👨🏿 8923113531 ♢ Boo...
 
Website research Powerpoint for Bauer magazine
Website research Powerpoint for Bauer magazineWebsite research Powerpoint for Bauer magazine
Website research Powerpoint for Bauer magazine
 
Learn About the Rise of Instagram Pro in 2024
Learn About the Rise of Instagram Pro in 2024Learn About the Rise of Instagram Pro in 2024
Learn About the Rise of Instagram Pro in 2024
 
Call Girls In Andheri East Call 9167673311 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9167673311 Book Hot And Sexy GirlsCall Girls In Andheri East Call 9167673311 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9167673311 Book Hot And Sexy Girls
 
Bicycle Safety in Focus: Preventing Fatalities and Seeking Justice
Bicycle Safety in Focus: Preventing Fatalities and Seeking JusticeBicycle Safety in Focus: Preventing Fatalities and Seeking Justice
Bicycle Safety in Focus: Preventing Fatalities and Seeking Justice
 
Dubai Call Girls O528786472 Diabolic Call Girls In Dubai
Dubai Call Girls O528786472 Diabolic Call Girls In DubaiDubai Call Girls O528786472 Diabolic Call Girls In Dubai
Dubai Call Girls O528786472 Diabolic Call Girls In Dubai
 
Top Call Girls In Telibagh ( Lucknow ) 🔝 8923113531 🔝 Cash Payment
Top Call Girls In Telibagh ( Lucknow  ) 🔝 8923113531 🔝  Cash PaymentTop Call Girls In Telibagh ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment
Top Call Girls In Telibagh ( Lucknow ) 🔝 8923113531 🔝 Cash Payment
 
Online Social Shopping Motivation: A Preliminary Study
Online Social Shopping Motivation: A Preliminary StudyOnline Social Shopping Motivation: A Preliminary Study
Online Social Shopping Motivation: A Preliminary Study
 
Your LinkedIn Makeover: Sociocosmos Presence Package
Your LinkedIn Makeover: Sociocosmos Presence PackageYour LinkedIn Makeover: Sociocosmos Presence Package
Your LinkedIn Makeover: Sociocosmos Presence Package
 
Elite Class ➥8448380779▻ Call Girls In Nizammuddin Delhi NCR

A Topic Analysis Approach To Revealing Discussions On The Australian Twittersphere

  • 1. A TOPIC ANALYSIS APPROACH TO REVEALING DISCUSSIONS ON THE AUSTRALIAN TWITTERSPHERE Brenda Moon Queensland University of Technology
  • 2. Introduction This paper investigates techniques to identify the topics being discussed in one week of tweets from the Australian Twittersphere. Tweets were extracted from a comprehensive dataset which captures all tweets by 2.8m Australian accounts: the Tracking Infrastructure for Social Media Analysis (TrISMA) (Bruns, Burgess & Banks et al., 2016).
  • 3. Selected week: Sunday 2 August to Saturday 8 August 2015 • Thursday 6th August 2015 was used for One Day in the Life of a National Twittersphere (Axel Bruns and Brenda Moon, presented at Social Media and Society, London, 13 July 2016) • Same day used for initial development of topic modelling approach • Then extended to full week
  • 5. Data cleaning • Remove – retweets & multitweets (“rt”, “mt” or “via”) – URLs – dates, times, distances & weights – words shorter than 3 characters – ellipses (‘...’) • NLTK tokenisation using the Twitter tokenizer – Remove all @users and URLs – Lowercase • Convert – HTML entities to text – Hashtags to words (trim ‘#’ off hashtags) • NLTK lemmatisation • NLTK stopword removal
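The cleaning steps on this slide can be sketched roughly as follows. This is a minimal standard-library illustration, not the pipeline used in the study: the regular expressions and function names are assumptions, tweets carrying an “rt”, “mt” or “via” marker are assumed to be dropped entirely, and the NLTK steps (Twitter tokenisation, lemmatisation, stopword removal) as well as the removal of dates, times, distances and weights are left out.

```python
import html
import re

# Hypothetical patterns; the slides do not give the exact rules that were used.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
ELLIPSIS_RE = re.compile(r"\.{2,}|…")
WORD_RE = re.compile(r"[a-z0-9']+")
REPOST_MARKERS = {"rt", "mt", "via"}


def clean_tweet(text):
    """Return a list of cleaned tokens, or None if the tweet is dropped."""
    text = html.unescape(text)          # HTML entities to text
    text = URL_RE.sub(" ", text)        # remove URLs
    text = MENTION_RE.sub(" ", text)    # remove @users
    text = text.lower()
    # Assumption: any tweet containing a repost marker is discarded.
    if REPOST_MARKERS & set(WORD_RE.findall(text)):
        return None
    text = ELLIPSIS_RE.sub(" ", text)   # remove ellipses
    text = text.replace("#", "")        # hashtags become plain words
    # Keep only tokens of at least 3 characters.
    return [tok for tok in WORD_RE.findall(text) if len(tok) >= 3]


if __name__ == "__main__":
    print(clean_tweet("Watching the #Ashes &amp; loving it... http://t.co/x @bob"))
    print(clean_tweet("RT @abc: hello world"))
```

The second call prints `None` because the tweet carries a retweet marker; in a full pipeline the surviving token lists would then go through lemmatisation and stopword removal.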
  • 6. Hashtag pooling • Mehrotra, Sanner, Buntine & Xie (2013) looked at different options of ‘pooling’ tweets into documents before LDA analysis to see if this could increase accuracy. They found that hashtag pooling was effective (best was hashtag pooling with clustering, but more complex to apply) • Group all the tweets with hashtags into documents for each hashtag (some tweets will be added into more than one document) • Tweets without hashtags stay as individual documents
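A minimal sketch of the basic pooling rule described on this slide, assuming raw tweet strings as input. The function name and the hashtag pattern are illustrative, and the clustering variant that Mehrotra et al. found best is not attempted here.

```python
import re
from collections import defaultdict

# Hypothetical hashtag pattern, used only for this illustration.
HASHTAG_RE = re.compile(r"#(\w+)")


def pool_by_hashtag(tweets):
    """Group tweets into one document per hashtag.

    A tweet with several hashtags is added to every matching pool.
    Tweets without hashtags stay as single-tweet documents.
    Returns (pools, singles): pools maps hashtag -> list of tweets,
    singles is a list of one-tweet documents.
    """
    pools = defaultdict(list)
    singles = []
    for text in tweets:
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        if tags:
            for tag in tags:
                pools[tag].append(text)
        else:
            singles.append([text])
    return dict(pools), singles


if __name__ == "__main__":
    # Invented toy tweets, not taken from the dataset.
    toy = [
        "#ashes what a morning",
        "#Ashes #cricket unbelievable session",
        "nice weather today",
    ]
    pools, singles = pool_by_hashtag(toy)
    print(pools)
    print(singles)
```

The documents passed to LDA would then be the pooled lists plus the single-tweet documents, which is why the document count falls after pooling even though some tweets are duplicated across pools.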
  • 7. Corpus filtering (Thursday 6 August 2015) • Raw tweets: 963,064 • After data cleaning: 583,528 • After hashtag pooling: 516,263 – 23% of tweets had hashtags • Dictionary pruning – remove most frequent and least frequent terms – no_above=0.5 (fraction of documents), no_below=5 (documents) – 223,157 unique tokens reduced to 49,964 unique tokens
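To make the two pruning thresholds concrete, here is a small pure-Python sketch of document-frequency pruning. It follows the usual reading of `no_below` (an absolute number of documents) and `no_above` (a fraction of all documents); the function name is hypothetical and this is not the gensim implementation itself.

```python
from collections import Counter


def prune_vocabulary(docs, no_below=5, no_above=0.5):
    """Drop tokens that are too rare or too common.

    A token is kept if it appears in at least `no_below` documents
    and in no more than `no_above` (a fraction) of all documents.
    Returns (pruned_docs, kept_tokens).
    """
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each token once per document
    max_docs = no_above * len(docs)
    keep = {tok for tok, df in doc_freq.items() if no_below <= df <= max_docs}
    pruned = [[tok for tok in doc if tok in keep] for doc in docs]
    return pruned, keep


if __name__ == "__main__":
    toy = [["a", "b"], ["a", "c"], ["a", "b"], ["a", "d"]]
    pruned, keep = prune_vocabulary(toy, no_below=2, no_above=0.5)
    print(keep)
    print(pruned)
```

In the toy run, "a" is removed for appearing in every document and "c" and "d" for appearing in only one, leaving just "b".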
  • 8. Latent Dirichlet Allocation (LDA) • Gensim LDA (Lau & Baldwin, 2014) • LdaMulticore • Identify 30 topics • 100 passes
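The settings on this slide might translate into gensim calls along the following lines. Only `num_topics=30`, `passes=100` and the pruning thresholds come from the slides; the function name, the `workers` and `seed` arguments and the overall wiring are assumptions, and gensim must be installed for `train_lda` to run.

```python
# Settings taken from the slides; everything else below is an assumption.
LDA_SETTINGS = {"num_topics": 30, "passes": 100}


def train_lda(docs, no_below=5, no_above=0.5, workers=None, seed=0):
    """Fit a gensim LdaMulticore model on a list of token lists."""
    # Imported lazily so the settings can be inspected without gensim.
    from gensim.corpora import Dictionary
    from gensim.models import LdaMulticore

    dictionary = Dictionary(docs)
    # Dictionary pruning: drop very rare and very common tokens.
    dictionary.filter_extremes(no_below=no_below, no_above=no_above)
    corpus = [dictionary.doc2bow(doc) for doc in docs]
    model = LdaMulticore(
        corpus=corpus,
        id2word=dictionary,
        workers=workers,
        random_state=seed,
        **LDA_SETTINGS,
    )
    return model, dictionary, corpus
```

Because `LdaMulticore` uses worker processes, it is safest to call `train_lda` from inside an `if __name__ == "__main__":` block; `model.print_topics(num_topics=5, num_words=10)` then gives a quick look at a few of the fitted topics.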
  • 9. Thursday 6th August 2015 – overall terms https://github.com/bmabey/pyLDAvis
  • 10. Thursday 6th August 2015 Topic 2: Politics / coal / China / Queensland
  • 11. Thursday 6th August 2015 Topic 5: Cricket – The Ashes
  • 12. Thursday 6th August 2015 Topic 5: Cricket – The Ashes
  • 13. Thursday 6th August 2015 Topic 5: Cricket – The Ashes
  • 14. Thursday 6th August 2015 Topic 5: Cricket – The Ashes
  • 15. Thursday 6th August 2015 30 topics, With hashtag pooling. MH370
  • 16. Thursday 6th August 2015 30 topics, With hashtag pooling. Comparison to other study Pop? Teen culture? MH370
  • 17. 1.1m tweets from 147k, to 224k accounts • 294k nodes total, including non-Australians • 535k edges from 856k @mentions / RTs • Visualisation: Gephi, Force Atlas 2 • Colours: Gephi, modularity resolution 1.0 • Labels assigned through qualitative evaluation: Politics, Cricket, Teen Culture, Pop • From “One Day in the Life of a National Twittersphere” by Axel Bruns and Brenda Moon, presented at Social Media and Society, London, 13 July 2016.
  • 18. Further Outlook • Confirm initial topic labelling by looking at top tweets for each topic • Check whether the hashtag pooling has allowed non-hashtag tweet topics to still be visible • Use statistical coherence of model (U_Mass Coherence, C_V coherence) to tune LDA parameters • Model different numbers of topics (coarse/fine grain) • Relate topics per user back to our mention network graphs • Extend to the full week (or longer) • Compare to alternative approaches – Doc2Vec / Tensorflow / dynamic LDA etc
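As an illustration of the coherence-based tuning mentioned above, the sketch below computes a UMass-style score for one topic from document co-occurrence counts. It is a simplified, unnormalised version written for this note, with a hypothetical function name; library implementations may average over word pairs or differ in other details, so absolute values are not comparable.

```python
import math


def umass_coherence(top_words, docs, smoothing=1.0):
    """UMass-style coherence for one topic.

    top_words: the topic's top words, most probable first.
    docs: list of token lists (the corpus the model was trained on).
    Sums log((D(w_m, w_l) + smoothing) / D(w_l)) over every pair where
    w_l is ranked above w_m; D counts documents containing the words.
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_count(*words):
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            single = doc_count(top_words[l])
            if single == 0:
                raise ValueError(f"{top_words[l]!r} does not occur in docs")
            joint = doc_count(top_words[m], top_words[l])
            score += math.log((joint + smoothing) / single)
    return score


if __name__ == "__main__":
    toy = [["a", "b"], ["a", "b"], ["a"], ["c"]]
    print(umass_coherence(["a", "b"], toy))  # words that co-occur
    print(umass_coherence(["a", "c"], toy))  # words that never co-occur
```

Scores closer to zero indicate top words that tend to appear in the same documents, so comparing the average score across models with different numbers of topics is one way to guide parameter choice.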
  • 19. References • Blei, D. M. (2011). Introduction to probabilistic topic models. Communications of the ACM, 1–16. Retrieved from http://www.cs.princeton.edu/~blei/papers/Blei2011.pdf • Mehrotra, R., Sanner, S., Buntine, W., & Xie, L. (2013). Improving LDA Topic Models for Microblogs via Tweet Pooling and Automatic Labeling. Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, 889–892. http://doi.org/10.1145/2484028.2484166 • Lau, J. H., & Baldwin, T. (2014). An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation. • Puschmann, C., & Scheffler, T. (2016). Topic modeling for media and communication research: A short primer (HIIG Discussion Paper Series No. 2016–5). Retrieved from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2836478 • Sievert, C., & Shirley, K. (2014). LDAvis: A method for visualizing and interpreting topics. Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, 63–70. Retrieved from http://www.aclweb.org/anthology/W/W14/W14-3110