Text Mining in Economics:
Methods and Applications
Mgr. Petr Koráb, Ph.D.
Lentiamo, Prague
Zeppelin University, Friedrichshafen
Research Design
Zeppelin University, 18 Oct 2022
About the instructor
• 2022 – Visiting Researcher, Zeppelin University
• 2019 – Data Analyst, Lentiamo, Prague
• 2015 – 2018: Assistant Professor in Finance, Mendel University
• 2015 – 2018: Researcher, Research Centre, Mendel University
• 2014, 2015, 2017: Visiting Researcher, Zeppelin University
• 2018: Visiting Researcher, University of Missouri St. Louis
• 2014: Junior Fellow, WIFO, Vienna
• Recent papers:
Koráb, Mallek, Dibooglu, 2021. Effects of quantitative easing on firm performance in the euro area. The North American Journal
of Economics and Finance. vol. 57(C).
Koráb, Dibooglu, Fidrmuc. Trade-offs of dollarization: Meta-analysis evidence. R&R, Journal of International Money and Finance.
Outline
• Why we care about text data in economics
• Text mining and research trends in economics
• Text data representation
• Text data databases
• Data science software for text mining
• Text mining methods for applied research
• Examples of research at the Machine Learning & Text Mining in Economics research group at
Zeppelin University
Why do we care about text data in economics?
• Developments since the late 20th century, such as:
• general development of technologies to process big data
• availability of data from social networks (e.g. Twitter, YouTube)
• transcripts of politicians’ statements and central bankers’ meetings
• publicly available access to major text databases (Google Trends, RSS feeds, Wikipedia, Google Ngrams)
… have resulted in a vast amount of text data easily accessible to analysts, students, and researchers
• Digital economics incorporates many of the
recent advances related to modern computing
and digitization
Source: United Nations Conference
on Trade and Development, 2021
Text mining and research trends in economics
• In finance: text from financial news, social media, and
company filings is used to predict asset price
movements and study the causal impact of new
information.
• In macroeconomics: text is used to forecast variation in
inflation and unemployment, and estimate the effects
of policy uncertainty.
• In media economics: text from news and social media
is used to study the drivers and effects of political slant.
• In industrial organization and marketing: text from
advertisements and product reviews is used to study
the drivers of consumer decision making.
• In political economy: text from politicians’ speeches is
used to study the dynamics of political agendas and
debate
(Gentzkow et al., 2019. Text as Data.
Journal of Economic Literature)
Text mining in economics articles in JSTOR database
Source: Constellate.org
How computers work with text … data representation
Text means nothing more than
numbers to the computer!
PNG image → rank-2 tensor
Sentence → array of tokenised words
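The idea that a sentence becomes an array of tokens, and then numbers, can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; the function names `tokenize`, `build_vocabulary`, and `encode` are hypothetical helpers introduced here, not part of the original slides or of any particular library.

```python
import re


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def build_vocabulary(tokens):
    """Give each distinct token an integer id, in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Replace every token with its integer id."""
    return [vocab[token] for token in tokens]


sentence = "Text means nothing more than numbers to the computer"
tokens = tokenize(sentence)
vocab = build_vocabulary(tokens)
print(tokens)                 # the array of tokenised words
print(encode(tokens, vocab))  # the same sentence as a sequence of integers
```

Real pipelines usually rely on a library tokenizer and a fixed vocabulary, but the principle is the same: the model only ever sees the integers.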
Text data representation - example of computer generated text
Customers know the product, comment on it
and express their opinions.
A neural network predicts the next 10
words of a sentence
• Emotions
• Feelings
• Personal experience
• 6-layer neural network
• 16 GB RAM
• 3 minutes training time
Amazon customer reviews / Computer-generated text
Not the best of Cream, but it is good. Just not great. I think this is a good album thats all i gotta say.
Another great Jimmy Buffett album that's perfect for the backyard Barbeque. This was another brilliant album one of their best if not best.
What can you say about this album except that it's great.
The songs gives different emotions they're very creative on this album.
This cd is really very good the music just flows very smooth.
I love everything maroon has done their musicality is exceptional
fantastic lyrics.
Source: Koráb, 2021. Training Neural Networks
to Create Text Like a Human. Towards Data
Science (Medium). Available from here.
Text means nothing more than
numbers to the computer!!
Text databases for small research projects
• Google Trends - free data exploration tool that helps users better understand what people search for on
Google
• manual data collection is time-consuming
• automatic download requires programming knowledge (Python, R)
• Google Trends Datastore - contains pre-processed datasets from Google Trends
• Twitter - provides anonymised data from Twitter feeds
• multiple ways of data collection, see the article
• Central banks’ and government officials’ websites and repositories – ECB, Kremlin, Fed
• Kaggle - platform that allows users to participate in predictive modeling competitions and to
explore and publish data sets
Data science software (general overview)
• Academia
• STATA - research in social sciences and economics
• Julia - scientific computing (macro-modelling, machine learning, etc.)
• MATLAB - central banks, corporate research, risk modelling
• SPSS - market research agencies, sociology
• Python - data science, machine learning
• R - statistical analytics
• Companies
In short: Python and R are the
prevalent tools for text mining in
academia
Companies often use specialised
Business Intelligence software for
specific text mining purposes
Text mining methods: word cloud
• A word cloud is an image composed of words used in a particular
text or subject, in which the size of each word indicates its
frequency or importance
• It is a simple graph that is frequently used in business as well as in
academia
• Implementation:
o R: https://www.r-graph-gallery.com/wordcloud.html
o SPSS: https://github.com/IBMPredictiveAnalytics/Word_Cloud_Visualization
o Python: https://pypi.org/project/wordcloud/
o STATA: https://www.stata-journal.com/article.html?article=dm0094
o MATLAB: https://www.mathworks.com/help/matlab/ref/wordcloud.html
• Python coding example: here
Source: Kurt, F., 2020. Masking With WordCloud in Python: 500
Most Frequently Used Words in German. The Startup.
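The step behind every word cloud, counting words and scaling each word's size by its frequency, can be sketched with the Python standard library alone. This is a minimal illustration, not the code from the linked examples: the stopword list, the sample text, and the helper names `word_frequencies` and `relative_sizes` are assumptions made for this sketch. Plotting packages such as those listed above take care of the drawing itself.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "and", "of", "in", "is", "it", "to", "on", "this", "that"}


def word_frequencies(text, stopwords=STOPWORDS):
    """Count how often each non-stopword occurs in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in stopwords)


def relative_sizes(freqs, max_font=100):
    """Scale counts so that the most frequent word gets max_font."""
    if not freqs:
        return {}
    top = max(freqs.values())
    return {word: round(max_font * count / top) for word, count in freqs.items()}


text = "good album, great album, good music"  # made-up sample text
freqs = word_frequencies(text)
print(freqs.most_common(2))
print(relative_sizes(freqs))
```

Linear scaling is the simplest choice; actual word cloud tools may scale sizes differently and also handle layout, colours, and masks.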
Word cloud applications
• Primarily data summarization
• customer reviews
• opinion polls
• politicians’ statements
• monetary policy transcripts, etc.
Recommended literature:
• Martin Feldkircher, Paul Hofmarcher, Pierre Siklos. 2021. What’s the Message? Interpreting Monetary Policy
Through Central Bankers’ Speeches. SUERF Policy Brief, No 153.
• Petr Koráb, Jarko Fidrmuc, David Štrba. 2021. Guide to Using Word Clouds for Applied Research Design.
Towards Data Science. Dec 14, 2021.
Text mining methods: sentiment analysis
• Sentiment analysis is a large field in natural language
processing (NLP) that uses techniques to identify, extract
and quantify emotions from text data
• In companies, it helps to understand customer feedback,
evaluate social media conversations, prioritize
communication with customers in customer care
departments, etc.
• Both Python and R provide a large variety of sentiment
classifiers and libraries
• Python coding example: here
Recommended reading: Koráb, 2021. The Most Favorable Pre-trained
Sentiment Classifiers in Python, Towards Data Science (Medium).
Available from here.
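To make "quantifying emotions" concrete, here is a toy lexicon-based scorer. It is a deliberately simplified stand-in, not one of the pre-trained classifiers discussed in the recommended reading: the two word lists and the function name `sentiment_score` are made up for this sketch, and negation ("not good") is ignored.

```python
import re

# Tiny hand-made lexicons (assumptions for this sketch, not a real resource).
POSITIVE = {"good", "great", "brilliant", "love", "fantastic", "best"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "worst", "boring"}


def sentiment_score(text):
    """Return a score in [-1, 1]: share of positive minus negative hits."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE for word in words)
    negative_hits = sum(word in NEGATIVE for word in words)
    total = positive_hits + negative_hits
    if total == 0:
        return 0.0  # no sentiment-bearing words found
    return (positive_hits - negative_hits) / total


reviews = [
    "What can you say about this album except that it's great.",
    "A boring and terrible record",
]
for review in reviews:
    print(round(sentiment_score(review), 2), review)
```

Pre-trained classifiers typically use much larger lexicons or trained models, but they return a comparable numeric polarity that can then enter an econometric analysis as a variable.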
Research example I: animated word cloud in economics
• Word clouds are commonly presented as static
plots without a time dimension
• We create an MP4 video with keywords
constructed from the titles of research articles in 5
leading economic journals, 1900 – now.
• A single video maps the modern era of economic
science
• Our contribution:
• First application of dynamic word clouds in
economics (to the best of our knowledge)
• Example of:
• Quantitative historical research method
• Data story-telling in economics
Source: Koráb, Štrba, Fidrmuc 2021. Animated Word Clouds: A Novel
Way for the Visualization of Word Frequencies. Python in Plain
English. Available from here.
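The data step behind an animated word cloud can be sketched as keyword counts per time period, with one set of counts per video frame. This is a rough illustration only: the titles below are invented, grouping by decade is an assumption, and the helper name `keyword_counts_by_decade` is hypothetical. It is not the procedure from the cited article.

```python
import re
from collections import Counter, defaultdict


def keyword_counts_by_decade(records, stopwords):
    """Group (year, title) pairs by decade and count title keywords."""
    counts = defaultdict(Counter)
    for year, title in records:
        decade = (year // 10) * 10
        words = re.findall(r"[a-z]+", title.lower())
        counts[decade].update(word for word in words if word not in stopwords)
    return dict(counts)


# Invented article titles, for illustration only.
records = [
    (1931, "A theory of money and trade"),
    (1936, "Money and employment"),
    (2015, "Text data and monetary policy"),
    (2019, "Policy uncertainty and text mining"),
]
stopwords = {"a", "of", "and", "the"}

frames = keyword_counts_by_decade(records, stopwords)
for decade in sorted(frames):
    # Each decade's counts would feed one frame of the animation.
    print(decade, frames[decade].most_common(3))
```

Rendering each period's counts as a word cloud and stitching the images into a video would then give the time dimension that static plots lack.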
Results of unpublished research
Conclusions
• Text mining methods have been frequently used in economics over the last 15-20
years due to a massive increase in the volume of text data
• Modern computing methods and algorithms make text mining accessible
to anyone
• Computers understand language as nothing more than a sequence of numbers
• Simple text mining methods are implemented in all standard programs and can
be used for small projects and theses
• Text mining is a great area to focus on in your career in the private sector or
academia
Thank you for your attention
Happy to answer your questions
Feel free to contact me: xpetrkorab@gmail.com
My blog (text mining, applied ML): https://petrkorab.medium.com
