1. Text Mining in Economics:
Methods and Applications
Mgr. Petr Koráb, Ph.D.
Lentiamo, Prague
Zeppelin University, Friedrichshafen
Research Design
Zeppelin University, 18 Oct 2022
2. About the instructor
• 2022 – Visiting Researcher, Zeppelin University
• 2019 – Data Analyst, Lentiamo, Prague
• 2015 – 2018: Assistant Professor in Finance, Mendel University
• 2015 – 2018: Researcher, Research Centre, Mendel University
• 2014, 2015, 2017: Visiting Researcher, Zeppelin University
• 2018: Visiting Researcher, University of Missouri St. Louis
• 2014: Junior Fellow, WIFO, Vienna
• Recent papers:
Koráb, Mallek, Dibooglu, 2021. Effects of quantitative easing on firm performance in the euro area. The North American Journal
of Economics and Finance. vol. 57(C).
Koráb, Dibooglu, Fidrmuc. Trade-offs of dollarization: Meta-analysis evidence. R&R, Journal of International Money and Finance.
3. Outline
• Why we care about text data in economics
• Text mining and research trends in economics
• Text data representation
• Text data databases
• Data science software for text mining
• Text mining methods for applied research
• Examples of research at the Machine Learning & Text Mining in Economics research group at
Zeppelin University
4. Why do we care about text data in economics?
• Developments since the late 20th century, such as:
• general development of technologies to process big data
• availability of data from social networks (e.g. Twitter, YouTube)
• transcripts of politicians’ statements and central bankers’ meetings
• publicly available access to major text databases (Google Trends, RSS feeds, Wikipedia, Google Ngrams)
… have resulted in a vast amount of text data easily accessible to analysts, students, and researchers
• Digital economics incorporates many of the
recent advances related to modern computing
and digitization
Source: United Nations Conference
on Trade and Development, 2021
5. Text mining and research trends in economics
• In finance: text from financial news, social media, and
company filings is used to predict asset price
movements and study the causal impact of new
information.
• In macroeconomics: text is used to forecast variation in
inflation and unemployment, and estimate the effects
of policy uncertainty.
• In media economics: text from news and social media
is used to study the drivers and effects of political slant.
• In industrial organization and marketing: text from
advertisements and product reviews is used to study
the drivers of consumer decision making.
• In political economy: text from politicians’ speeches is
used to study the dynamics of political agendas and
debate
(Gentzkow, et al., 2019. Text as Data,
Journal of Economic Literature)
Text mining in economics articles in JSTOR database
Source: Constellate.org
6. How computers work with text … data representation
Text means nothing more than
numbers to the computer!
Figure: PNG image → rank 2 tensor; sentence → array of tokenised words
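The idea on this slide, that a sentence becomes an array of tokenised words and then numbers, can be sketched in a few lines of Python. This is a minimal illustration only: the regular-expression tokeniser and the integer vocabulary built in order of first appearance are assumptions made for the example, not the representation used in the lecture.

```python
import re

def tokenise(sentence):
    """Lower-case the sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

def encode(tokens):
    """Map each distinct token to an integer id, in order of first appearance."""
    vocab = {}
    ids = []
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)  # next unused integer id
        ids.append(vocab[token])
    return ids, vocab

sentence = "Text means nothing more than numbers to the computer"
tokens = tokenise(sentence)   # array of tokenised words
ids, vocab = encode(tokens)   # the same sentence as a sequence of integers
print(tokens)
print(ids)
```

A repeated word receives the same id each time, so the integer sequence preserves word order and word identity, which is all that a model built on top of it sees.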
7. Text data representation - example of computer-generated text
Customers know the product, comment on it
and express their opinions.
A neural network predicts the next 10
words of a sentence
• Emotions
• Feelings
• Personal experience
• 6-layer neural network
• 16 GB RAM
• 3 minutes training time
Amazon customer reviews / Computer-generated text
Not the best of Cream, but it is good. Just not great. I think this is a good album thats all i gotta say.
Another great Jimmy Buffett album that's perfect for the backyard Barbeque. This was another brilliant album one of their best if not best.
What can you say about this album except that it's great.
The songs gives different emotions they're very creative on this album.
This cd is really very good the music just flows very smooth.
I love everything maroon has done their musicality is exceptional
fantastic lyrics.
Source: Koráb, 2021. Training Neural Networks
to Create Text Like a Human. Towards Data
Science (Medium). Available from here.
Text means nothing more than
numbers to the computer!!
8. Text databases for small research projects
• Google Trends - a free data exploration tool that lets you better understand what people search for on
Google
• manual data collection is time-consuming
• automatic download requires programming knowledge (Python, R)
• Google Trends Datastore - contains pre-processed datasets from Google Trends
• Twitter - provides anonymised data from Twitter feeds
• multiple ways of data collection, see the article
• Central banks’ and government officials’ websites and repositories – ECB, Kremlin, FED
• Kaggle - platform that allows users to participate in predictive modeling competitions and to
explore and publish data sets
9. Data science software (general overview)
• Academia
• STATA - research in social sciences and economics
• Julia - scientific computing (macro-modelling, machine learning, ..)
• MATLAB - central banks, corporate research, risk modelling
• SPSS - market research agencies, sociology
• Python - data science, machine learning
• R - statistical analytics
• Companies
In short: Python and R are the
prevalent tools for text mining in
academia
Companies often use specialised
Business Intelligence software for
specific text mining purposes
10. Text mining methods: word cloud
• A word cloud is an image composed of the words used in a particular
text or subject, in which the size of each word indicates its
frequency or importance
• This simple graph is frequently used in business as well as in
academia
• Implementation:
o R: https://www.r-graph-gallery.com/wordcloud.html
o SPSS: https://github.com/IBMPredictiveAnalytics/Word_Cloud_Visualization
o Python: https://pypi.org/project/wordcloud/
o STATA: https://www.stata-journal.com/article.html?article=dm0094
o MATLAB: https://www.mathworks.com/help/matlab/ref/wordcloud.html
• Python coding example: here
Source: Kurt, F., 2020. Masking With WordCloud in Python: 500
Most Frequently Used Words in German. The Startup.
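Since the size of each word in a word cloud reflects its frequency, the underlying computation is a word count. The sketch below uses only the Python standard library; the short stop-word list is an illustrative assumption, and the sample text reuses two review sentences from slide 7.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption; real projects use a fuller list).
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "it", "on", "this", "that"}

def word_frequencies(text):
    """Count how often each non-stop word occurs in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

text = (
    "What can you say about this album except that it's great. "
    "This was another brilliant album one of their best if not best."
)
freqs = word_frequencies(text)
print(freqs.most_common(3))  # the most frequent words would be drawn largest
```

The resulting counts are the input a plotting library needs. For instance, the `wordcloud` package listed above can, as far as I know, draw an image from such a dictionary via `WordCloud().generate_from_frequencies(freqs)`.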
11. Word cloud applications
• Primarily data summarization
• customer reviews
• opinion polls
• politicians’ statements
• monetary policy transcripts, etc.
Recommended literature:
• Martin Feldkircher, Paul Hofmarcher, Pierre Siklos. 2021. What’s the Message? Interpreting Monetary Policy
Through Central Bankers’ Speeches. SUERF Policy Brief, No 153.
• Petr Koráb, Jarko Fidrmuc, David Štrba. 2021. Guide to Using Word Clouds for Applied Research Design.
Towards Data Science. Dec 14, 2021.
12. Text mining methods: sentiment analysis
• Sentiment analysis is a large field in natural language
processing (NLP) that uses techniques to identify, extract
and quantify emotions from text data
• In companies, it helps to understand customer feedback,
evaluate social media conversations, prioritize
communication with customers in customer care
departments, etc.
• Both Python and R provide a large variety of sentiment
classifiers and libraries
• Python coding example: here
Recommended reading: Koráb, 2021. The Most Favorable Pre-trained
Sentiment Classifiers in Python, Towards Data Science (Medium).
Available from here.
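To make the idea of quantifying emotions concrete, here is a minimal lexicon-based scorer. It is a toy sketch under stated assumptions: the lexicon and its scores are hand-made for illustration, and the pre-trained classifiers referred to above are considerably more sophisticated.

```python
import re

# Tiny hand-made lexicon with hypothetical scores, for illustration only.
LEXICON = {
    "good": 1, "great": 2, "brilliant": 2, "love": 2,
    "bad": -1, "poor": -1, "terrible": -2,
}

def sentiment_score(text):
    """Return the mean lexicon score of the matched words (0.0 if none match)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

def label(score):
    """Turn a numeric score into a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

review = "This was another brilliant album"
print(label(sentiment_score(review)))
```

A scorer this simple ignores negation: a phrase such as "just not great" still counts "great" as positive, which is one reason applied work relies on trained classifiers.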
13. Research example I: animated word cloud in economics
• Word clouds are commonly presented as static
plots without a time dimension
• We create an MP4 video with keywords
constructed from titles of research articles in 5
leading economic journals, 1900 – now.
• A single video maps the modern era of economic
science
• Our contribution:
• First application of dynamic word clouds in
economics (to the best of our knowledge)
• Example of:
• Quantitative historical research method
• Data story-telling in economics
Source: Koráb, Štrba, Fidrmuc 2021. Animated Word Clouds: A Novel
Way for the Visualization of Word Frequencies. Python in Plain
English. Available from here.
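Adding a time dimension means computing keyword frequencies separately for each period, so that each period can become one frame of the animation. The sketch below groups titles by decade; the article records are made up for illustration and the grouping by decade is an assumption, not a description of the pipeline used in the cited article.

```python
import re
from collections import Counter, defaultdict

# Hypothetical (year, title) records, invented purely for illustration.
ARTICLES = [
    (1936, "Money and the theory of employment"),
    (1938, "Employment and interest"),
    (2009, "Monetary policy and financial crisis"),
    (2012, "Financial crisis and bank lending"),
]
STOP_WORDS = {"and", "the", "of"}

def keywords_by_decade(articles):
    """Return a dict mapping each decade to a Counter of title keywords."""
    counts = defaultdict(Counter)
    for year, title in articles:
        decade = year // 10 * 10  # e.g. 1936 -> 1930
        tokens = re.findall(r"[a-z]+", title.lower())
        counts[decade].update(t for t in tokens if t not in STOP_WORDS)
    return dict(counts)

frames = keywords_by_decade(ARTICLES)
for decade in sorted(frames):
    # Each decade's counts would feed one frame of the animation.
    print(decade, frames[decade].most_common(2))
```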
15. Conclusions
• Text mining methods have been frequently used in economics over the last 15-20
years due to a massive increase in the volume of text data
• Modern computing methods and algorithms make text mining accessible
to anyone
• Computers do not understand language as anything more than a sequence of numbers
• Simple text mining methods are implemented in all standard programs and can
be used for small projects and theses
• Text mining is a great area to focus on in your career in the private sector or
academia
16. Thank you for your attention
Happy to answer your questions
Feel free to contact me: xpetrkorab@gmail.com
My blog (text mining, applied ML): https://petrkorab.medium.com