
A Primer on Text Mining for Business

Slides from the course on big data by C. Levallois of EMLYON Business School, intended for business students. See the online video that accompanies these slides. Contents: a definition of text mining, the main categories of tools available (such as topic categorization or sentiment analysis), and their use in business.

MK99 – Big Data 1 
Big data & cross-platform analytics 
MOOC lectures Pr. Clement Levallois

A primer on text mining for business
• Text mining: computational methods to find interesting information in texts
• Quasi-synonyms:
– natural language processing (abbreviated as NLP)
– computational linguistics (name of a scientific discipline)
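
As a very small illustration of "finding interesting information in texts", the sketch below counts the most frequent words in a piece of text after discarding common filler words. The stop-word list, the function name `top_words` and the sample review are all invented for this example; real tools use much larger word lists and more careful tokenization.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "is", "and", "of", "to", "in", "it", "was"}

def top_words(text, n=3):
    """Return the n most frequent words that are not stop words."""
    # Lowercase the text and keep runs of letters as "words".
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Invented sample text standing in for customer feedback.
reviews = "The delivery was late. Late delivery again, and the box was damaged."
print(top_words(reviews, 2))
```

Even this crude count hints at what the feedback is about (late deliveries), which is the kind of signal text mining tries to extract at scale.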

Text… what kinds?
• Books
• Tweets
• Product reviews on Amazon
• LinkedIn profiles
• The whole Wikipedia
• Free text answers in the results of a survey
• Tenders, contracts, laws, …
• Print and online media
• Archival material
• …

What can be done?
• Sentiment analysis
– Is the tone of this piece of text positive or negative?
• Topic modeling / topic detection
– What is the main theme of this 20-page booklet?
• Semantic disambiguation
– “Paris” is mentioned in this text. Is this Paris Hilton or Paris, France?
• Named Entity Recognition (NER)
– Automatically find the individuals, organizations and events named in the text, and the relations between them.
• Semantic enrichment
– If you search Google for “TV”, results for “television” will also show up
• Language detection
– “Ich spreche Deutsch” -> this sentence is written in German
• Automatic translation
– See Google Translate
• Summarizing
– Shortening a text while keeping its core message intact
• Spelling correction
– Well, that’s easy
• Topic classification
– Is this email spam or not?
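
To make the first item concrete, here is a minimal sketch of sentiment analysis based on counting words from two small lists. The word lists and the function name `sentiment` are assumptions made up for this example; real tools typically rely on trained statistical models rather than fixed word lists.

```python
import re

# Hypothetical mini-lexicons, invented for illustration only.
POSITIVE = {"good", "great", "excellent", "love", "happy"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "disappointing"}

def sentiment(text):
    """Label a text by comparing its counts of positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this phone, the screen is excellent"))
print(sentiment("Terrible battery and poor support"))
```

A word-counting approach like this ignores context: it cannot tell that "not good" is negative, and it has no way of noticing irony.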

Amaze me!
• Demo on sentiment analysis
With a tool by Stanford: http://nlp.stanford.edu:8080/sentiment/rntnDemo.html
• Demo on semantic disambiguation
With a tool built by a collaborative effort: http://dbpedia-spotlight.github.io/demo/
(click on “annotate”, and try replacing the text with one of your own)

What can’t be done yet (but is actively researched)
• Detection of irony
• Robust translation
• Reasoning beyond Q&A

What makes things harder
• Non-English texts
• Slang and colloquial forms of speech
• Real-time processing

Bloomerang Fundraising Week02.26.2024.pdfBloomerang Fundraising Week02.26.2024.pdf
Bloomerang Fundraising Week02.26.2024.pdf
 

A Primer on Text Mining for Business

  • 1. MK99 – Big Data 1 Big data & cross-platform analytics MOOC lectures Pr. Clement Levallois
  • 2. A primer on text mining for business
      • Text mining: computational methods to find interesting information in texts
      • Quasi-synonyms:
          – natural language processing (abbreviated as NLP)
          – computational linguistics (name of a scientific discipline)
  • 3. Text… what kinds?
      • Books
      • Tweets
      • Product reviews on Amazon
      • LinkedIn profiles
      • The whole Wikipedia
      • Free text answers in the results of a survey
      • Tenders, contracts, laws, …
      • Print and online media
      • Archival material
      • …
  • 4. What can be done?
      • Sentiment analysis
          – Is the tone of this piece of text positive or negative?
      • Topic modeling / topic detection
          – What is the main theme of this 20-page booklet?
      • Semantic disambiguation
          – “Paris” is mentioned in this text. Is this Paris Hilton or Paris, France?
      • Named Entity Recognition (NER)
          – Automatically find the individuals, organizations and events named in the text, and the relations between them.
      • Semantic enrichment
          – If you search Google for “TV”, results for “television” will also show up
      • Language detection
          – “Ich spreche Deutsch” -> this sentence is written in German
      • Automatic translation
          – See Google Translate
      • Summarizing
          – Shortening a text while keeping its core message intact
      • Spelling correction
          – Well, that’s easy
      • Topic classification
          – Is this email spam or not?
  • 5. Amaze me!
      • Demo on sentiment analysis
          – With a tool by Stanford: http://nlp.stanford.edu:8080/sentiment/rntnDemo.html
      • Demo on semantic disambiguation
          – With a tool built as a collaborative effort: http://dbpedia-spotlight.github.io/demo/ (click on “annotate”, and also replace the text with one of your own)
  • 6. What can’t be done yet (but is actively researched)
      • Detection of irony
      • Robust translation
      • Reasoning beyond Q&A
    What makes things harder
      • Non-English texts
      • Slang and colloquial forms of speech
      • Real-time processing
  • 7. Examples of routine operations when working with text (or, how to follow the most basic conversation in computational linguistics)
      • Stemming
          – “liked” and “like” are reduced to a common stem (for instance “lik”; the exact form depends on the stemmer used) to facilitate further operations
      • Lemmatizing
          – Grouping “liked”, “like” and “likes” to count them as one basic semantic unit
      • Part-of-Speech tagging (aka POS tagging)
          – Automatically detecting the grammatical function of the terms used in a sentence, to facilitate translation or other tasks
      • “Starting the text analysis with a bag-of-words model”
          – An operation which consists of just listing and counting all the different words in the text.
      • N-grams
          – The text “I am Dutch” is made of 3 words: I, am, Dutch. But it can also be interesting to look at bigrams in the text: “I am”, “am Dutch”. Or trigrams: “I am Dutch”.
          – When neighboring words are considered together, as we just did, they are called n-grams. This can reveal interesting things about frequent expressions used in the text.
          – A good example of how useful this can be: visit the Ngram Viewer by Google: https://books.google.com/ngrams
  • 8. Chief benefit: Getting to know individuals better
      • Without text mining, we have access to “external”, “cold” states of the individual
          – Behavior (e.g., clicks), external attributes (address, gender, encyclopedia entry), social networks (but relatively cold ones)
      • With text mining, we have access to “internal”, “hot” states:
          – opinions
          – intentions
          – preferences
          – degree of consensus
          – social networks (who mentions whom: how, in which context)
          – implicit attributes of the speaker
  • 9. How easy is it?
      • Too easy… the limit is legal and ethical, not technical
          – “Predicting the Political Alignment of Twitter Users” by Conover et al. (2011). http://cnets.indiana.edu/wp-content/uploads/conover_prediction_socialcom_pdfexpress_ok_version.pdf
          – “Political Tendency Identification in Twitter using Sentiment Analysis Techniques” by Pla and Hurtado (2014). http://anthology.aclweb.org/C/C14/C14-1019.pdf
          – “Private traits and attributes are predictable from digital records of human behavior” by Kosinski et al. (2013). http://www.pnas.org/content/110/15/5802.abstract
      • (and this gets even more powerful when mixing text mining, network analysis and machine learning)
  • 10. What use for text mining in a business context?
      1. Client facing
      2. Business management
      3. Business development
  • 11. 1. Market-facing activities
      • Refined scoring: propensity scores (including churn), scoring of prospects
      • Refined individualization of campaigns
          – ads, email campaigns, coupons, etc.
      • Better community management
          – Getting a clear and precise picture of how customers and prospects perceive, talk about, and engage with your brand / product / industry.
  • 12. 2. Business management
      • Organizational mapping
          – Getting a view of the organization through text flows.
          – Example: getting a view of the activity of a business school through a map of its scientific publications.
      • HRM
          – Finding talent in niche industries, based on the mining of candidates’ profiles
      • Marketing research
          – refined segmentation + targeting + positioning, measuring customer satisfaction, perceptual mapping.
  • 13. 3. Business development
      • Developing adjunct services
          – product recommendation systems (e.g., Amazon’s)
          – detection and matching of needs (e.g., detection of complaints / mood changes)
          – product enhancements (e.g., content enrichment through localization/personalization)
      • Developing new products entirely, based on
          – different search engines
          – alert systems / automated systems based on monitoring textual input
          – knowledge databases
          – new forms of content curation / high value info creation + delivery
  • 14. Interesting players through their “Data Services” package + many APIs listed on www.programmableweb.com
  • 15. This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com). Contact Clement Levallois (levallois [at] em-lyon.com) for more information.
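
Two of the routine operations listed on slide 7, the bag-of-words model and n-grams, are simple enough to sketch in a few lines of Python. This is a minimal illustration and not part of the original slides: the function names (`tokenize`, `bag_of_words`, `ngrams`) are made up for this example, and the tokenizer is deliberately naive (it lowercases the text and keeps only runs of ASCII letters and apostrophes), so it would need replacing for real or non-English text.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a naive, English-only rule)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """List and count all the different words in the text, ignoring their order."""
    return Counter(tokenize(text))


def ngrams(tokens, n):
    """Return every run of n neighboring tokens, joined into one string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("I am Dutch")
print(tokens)             # ['i', 'am', 'dutch']
print(ngrams(tokens, 2))  # ['i am', 'am dutch']
print(ngrams(tokens, 3))  # ['i am dutch']

# Word counts for a short, made-up review. Note that "like" and "liked" are
# counted separately: merging them is the job of stemming or lemmatizing.
print(bag_of_words("I like it. I liked it a lot."))
```

Counting n-grams instead of single words only requires passing the output of `ngrams` to `Counter`, which is one way to surface the frequent expressions mentioned on the slide.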