A Topic Model of Analytics Job Adverts (The Operational Research Society 55th Annual Conference)
Agenda
Problem Summary
• Confusion about precise definition of analytics
• Benefit of ‘practical’ definitions
• Issues with the conventional ‘practical’ model of analytics
Model Details
• Data source: ‘analytics’ job adverts
• Topic modeling & Latent Dirichlet Allocation
• Model build & data pre-processing
Implications
• Model analysis
• An alternative definition of analytics
• Implications for OR/MS
What is Analytics?
Analytics is …
… delivering the right decision support to the right people at the right time.
Laursen & Thorlund, 2010, p XII
… the scientific process of transforming data into insight for making better decisions
INFORMS
… [the] technologies, systems, practices, & applications to analyze critical business data so as to gain new insights
Lim et al, 2012
… the extensive use of data, statistical & quantitative analysis, explanatory & predictive models, & fact-based management to drive decisions & actions.
Davenport & Harris, 2007, p 7
… an outgrowth of what is known as business intelligence […] Today’s expansive, global enterprises generate a deluge of data that is impossible for a human to make sense of.
Varshney & Mojsilovic, 2011
Analytics with a capital "A" is an umbrella term that represents our industry at a macro level, and analytics with a small "a" refers to technology used to analyze data.
Eckerson, 2011
… information-intensive concepts and methods to improve business decision making.
Chiang et al, 2012
… is the process of obtaining an optimal and realistic decision based on existing data
Hamel, 2011
… data analysis that changes the behavior of the organization
Hackathorn, 2010
… the science of analysis
Wikipedia
… the method of logical analysis
Merriam-Webster
… the brains to cloud computing’s brawn
Croll, 2011
… the process of transforming data, from a variety of sources and of a variety of types, into insights that support, improve and/or automate business decisions, using technological, quantitative and presentation techniques
Mortenson et al, 2013
… a group of approaches, organizational procedures and tools used in combination with one another to gain information, analyze that information, and predict outcomes of problem solutions
Trkman et al, 2010
… the use of data, information technology, statistical analysis, quantitative methods, and mathematical or computer-based models to help managers gain improved insight about their business operations and make better, fact-based decisions
Evans, 2012
• Many contrasting and often contradictory definitions
• Particularly difficult to distinguish analytics from business intelligence or similar fields
• Does it matter?
 Potential confusion
 As analytics is multi-disciplinary it is important that a common language can be established
 Important so that the growing job market can be met with the appropriate training
Analytics: Practical Definition
Source: Blackett, 2012
Advantages
• Focuses on application & generation of value
• Demonstrates the disciplines informing analytics
Issues
• Some methods suggest different purposes
• Suggesting progression to prescriptive as advanced may not always hold
Job Adverts
• Analyse “analytics” job adverts – following the tradition of ‘ASP’ studies (e.g. Liberatore and Luo, 2012)
• Instead of studying a smaller pool of jobs, we access adverts through the LinkedIn API
 Over 250k jobs online
 77% of all jobs are posted on LinkedIn (Dougherty, 2012)
• Scripted using Python & stored in MongoDB
 OAuth, SimpleJSON, & PyMongo
• Need to reduce and generalise results from >6,800 adverts with >50,000 unique words.
Topic Models
• Topic models assume documents to be a collection of latent topics. The topics determine which words are used
• Probabilistic models that determine the topics by analysis of the co-occurrence of the words used
• The most common are Probabilistic Latent Semantic Indexing (pLSI) and Latent Dirichlet Allocation (LDA)
Latent Dirichlet Allocation (LDA)
• Basic conception is that a collection of documents has three layers and contains:
[Figure: plate diagram adapted from Blei et al, 2003. Plates: Documents (M), Words (N). Nodes: Alpha Parameter (α), Topic Distribution (θ), Topics (Z), Words (W), Beta Parameter (β).]
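The layered structure in the diagram can be read as a recipe for generating documents. The sketch below is illustrative only and is not the authors' code: it assumes a symmetric α, a fixed number of words per document, and a hand-written β given as one word distribution per topic. The names `sample_dirichlet`, `generate_corpus` and `toy_beta` are hypothetical.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one sample from a Dirichlet distribution via normalised gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(num_docs, words_per_doc, beta, alpha, rng):
    """Generate documents following the layers of the plate diagram.

    beta is a list with one dict per topic, mapping word -> probability.
    alpha is a single symmetric Dirichlet parameter (an assumption here).
    """
    num_topics = len(beta)
    corpus = []
    for _ in range(num_docs):                      # outer plate: documents (M)
        theta = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(words_per_doc):             # inner plate: words (N)
            z = rng.choices(range(num_topics), weights=theta)[0]   # topic Z
            vocab = list(beta[z])
            w = rng.choices(vocab, weights=[beta[z][v] for v in vocab])[0]  # word W
            words.append(w)
        corpus.append((theta, words))
    return corpus


# Toy beta with two made-up topics; the words are illustrative only.
toy_beta = [
    {"reporting": 0.5, "strategy": 0.5},
    {"oracle": 0.5, "sap": 0.5},
]
corpus = generate_corpus(num_docs=3, words_per_doc=5, beta=toy_beta,
                         alpha=0.5, rng=random.Random(0))
```

Fitting a topic model runs this recipe in reverse: only the words are observed, and θ, Z and β have to be inferred.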
Latent Dirichlet Allocation - Process
• Model is built by:
1. Estimate topics as a product of observed words
2. Use these to estimate document topic proportions
3. Evaluate the corpus based on the distributions suggested in (1) & (2)
4. Use (3) to improve topic estimations (1)
5. Reiterate until best fit found
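The estimate, evaluate and re-estimate loop above can be illustrated with a small collapsed Gibbs sampler. This is a swapped-in technique chosen because it fits in a few lines of plain Python; it is not necessarily the estimation procedure used in this research or in the packages named elsewhere in the slides. The function name `fit_lda_gibbs`, the hyperparameter defaults and the toy documents are all assumptions.

```python
import random


def fit_lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=100, seed=0):
    """Fit a toy LDA model by collapsed Gibbs sampling (illustrative sketch)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # counts per document
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # counts per topic
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic for every word occurrence.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    # Repeatedly resample each assignment given all the others.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed document-topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word, vocab


toy_docs = [
    ["reporting", "strategy", "planning", "reporting"],
    ["oracle", "sap", "integration", "oracle"],
    ["strategy", "planning", "reporting"],
    ["sap", "integration", "architecture"],
]
theta, topic_word, vocab = fit_lda_gibbs(toy_docs, num_topics=2)
```

Each sweep uses the current topic counts to re-assign words and the new assignments to update the counts, which mirrors steps (1) to (5) in spirit. A real corpus would need many more iterations and a convergence check.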
Latent Dirichlet Allocation - Assumptions
• Bag-of-words / exchangeability
• The number of topics is known and pre-determined (K)
 Cross-validation to identify K with the lowest perplexity
• Topic independence
 As α is a parameter of a Dirichlet prior, each topic is assumed to be independent and not correlated
 In this research correlation between topics has to be assumed.
 Alternative is the correlated topic model (Blei & Lafferty, 2007), which uses a logistic normal rather than a Dirichlet distribution
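Perplexity is the exponential of the negative average log-likelihood per held-out word, so lower values indicate a better fit. The sketch below shows how candidate values of K might be compared on held-out documents. It assumes the document-topic proportions for the held-out documents are already available and that every held-out word has non-zero probability; the toy models and the name `perplexity` are hypothetical.

```python
import math


def perplexity(held_out_docs, doc_topic, topic_word):
    """exp(-average log-likelihood per word) on held-out documents.

    doc_topic[d][k] is the proportion of topic k in document d.
    topic_word[k] is a dict mapping word -> probability under topic k.
    """
    log_likelihood = 0.0
    token_count = 0
    for doc, theta in zip(held_out_docs, doc_topic):
        for word in doc:
            p = sum(t * phi.get(word, 0.0) for t, phi in zip(theta, topic_word))
            log_likelihood += math.log(p)   # assumes p > 0 (smoothed model)
            token_count += 1
    return math.exp(-log_likelihood / token_count)


held_out = [["data", "data", "model"], ["client", "client", "strategy"]]

# Hand-written candidate models keyed by K; values are made up for illustration.
candidates = {
    1: ([[1.0], [1.0]],
        [{"data": 0.25, "model": 0.25, "client": 0.25, "strategy": 0.25}]),
    2: ([[0.9, 0.1], [0.1, 0.9]],
        [{"data": 0.45, "model": 0.45, "client": 0.05, "strategy": 0.05},
         {"data": 0.05, "model": 0.05, "client": 0.45, "strategy": 0.45}]),
}
scores = {k: perplexity(held_out, dt, tw) for k, (dt, tw) in candidates.items()}
best_k = min(scores, key=scores.get)
```

In a cross-validation setting the same comparison would be repeated over several train and held-out splits and the scores averaged for each K.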
Data Pre-Processing & Model Build
• Strip HTML / XML
• Remove stop words, numbers and punctuation
• Remove words < 3 characters
• Remove most and least frequent words
 Python: HTMLParser, GenSim and String
 R: TM and TopicModels
• To stem or not to stem?
 "the job involves managing analytics projects"
 "the job involves the management of analytical projects"
 "has experience running projects using management science and analytics"
 "managing a team of scientists analysing the experience of runners"
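The cleaning steps listed above can be sketched with the Python standard library alone. This is a minimal illustration, not the pipeline used in the research: the stop-word list is a tiny made-up sample, the frequency thresholds are arbitrary, and the helper names (`strip_markup`, `tokenize`, `prune_by_document_frequency`) are hypothetical. Stemming is left out, so "managing" and "management" stay distinct.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"the", "and", "for", "with", "has", "using"}


class _TextExtractor(HTMLParser):
    """Collect text nodes and ignore tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_markup(raw):
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return " ".join(parser.chunks)


def tokenize(raw):
    """Strip markup, lowercase, drop numbers, punctuation, short and stop words."""
    text = strip_markup(raw).lower()
    tokens = re.findall(r"[a-z]+", text)   # keeps alphabetic runs only
    return [t for t in tokens if len(t) >= 3 and t not in STOP_WORDS]


def prune_by_document_frequency(docs, min_docs=2, max_fraction=0.9):
    """Remove words that appear in too few or too many documents."""
    doc_freq = Counter(t for doc in docs for t in set(doc))
    limit = max_fraction * len(docs)
    keep = {t for t, n in doc_freq.items() if min_docs <= n <= limit}
    return [[t for t in doc if t in keep] for doc in docs]


adverts = [
    "<p>the job involves managing analytics projects</p>",
    "<p>the job involves the management of analytical projects</p>",
    "<p>has experience running projects using management science and analytics</p>",
    "<p>managing a team of scientists analysing the experience of runners</p>",
]
tokenized = [tokenize(a) for a in adverts]
pruned = prune_by_document_frequency(tokenized)
```

Running the four example sentences through `tokenize` without stemming keeps "analytics", "analytical" and "analysing" as separate terms, which is the trade-off the stemming question raises.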
Topic Results
• 30 topics identified
• All topics are created equally but some are more topical
than others
[Chart: Most Likely Topic per Document as % of Corpus]
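The chart's measure can be computed by taking, for each document, the topic with the highest proportion and then counting how often each topic wins. The sketch below assumes a matrix of document-topic proportions is available; the function name `dominant_topic_shares` and the toy numbers are hypothetical and are not the study's data.

```python
from collections import Counter


def dominant_topic_shares(doc_topic):
    """Share of documents for which each topic is the most likely one."""
    winners = [max(range(len(row)), key=row.__getitem__) for row in doc_topic]
    counts = Counter(winners)
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {k: counts[k] / len(doc_topic) for k in ordered}


# Toy proportions for four documents and three topics (made-up values).
toy_doc_topic = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.5, 0.4, 0.1],
]
shares = dominant_topic_shares(toy_doc_topic)
```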
Most Likely Terms in Topics
• Analysis of the 3rd, 4th & 5th most likely topics
Topic 3 (4th): Digital & Web (8%)
other media
across working
understanding analysis
social projects
responsible required
ensure within
design key
performance digital
company manager
products their
lead tools
role services
Topic 13 (3rd): Consultancy (17%)
working market
develop project
software process
media reporting
key through
requirements solutions
manager excellent
your strategy
multiple more
service opportunity
manage well
opportunities clients
Topic 9 (5th): Technical (7%)
risk systems
design solutions
services other
tools technical
teams related
provide required
position degree
such operations
global skills
project opportunity
clients service
excellent products
Most Likely Terms in Topics (cont.)
• Analysis of the top two most likely topics
Topic 20 (1st): Strategic (41%)
reporting analysis
media required
strategy related
strategic manager
company degree
risk online
products across
drive must
manage responsible
well financial
planning industry
lead software
Topic 21 (2nd): Computing (20%)
services solutions
technology clients
digital consulting
your more
implementation management
oracle technical
capabilities design
provide advisory
strategy integration
technologies sap
career enterprise
solution architecture
Model Analysis
• Main five topics:
 Technical
 Digital/Web
 Consultancy
 Computing
 Strategic
• ‘Digital/Web’ is a specialism within analytics (also ‘Financial’)
• ‘Technical’ & ‘Consultancy’ are specific job types or environments
 However, some technical (‘hard’) skills & some consulting-type (‘soft’) skills
are likely to be required in all analytics jobs
• ‘Computing’ & ‘Strategic’?
The Analytics of Computing?
[Figure: Discovery Analytics. Axes: Basic Analytics Capability to Advanced Analytics Capability; Soft to Hard. Labels shown: Data Warehouses; Big Data Architecture; Stock Market Analysis; Algorithmic Trading; Fraud Investigation; Automatic Fraud Detection; Customer Segmentation; Propensity Modeling; Clickstream Analysis; Behavioural Targeting; Qualitative Text Analysis; Natural Language Processing; Reports & Dashboards; Advanced Visualisation.]
The Analytics of Strategy?
[Figure: Decision Analytics. Axes: Basic Analytics Capability to Advanced Analytics Capability; Soft to Hard. Labels shown: Trial & Error; Experimentation; Optimisation Simulation; Basic Forecasting; ARIMA Time Series; Performance Metrics; Data Envelopment Analysis; A/B Testing; Multivariate Testing; Business Analysis; Business Process Optimisation; Requirements Gathering; Problem Structuring.]
An Alternative Definition of Analytics
Descriptive Analytics: Statistical and data modeling techniques designed to describe past events and answer “what happened?”
Predictive Analytics: Data mining and machine learning techniques used to predict future events and answer “what will happen next?”
Prescriptive Analytics: OR/MS, advanced statistical and mathematical models used to prescribe future actions and answer “what should we do next?”
An Alternative Definition of Analytics
Discovery Analytics: Technological; Lower Risk Decisions
Decision Analytics: Strategic; Higher Risk Decisions
[Figure: labels shown in the diagram: Advanced Discovery Analytics; Reporting & alerts; Market research; Information systems; Basic historical analysis; Performance metrics; Stakeholder consultation; Advanced visualisation; Real time insights; Automated decisions; Advanced Decision Analytics; Advanced modelling; Problem structuring; Decision analysis.]
Summary & Implications for OR/MS
• Implemented a correlated topic model on 6,873 job adverts
• An alternative practical definition of analytics has been
suggested: discovery and decision analytics
 Maintains the focus on business value, application & the
disciplines that inform analytics
 However, removes the contradictions in the previous model
• OR/MS has an obvious role in advanced decision analytics,
both in hard and soft applications
• Further exploration (and/or promotion) is needed of the role of OR/MS in advanced discovery analytics
Contact Details and Questions
Email: m.j.mortenson@lboro.ac.uk
Website: www.whatisanalytics.co.uk
Mobile: 07833 XXXXXX
LinkedIn: http://www.linkedin.com/profile/view?id=114000243&trk=tab_pro
(or search Michael Mortenson)

 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 

A Topic Model of Analytics Job Adverts (The Operational Research Society 55th Annual Conference)

  • 1.
  • 2. Agenda
      • Problem Summary
          - Confusion about precise definition of analytics
          - Benefit of ‘practical’ definitions
          - Issues with the conventional ‘practical’ model of analytics
      • Model Details
          - Data source: ‘analytics’ job adverts
          - Topic modeling & Latent Dirichlet Allocation
          - Model build & data pre-processing
      • Implications
          - Model analysis
          - An alternative definition of analytics
          - Implications for OR/MS
  • 3. Analytics is …
      • … delivering the right decision support to the right people at the right time. (Laursen & Thorlund, 2010, p XII)
      • … the scientific process of transforming data into insight for making better decisions (INFORMS)
      • … [the] technologies, systems, practices, & applications to analyze critical business data so as to gain new insights (Lim et al, 2012)
      • … the extensive use of data, statistical & quantitative analysis, explanatory & predictive models, & fact-based management to drive decisions & actions. (Davenport & Harris, 2007, p 7)
      • … an outgrowth of what is known as business intelligence […] Today’s expansive, global enterprises generate a deluge of data that is impossible for a human to make sense of. (Varshney & Mojsilovic, 2011)
      • Analytics with a capital "A" is an umbrella term that represents our industry at a macro level, and analytics with a small "a" refers to technology used to analyze data. (Eckerson, 2011)
      • … information-intensive concepts and methods to improve business decision making. (Chiang et al, 2012)
      • … is the process of obtaining an optimal and realistic decision based on existing data (Hamel, 2011)
      • … data analysis that changes the behavior of the organization (Hackathorn, 2010)
      • … the science of analysis (Wikipedia)
      • … the method of logical analysis (Merriam-Webster)
      • … the brains to cloud computing’s brawn (Croll, 2011)
      • … the process of transforming data, from a variety of sources and of a variety of types, into insights that support, improve and/or automate business decisions, using technological, quantitative and presentation techniques (Mortenson et al, 2013)
      • … a group of approaches, organizational procedures and tools used in combination with one another to gain information, analyze that information, and predict outcomes of problem solutions (Trkman et al, 2010)
      • … the use of data, information technology, statistical analysis, quantitative methods, and mathematical or computer-based models to help managers gain improved insight about their business operations and make better, fact-based decisions (Evans, 2012)
      What is Analytics?
      • Many contrasting and often contradictory definitions
      • Particularly difficult to distinguish analytics from business intelligence or similar fields
      • Does it matter?
          - Potential confusion
          - As analytics is multi-disciplinary, it is important that a common language can be established
          - Important so that the growing job market can be met with the appropriate training
  • 4. Analytics: Practical Definition (Source: Blackett, 2012)
      • Advantages
          - Focuses on application & generation of value
          - Demonstrates the disciplines informing analytics
      • Issues
          - Some methods suggest different purposes
          - The suggestion that progression to prescriptive analytics is ‘advanced’ may not always hold
  • 5. Job Adverts
      • Analyse “analytics” job adverts, following the tradition of ‘ASP’ studies (e.g. Liberatore and Luo, 2012)
      • Instead of studying a smaller pool of jobs, we access adverts through the LinkedIn API
          - Over 250k jobs online
          - 77% of all jobs are posted on LinkedIn (Dougherty, 2012)
      • Scripted using Python & stored in MongoDB
          - OAuth, SimpleJSON, & PyMongo
      • Need to reduce and generalise results from >6,800 adverts with >50,000 unique words.
  • 6. Topic Models
      • Topic models assume documents to be a collection of latent topics. The topics determine which words are used
      • Probabilistic models that determine the topics by analysis of the co-occurrence of the words used
      • The most common are Probabilistic Latent Semantic Indexing (pLSI) and Latent Dirichlet Allocation (LDA)
  • 7. Latent Dirichlet Allocation (LDA)
      • Basic conception is that a collection of documents has three layers and contains: Documents; Words
      • [Plate diagram, adapted from Blei et al, 2003: Alpha Parameter α; Topic Distribution θ; Topics Z; Words W; Beta Parameter β; plates N and M]
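For reference, the plate diagram on this slide corresponds to the joint distribution given in Blei et al (2003). It is written here with the slide's symbols (θ, Z, W, α, β); reading N as the number of words in a document and M as the number of documents follows that paper rather than anything stated on the slide.

```latex
% One document: topic proportions \theta, topic assignments z_n, words w_n
p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta)
  = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)

% Whole corpus D of M documents, with \theta_d and z_{dn} marginalised out
p(D \mid \alpha, \beta)
  = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha)
    \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d
```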
  • 8. Latent Dirichlet Allocation - Process
      • Model is built by:
          1. Estimating topics as a product of the observed words
          2. Using these topics to estimate document topic proportions
          3. Evaluating the corpus based on the distributions suggested in (1) & (2)
          4. Using (3) to improve the topic estimates in (1)
          5. Iterating until the best fit is found
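The loop described on this slide can be illustrated with a small sampler. The sketch below uses collapsed Gibbs sampling, which is a swapped-in technique chosen for brevity: the slides do not say which inference scheme was used. The function name `lda_gibbs`, the toy documents and the values of `alpha`, `beta` and `iterations` are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents (illustrative)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]            # count of topic k in doc d
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                          # tokens assigned to topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    # Repeatedly resample each token's topic given all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                wid = word_id[w]
                k_old = assignments[d][n]
                doc_topic[d][k_old] -= 1
                topic_word[k_old][wid] -= 1
                topic_total[k_old] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                k_new = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][wid] += 1
                topic_total[k_new] += 1

    # Smoothed estimates of document-topic and topic-word distributions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        {vocab[i]: (row[i] + beta) / (topic_total[k] + vocab_size * beta)
         for i in range(vocab_size)}
        for k, row in enumerate(topic_word)
    ]
    return theta, phi


# Toy corpus (hypothetical), already tokenised.
toy_docs = [
    ["reporting", "strategy", "planning", "reporting"],
    ["oracle", "sap", "architecture", "integration"],
    ["strategy", "planning", "financial", "reporting"],
    ["sap", "oracle", "technology", "architecture"],
]
theta, phi = lda_gibbs(toy_docs, num_topics=2)
```

Here `theta[d]` holds the estimated topic proportions of document `d` and `phi[k]` maps each vocabulary word to its probability under topic `k`. Each row is a proper probability distribution, but on a corpus this small the topics themselves should not be read as meaningful.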
  • 9. Latent Dirichlet Allocation - Assumptions
      • Bag-of-words / exchangeability
      • The number of topics (K) is known and pre-determined
          - Cross-validation to identify K with the lowest perplexity
      • Topic independence
          - As α is a parameter of a Dirichlet prior, each topic is assumed to be independent and not correlated
          - In this research correlation between topics has to be assumed.
          - Alternative is the correlated topic model (Blei & Lafferty, 2007), which uses a logistic normal rather than a Dirichlet distribution
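The perplexity used to choose K is conventionally the held-out measure defined in Blei et al (2003). The notation below (M held-out documents, N_d words in document d, word sequence w_d) follows that paper and is an assumption about what the slide intends; lower values indicate a better fit, so K would be taken as the value minimising it across the cross-validation folds.

```latex
\operatorname{perplexity}(D_{\text{test}})
  = \exp\left\{ - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right\}
```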
  • 10. Data Pre-Processing & Model Build
      • Strip HTML / XML
      • Remove stop words, numbers and punctuation
      • Remove words < 3 characters
      • Remove most and least frequent words
          - Python: HTMLParser, GenSim and String
          - R: TM and TopicModels
      • To stem or not to stem?
          - "the job involves managing analytics projects"
          - "the job involves the management of analytical projects"
          - "has experience running projects using management science and analytics"
          - "managing a team of scientists analysing the experience of runners"
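The pre-processing steps on this slide can be sketched with the Python standard library alone. This is a minimal illustration rather than the authors' pipeline: the stop-word list, the helper names and the `min_docs` / `max_share` thresholds are hypothetical, and the letter-only tokeniser also discards non-ASCII characters.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "for", "with", "has", "using"}


class _TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(raw):
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return " ".join(parser.parts)


def tokenise(raw):
    """Strip markup, lower-case, keep letter runs, drop short and stop words."""
    text = strip_markup(raw).lower()
    words = re.findall(r"[a-z]+", text)  # numbers and punctuation fall away here
    return [w for w in words if len(w) >= 3 and w not in STOP_WORDS]


def prune_by_document_frequency(docs, min_docs=2, max_share=0.9):
    """Remove words found in too few documents or in too large a share of them."""
    n_docs = len(docs)
    doc_freq = Counter(w for doc in docs for w in set(doc))
    keep = {
        w for w, df in doc_freq.items()
        if df >= min_docs and df / n_docs <= max_share
    }
    return [[w for w in doc if w in keep] for doc in docs]


raw_adverts = [
    "<p>The job involves managing analytics projects.</p>",
    "<p>The job involves the management of analytical projects!</p>",
    "<div>Has experience running 3 projects using management science and analytics</div>",
]
tokens = [tokenise(raw) for raw in raw_adverts]
pruned = prune_by_document_frequency(tokens)
```

Without stemming, "managing" and "management" (or "analytics" and "analytical") stay separate vocabulary entries, which is the trade-off the slide's example sentences raise.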
  • 11. Topic Results
      • 30 topics identified
      • All topics are created equally but some are more topical than others
      • [Chart: Most Likely Topic per Document as % of Corpus]
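The quantity charted on this slide, the most likely topic per document as a share of the corpus, can be computed from a document-topic matrix as sketched below. The function name and the example proportions are hypothetical; ties go to the lowest-numbered topic.

```python
from collections import Counter


def most_likely_topic_shares(doc_topic_probs):
    """Share of documents for which each topic has the highest probability."""
    winners = [
        max(range(len(probs)), key=probs.__getitem__)
        for probs in doc_topic_probs
    ]
    counts = Counter(winners)
    n_docs = len(winners)
    # Largest share first, mirroring a ranked bar chart.
    return {k: counts[k] / n_docs for k in sorted(counts, key=counts.get, reverse=True)}


# Hypothetical topic proportions for four documents and three topics.
example_theta = [
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.10, 0.80, 0.10],
    [0.50, 0.25, 0.25],
]
shares = most_likely_topic_shares(example_theta)
```

In this toy input topic 0 wins three of the four documents and topic 1 wins one, so topic 2 does not appear in the result at all.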
  • 12. Most Likely Terms in Topics
      • Analysis of the 3rd, 4th & 5th most likely topics
      • Topic 3 (4th), Digital & Web (8%): other, media, across, working, understanding, analysis, social, projects, responsible, required, ensure, within, design, key, performance, digital, company, manager, products, their, lead, tools, role, services
      • Topic 13 (3rd), Consultancy (17%): working, market, develop, project, software, process, media, reporting, key, through, requirements, solutions, manager, excellent, your, strategy, multiple, more, service, opportunity, manage, well, opportunities, clients
      • Topic 9 (5th), Technical (7%): risk, systems, design, solutions, services, other, tools, technical, teams, related, provide, required, position, degree, such, operations, global, skills, project, opportunity, clients, service, excellent, products
  • 13. Most Likely Terms in Topics (cont.)
      • Analysis of the top two most likely topics
      • Topic 20 (1st), Strategic (41%): reporting, analysis, media, required, strategy, related, strategic, manager, company, degree, risk, online, products, across, drive, must, manage, responsible, well, financial, planning, industry, lead, software
      • Topic 21 (2nd), Computing (20%): services, solutions, technology, clients, digital, consulting, your, more, implementation, management, oracle, technical, capabilities, design, provide, advisory, strategy, integration, technologies, sap, career, enterprise, solution, architecture
  • 14. Model Analysis
      • Main five topics:
          - Technical
          - Digital/Web
          - Consultancy
          - Computing
          - Strategic
      • ‘Digital/Web’ is a specialism within analytics (also ‘Financial’)
      • ‘Technical’ & ‘Consultancy’ are specific job types or environments
          - However, some technical (‘hard’) skills & some consulting-type (‘soft’) skills are likely to be required in all analytics jobs
      • ‘Computing’ & ‘Strategic’?
  • 15. The Analytics of Computing?
      • [Figure: Discovery Analytics. Axes: Basic Analytics Capability to Advanced Analytics Capability; Soft to Hard]
      • Items shown: Data Warehouses; Big Data Architecture; Stock Market Analysis; Algorithmic Trading; Fraud Investigation; Automatic Fraud Detection; Customer Segmentation; Propensity Modeling; Clickstream Analysis; Behavioural Targeting; Qualitative Text Analysis; Natural Language Processing; Reports & Dashboards; Advanced Visualisation
  • 16. The Analytics of Strategy?
      • [Figure: Decision Analytics. Axes: Basic Analytics Capability to Advanced Analytics Capability; Soft to Hard]
      • Items shown: Trial & Error; Experimentation; Optimisation; Simulation; Basic Forecasting; ARIMA Time Series; Performance Metrics; Data Envelopment Analysis; A/B Testing; Multivariate Testing; Business Analysis; Business Process Optimisation; Requirements Gathering; Problem Structuring
  • 17. An Alternative Definition of Analytics
      • Descriptive Analytics: statistical and data modeling techniques designed to describe past events and answer “what happened?”
      • Predictive Analytics: data mining and machine learning techniques used to predict future events and answer “what will happen next?”
      • Prescriptive Analytics: OR/MS, advanced statistical and mathematical models used to prescribe future actions and answer “what should we do next?”
  • 18. An Alternative Definition of Analytics
      • [Figure axes: Technological to Strategic; Lower Risk Decisions to Higher Risk Decisions]
      • Discovery Analytics: reporting & alerts; market research; information systems
      • Decision Analytics: basic historical analysis; performance metrics; stakeholder consultation
      • Advanced Discovery Analytics: advanced visualisation; real time insights; automated decisions
      • Advanced Decision Analytics: advanced modelling; problem structuring; decision analysis
  • 19. Summary & Implications for OR/MS
      • Implemented a correlated topic model on 6,873 job adverts
      • An alternative practical definition of analytics has been suggested: discovery and decision analytics
          - Maintains the focus on business value, application & the disciplines that inform analytics
          - However, removes the contradictions in the previous model
      • OR/MS has an obvious role in advanced decision analytics, both in hard and soft applications
      • Further exploration (and/or promotion) of the role of OR/MS in advanced discovery analytics
  • 20. Contact Details and Questions
      • Email: m.j.mortenson@lboro.ac.uk
      • Website: www.whatisanalytics.co.uk
      • Mobile: 07833 XXXXXX
      • LinkedIn: http://www.linkedin.com/profile/view?id=114000243&trk=tab_pro (or search Michael Mortenson)