jobknowledge.eu 
facebook.com/jobknowledge 
@Jobknowledge 
Small talk 
Text mining in organizational research: 
a review and a case study 
Vladimer Kobayashi, Hannah Berkers, Stefan Mol, Gábor Kismihók & Deanne den Hartog
Overview 
The case study: Extracting job information from vacancies 
• The problem: Modernizing job analysis 
• The data: 500,000 online vacancies 
• The use of a framework: knowledge from the job analysis field 
• The techniques: feature extraction 
• The results: Successful automatic categorization of job information 
The review: text mining techniques and tasks in organizational research 
• The task: Invitation for a special issue on big data in ORM 
• The paper: Our structure so far 
• The question: Feedback
The case study: Extracting job information from vacancies 
The problem: Modernizing job analysis 
Jobs are changing, but job analysis is lagging behind 
• Seen as a tedious and expensive but necessary task 
• Not up to speed with the changes in work 
• Accuracy of job analysis using job incumbents as a source is questioned 
• Not taking advantage of the ‘big data’ opportunities
The case study: Extracting job information from vacancies 
The data: 500,000 English online vacancies 
An often overlooked rich source of job information 
Could facilitate scaling up the amount of data used in job analysis
The case study: Extracting job information from vacancies 
The use of a framework: knowledge from the job analysis field 
Skills can be extracted from job advertisements (Sodhi & Son, 2009; Smith & Ali, 2014) 
These studies were conducted in the field of information technology, with a focus on 
the use of technologies 
Need for a more deductive approach (George, Haas, & Pentland, 2014) 
We go beyond this research by using knowledge from the job analysis field 
We categorize job information based on the basic distinction between job attributes 
and job activities (Sackett & Laczo, 2003) 
A first step toward the extraction of finer-grained job information
The case study: Extracting job information from vacancies 
The use of a framework: knowledge from the job analysis field 
Categorization into job attributes and job activities 
Use of manual labelling of 300 random vacancies (3,921 labelled sentences) 
Based on definitions of the finer-grained job features (each either an attribute or an 
activity), such as knowledge, abilities, tasks, responsibilities, etc.
The case study: Extracting job information from vacancies 
The techniques: Feature extraction 
[Figure: Job Vacancies -> Text Preprocessing -> Preprocessed Vacancies -> Text Encoding -> Feature Matrix (sentences by features)] 
Text Preprocessing 
• Sentence and word tokenization 
• Lower case transformation 
• Stopword removal, e.g. the, and, etc. 
• Extra whitespace removal 
• Lemmatization 
Text Encoding 
• Linguistic preprocessing, e.g. part-of-speech (POS) tagging
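The preprocessing steps listed on this slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' pipeline: the stopword list, the regular expressions, and the function names (split_sentences, preprocess_sentence, preprocess_vacancy) are assumptions made for the example, and lemmatization is left out because it needs a dictionary-based tool.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "and", "a", "an", "of", "in", "for", "with", "is", "are"}

def split_sentences(text):
    """Naive sentence tokenizer: split after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def preprocess_sentence(sentence):
    """Lowercase the sentence, tokenize it into words, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+(?:'[a-z]+)?", sentence.lower())
    return [t for t in tokens if t not in STOPWORDS]

def preprocess_vacancy(text):
    """Return one token list per sentence of a vacancy text."""
    text = re.sub(r"\s+", " ", text)  # collapse extra whitespace
    return [preprocess_sentence(s) for s in split_sentences(text)]

# Made-up vacancy text, not taken from the study's data.
vacancy = "Develop   and maintain the web applications.  Knowledge of Python is required."
print(preprocess_vacancy(vacancy))
```

In practice a dedicated NLP toolkit would likely replace the naive sentence splitter and add lemmatization; the structure (one token list per sentence) stays the same.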
The case study: Extracting job information from vacancies 
Feature list 
• Sentence Length (after removing certain words) 
• POS of first word (job activity sentences usually start with a verb) 
• First word (both kinds of sentences often start with certain words) 
• Last word (job attribute sentences commonly end with certain words) 
• Proportion of nouns and adjectives 
• Proportion of verbs and TO 
• Proportion of verbs followed by noun, verb, adjectives, adverb 
• Frequent words
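Several of the features above can be computed directly from a POS-tagged sentence. The sketch below assumes the input is a list of (word, tag) pairs with Penn Treebank tags (where TO is the tag for "to"); the function name sentence_features, the dictionary keys, and the choice of sentence length as the denominator for every proportion are assumptions made for illustration, not details given in the slides. The frequent-words feature is omitted.

```python
def sentence_features(tagged):
    """Compute a few of the listed features for one POS-tagged sentence.

    `tagged` is a non-empty list of (word, tag) pairs with Penn Treebank tags.
    """
    words = [w.lower() for w, _ in tagged]
    tags = [t for _, t in tagged]
    n = len(tagged)

    def proportion(prefixes):
        # Share of tokens whose tag starts with any of the given prefixes.
        return sum(t.startswith(prefixes) for t in tags) / n

    # Count verbs immediately followed by a noun, verb, adjective, or adverb.
    followed = sum(
        1 for cur, nxt in zip(tags, tags[1:])
        if cur.startswith("VB") and nxt.startswith(("NN", "VB", "JJ", "RB"))
    )
    return {
        "length": n,
        "first_word": words[0],
        "first_pos": tags[0],
        "last_word": words[-1],
        "prop_noun_adj": proportion(("NN", "JJ")),
        "prop_verb_to": proportion(("VB", "TO")),
        "prop_verb_followed": followed / n,
    }

# Hand-tagged toy sentences, not taken from the labelled vacancies.
activity = [("Develop", "VB"), ("web", "NN"), ("applications", "NNS")]
attribute = [("Excellent", "JJ"), ("communication", "NN"),
             ("skills", "NNS"), ("required", "VBN")]
print(sentence_features(activity))
print(sentence_features(attribute))
```

Stacking one such dictionary per sentence gives the rows of a feature matrix; categorical entries such as the first word would still need to be encoded numerically before most classifiers can use them.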
The case study: Extracting job information from vacancies 
Application of Data Mining Techniques to the Feature Matrix 
• Naïve Bayes 
• Support Vector Machines 
• Random Forest 
The results: Successful automatic categorization of job information 
At least 95% mean accuracy based on 10-fold cross validation 
compared with the base classifier accuracy of 55%
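To make the evaluation procedure concrete, here is a minimal sketch of k-fold cross-validation with a baseline classifier, written with the standard library only. It assumes the base classifier is a majority-class predictor, which the slides do not state, and it uses toy labels with a 55/45 split rather than the authors' data. The names stratified_folds, fit_majority, and cross_val_accuracy are hypothetical.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Assign each item index to one of k folds, spreading every class evenly."""
    folds = [[] for _ in range(k)]
    seen = defaultdict(int)  # number of items of each class placed so far
    for idx, label in enumerate(labels):
        folds[seen[label] % k].append(idx)
        seen[label] += 1
    return folds

def fit_majority(train_X, train_y):
    """Baseline (assumed): always predict the most frequent training label."""
    majority = Counter(train_y).most_common(1)[0][0]
    return lambda x: majority

def cross_val_accuracy(fit, X, y, k=10):
    """Mean accuracy over k folds; `fit` must return a predict function."""
    scores = []
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)

# Toy data, not the authors' sentences: 55 attribute and 45 activity labels.
y = ["attribute"] * 55 + ["activity"] * 45
X = [[0.0]] * len(y)  # placeholder feature rows; the baseline ignores them
baseline = cross_val_accuracy(fit_majority, X, y, k=10)
print(round(baseline, 3))
```

Any of the three classifiers named on the slide could be plugged in through the same `fit` interface, so its mean accuracy is measured on exactly the same folds as the baseline.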
The case study: Extracting job information from vacancies 
Future work 
• Semi-supervised labelling 
• Finer classification 
• Consideration of more features
The review: Text mining techniques in organizational research 
The task: Invitation for a special issue on big data in ORM 
Introduce the methods of text analysis to organizational scientists 
Review of various techniques for mining textual data 
The pros and cons of different approaches (best practices) 
Illustrations from the current project on job analysis showing how 
these procedures can be applied to a substantive area
The review: Text mining techniques in organizational research 
The paper: Our structure so far 
1. Introduction 
Text data in organizational research and issues that could be solved with text mining 
Introduce the case study on text mining in job analysis 
2. Review of text mining techniques 
Definitions and terminology 
Text preprocessing 
Three tasks in text mining: classification, feature construction, and feature selection 
Evaluating text mining results
The review: Text mining techniques in organizational research 
The paper: Our structure so far 
2. Review of text mining techniques 
For each task 
a) Text mining techniques applied to perform the tasks 
b) Possibilities for applying Organizational frameworks 
c) Advantages and disadvantages of these techniques illustrated with 
examples from Organizational Research and other fields 
d) Illustration from our case study
The review: Text mining techniques in organizational research 
The paper: Our structure so far 
3. Discussion of opportunities and challenges of text mining in Organizational Research 
Opportunities such as extending the application of text mining to other problems in 
Organizational Research (input?) 
Challenges such as dealing with data size, access and protection of data, language 
issues etc. 
4. Conclusion
The review: Text mining techniques in organizational research 
The question: Feedback 
What problems are you dealing with right now (or have you dealt with in the past) that 
make use of text data? 
What are the opportunities that you see for text mining? 
Which part of text mining would you like to learn more about? 
Do you have experience in submitting a manuscript to ORM?
References 
George, G., Haas, M.R. & Pentland, A. (2014). From the editors: Big Data and Management. 
Academy of Management Journal, 57 (2), 321-326. 
Sackett, P.R., & Laczo, R.M. (2003). Job and Work Analysis. In Comprehensive Handbook of 
Psychology: Industrial and Organizational Psychology, vol. 12, ed. W.C. Borman, D.R. Ilgen, 
& R.J. Klimoski, pp. 21-37. New York: Wiley. 
Smith, D., & Ali, A. (2014). Analysing Computer Programming Job Trend Using Web Data Mining. 
Issues in Informing Science and Information Technology, 11, 203-214. 
Sodhi, M.S., & Son, B-G. (2009). Content Analysis of O.R. Job Advertisements to Infer Required Skills. 
Journal of the Operational Research Society, 61, 1315-1327.

 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 

H. Berkers & V. Kobayashi: Small talk ORM paper 29-9-2014

  • 1. jobknowledge.eu facebook.com/jobknowledge @Jobknowledge Small talk Text mining in organizational research: a review and a case study Vladimer Kobayashi, Hannah Berkers, Stefan Mol, Gabór Kismihók & Deanne den Hartog
  • 2. Overview The case study: Extracting job information from vacancies • The problem: Modernizing job analysis • The data: 500,000 online vacancies • The use of a framework: knowledge from the job analysis field • The techniques: feature extraction • The results: Successful automatic categorization of job information The review: text mining techniques and tasks in organizational research • The task: Invitation for a special issue on big data in ORM • The paper: Our structure so far • The question: Feedback
  • 3. The case study: Extracting job information from vacancies The problem: Modernizing job analysis Jobs are changing, but job analysis is lagging behind • Seen as a tedious and expensive but necessary task • Not up to speed with the changes in work • Accuracy of job analysis using job incumbents as a source is questioned • Not taking advantage of the ‘big data’ opportunities
  • 4. The case study: Extracting job information from vacancies The data: 500,000 English online vacancies An often overlooked, rich source of job information Could facilitate upscaling the amount of data used in job analysis
  • 5. The case study: Extracting job information from vacancies The use of a framework: knowledge from the job analysis field Skills can be extracted from job advertisements (Sodhi & Son, 2009; Smith & Ali, 2014) Studies conducted in the field of Information Technologies with a focus on the use of technologies Need for a more deductive approach (George, Haas, & Pentland, 2014) We go beyond this research by using knowledge from the job analysis field We categorize job information based on the basic distinction between job attributes and job activities (Sackett & Laczo, 2003) First step toward the extraction of finer grained job information
  • 6. The case study: Extracting job information from vacancies The use of a framework: knowledge from the job analysis field Categorization into job attributes and job activities Use of manual labelling of 300 random vacancies (3,921 labelled sentences) Based on definitions of the finer grained job features (either attribute or activity), such as knowledge, abilities, tasks, responsibilities etc.
  • 7. The case study: Extracting job information from vacancies The techniques: Feature extraction Text Preprocessing • Sentence and word tokenization • Lower case transformation • Stopword removal, e.g. the, and, etc. • Extra whitespace removal • Lemmatization Text Encoding • Linguistic preprocessing, e.g. part of speech (POS) tagging [Figure: Job Vacancies → Text Preprocessing → Preprocessed Vacancies → Text Encoding → Feature Matrix (axes: sentences, features)]
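
The preprocessing steps listed on this slide can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the regular expressions, and the tiny stopword list are hypothetical, and lemmatization and POS tagging are left out because they need a dictionary or trained model from an NLP library.

```python
import re

# Tiny illustrative stopword list (an assumption; the slide names only "the" and "and").
# "to" is deliberately kept because a later feature counts it.
STOPWORDS = {"the", "and", "a", "an", "of", "in", "for", "with", "is", "are"}


def split_sentences(text):
    """Naive sentence tokenizer: split after '.', '!' or '?' followed by whitespace."""
    text = re.sub(r"\s+", " ", text).strip()  # collapse extra whitespace first
    if not text:
        return []
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def tokenize(sentence):
    """Lower-case the sentence and return its word tokens (punctuation dropped)."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", sentence.lower())


def preprocess(text):
    """Return one list of lower-cased, stopword-free tokens per sentence."""
    return [
        [tok for tok in tokenize(sentence) if tok not in STOPWORDS]
        for sentence in split_sentences(text)
    ]
```

Called on a made-up vacancy snippet such as `preprocess("Manage the   project budget. Excellent communication skills are required!")`, the sketch returns one token list per sentence with the stopwords and extra whitespace gone.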
  • 8. The case study: Extracting job information from vacancies Feature list • Sentence length (after removing certain words) • POS of first word (job activity sentences usually start with a verb) • First word (both kinds of sentences often start with certain words) • Last word (job attribute sentences commonly end with certain words) • Proportion of nouns and adjectives • Proportion of verbs and TO • Proportion of verbs followed by noun, verb, adjective, adverb • Frequent words
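
One possible way to turn a single sentence into the features on this slide is sketched below. Several things here are assumptions rather than details from the slides: the input is a list of `(word, tag)` pairs already tagged by an external POS tagger using Penn Treebank style tags (`NN*`, `JJ*`, `VB*`, `RB*`, `TO`), every proportion is taken over the number of tokens in the sentence, and the names `sentence_features` and `frequent_words` are hypothetical.

```python
def sentence_features(tagged, frequent_words):
    """Build a feature dict for one sentence.

    tagged: list of (word, pos_tag) pairs, assumed to come from an external
            tagger with Penn Treebank style tags.
    frequent_words: set of lower-cased words to encode as presence indicators.
    """
    if not tagged:
        raise ValueError("cannot compute features for an empty sentence")

    words = [word.lower() for word, _ in tagged]
    tags = [tag for _, tag in tagged]
    n = len(tagged)

    noun_adj = sum(t.startswith(("NN", "JJ")) for t in tags)
    verb_to = sum(t.startswith("VB") or t == "TO" for t in tags)
    # Verbs immediately followed by a noun, verb, adjective or adverb.
    verb_followed = sum(
        cur.startswith("VB") and nxt.startswith(("NN", "VB", "JJ", "RB"))
        for cur, nxt in zip(tags, tags[1:])
    )

    features = {
        "length": n,  # length of whatever tokens were passed in
        "first_pos": tags[0],
        "first_word": words[0],
        "last_word": words[-1],
        "prop_noun_adj": noun_adj / n,
        "prop_verb_to": verb_to / n,
        "prop_verb_followed": verb_followed / n,
    }
    for word in sorted(frequent_words):
        features["has_" + word] = word in words
    return features
```

Stacking one such dict per labelled sentence gives a sentences-by-features matrix; categorical entries such as `first_word` would still need to be encoded numerically before most classifiers can use them.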
  • 9. The case study: Extracting job information from vacancies Application of Data Mining Techniques to the Feature Matrix • Naïve Bayes • Support Vector Machines • Random Forest The results: Successful automatic categorization of job information At least 95% mean accuracy based on 10-fold cross validation compared with the base classifier accuracy of 55%
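
The evaluation idea on this slide, mean accuracy over 10 folds compared with a base classifier, can be illustrated with a small standard-library sketch. It is not the authors' setup: the base classifier is assumed here to be a majority-class predictor, the Naïve Bayes variant is a plain multinomial model over tokens with add-one smoothing, Support Vector Machines and Random Forest are not shown, and all class and function names are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


class MajorityBaseline:
    """Always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label] * len(X)


class NaiveBayes:
    """Multinomial Naive Bayes over token lists with add-one smoothing."""

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(X, y):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, X):
        return [self._predict_one(tokens) for tokens in X]

    def _predict_one(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(count / total)
            for tok in tokens:
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_val_accuracy(make_model, X, y, k=10, seed=0):
    """Mean accuracy over k folds; make_model() must return a fresh model."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in held_out])
        correct = sum(p == y[i] for p, i in zip(preds, held_out))
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)
```

With `X` as a list of token lists and `y` as the matching "attribute" or "activity" labels, `cross_val_accuracy(NaiveBayes, X, y)` and `cross_val_accuracy(MajorityBaseline, X, y)` give the two numbers to compare. In practice an established machine learning library would normally replace these hand-written classes.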
  • 10. The case study: Extracting job information from vacancies Future work • Semi-supervised labelling • Finer classification • Consideration of more features
  • 11. The review: Text mining techniques in organizational research The task: Invitation for a special issue on big data in ORM Introduce the methods of text analysis to organizational scientists Review of various techniques for mining textual data The pros and cons of different approaches (best practices) Illustrations from the current project on job analysis showing how these procedures can be applied to a substantive area
  • 12. The review: Text mining techniques in organizational research The paper: Our structure so far 1. Introduction Text data in organizational research and issues that could be solved with text mining Introduce the case study on text mining in job analysis 2. Review of text mining techniques Definitions and terminology Text preprocessing 3 tasks done in text mining: classification, feature construction, and feature selection Evaluating text mining results
  • 13. The review: Text mining techniques in organizational research The paper: Our structure so far 2. Review of text mining techniques For each task a) Text mining techniques applied to perform the tasks b) Possibilities for applying Organizational frameworks c) Advantages and disadvantages of these techniques illustrated with examples from Organizational Research and other fields d) Illustration from our case study
  • 14. The review: Text mining techniques in organizational research The paper: Our structure so far 3. Discussion of opportunities and challenges of text mining in Organizational Research Opportunities such as extending the application of text mining to other problems in Organizational Research (input?) Challenges such as dealing with data size, access and protection of data, language issues etc. 4. Conclusion
  • 15. The review: Text mining techniques in organizational research The question: Feedback What problems are you dealing with right now (or have you dealt with in the past) that make use of text data? What opportunities do you see for text mining? Which part of text mining would you like to learn more about? Do you have experience in submitting a manuscript to ORM?
  • 16. References George, G., Haas, M.R., & Pentland, A. (2014). From the editors: Big Data and Management. Academy of Management Journal, 57(2), 321-326. Sackett, P.R., & Laczo, R.M. (2003). Job and Work Analysis. In W.C. Borman, D.R. Ilgen, & R.J. Klimoski (Eds.), Comprehensive Handbook of Psychology: Industrial and Organizational Psychology, vol. 12, pp. 21-37. New York: Wiley. Smith, D., & Ali, A. (2014). Analysing Computer Programming Job Trend Using Web Data Mining. Issues in Informing Science and Information Technology, 11, 203-214. Sodhi, M.S., & Son, B-G. (2009). Content Analysis of O.R. Job Advertisements to Infer Required Skills. Journal of the Operational Research Society, 61, 1315-1327.