Alan Berg

Eduworks Network

Session Vacancy Mining and Analysis

Data & Analytics

Squeezing biggish job market data onto a laptop
Alan Mark Berg BSc.MSc.PGCE.
a.m.berg@uva.nl

Agenda
•Overview
• Who am I and what am I doing?
• Area of research
• Technique
•Example Results:
• Stereotypes
• Female
• IT
• Discrimination
•Questions and Answers
• Refinements
• References

Who am I?
1. A rather mature, external PhD candidate in Learning Analytics.
2. Hard science background: physics, microelectronics with computational engineering, experimental science.
3. Pragmatic: for the last 17 years involved in the design and development of large-scale IT systems @UvA.
○ Wishes to use the simplest technique possible for a given task.
4. Author of 4 books.
5. Busy with open source communities.
○ Considers them the best place to curate software.
6. Stephan and Gabor are my co-supervisors; Prof. Robin Boast is my supervisor.
7. Status: in the process of writing up the research and then finishing the PhD.
8. Initial infrastructure and standards papers published (see references).

Technique
❑ 3 million UK job adverts, 1150 million words. Thank you, Monsterboard.
❑ Simplest possible scenario
❑ Bag of words
❑ Unigrams
❑ Perl to process the text
❑ R language: inferential statistics and visualization
❑ CATA (computer-aided text analysis): frequency of words
❑ Mapped the job dataset to SOC 2010 occupation categories
❑ Merged the SOC 2010 categories with the UK Labour Force Survey
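The bag-of-words, unigram and word-frequency steps above can be sketched in a few lines. The talk used Perl for text processing and R for statistics; the sketch below uses Python purely for illustration. The dictionary names and word lists (`DICTIONARIES`), the tokenizer pattern and the sample advert are hypothetical and are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical mini-dictionaries; the dictionaries used in the study are not given in the slides.
DICTIONARIES = {
    "female": {"supportive", "caring", "collaborative"},
    "it": {"python", "sql", "linux"},
}

# Assumed tokenizer: runs of letters, optionally with one internal apostrophe.
TOKEN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def unigrams(text):
    """Lowercase the text and split it into single-word tokens (unigrams)."""
    return TOKEN.findall(text.lower())


def dictionary_counts(text, dictionaries=DICTIONARIES):
    """Return the total token count and, per dictionary, the number of matching tokens."""
    bag = Counter(unigrams(text))  # bag of words: word order is discarded
    total = sum(bag.values())
    hits = {name: sum(bag[w] for w in words) for name, words in dictionaries.items()}
    return total, hits


advert = "Caring, supportive team seeks a Python developer. SQL and Python required."
total, hits = dictionary_counts(advert)
print(total, hits)
```

Dividing each hit count by the total gives a per-advert frequency that can then be aggregated per occupation category.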

Representation in adverts
❏ Qualifications +
❏ Male +
❏ Head hunting -
❏ Female -
[Figure labels: Under Represented, Over Represented; Male Dominated, Female Dominated]

Dispersion IT Skills
- Monitor skill dispersion
- Has implications for policy and training
- Has implications for risks within occupations, such as the deployment of IT projects.
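One possible way to monitor skill dispersion is sketched below. It is an assumption, not the measure used in the talk: it computes, per occupation group, the share of adverts containing at least one IT word, and then summarizes how evenly those shares are spread with a normalized entropy. The group labels, word list and sample adverts are hypothetical.

```python
import math
from collections import defaultdict


def it_share_by_group(adverts, it_words):
    """adverts: iterable of (group, set_of_tokens).

    Returns {group: share of adverts in that group with at least one IT word}.
    """
    seen = defaultdict(int)
    with_it = defaultdict(int)
    for group, tokens in adverts:
        seen[group] += 1
        if tokens & it_words:
            with_it[group] += 1
    return {group: with_it[group] / seen[group] for group in seen}


def normalised_entropy(shares):
    """0.0 = IT wording concentrated in one group, 1.0 = spread evenly over all groups."""
    total = sum(shares.values())
    if total == 0 or len(shares) < 2:
        return 0.0
    probs = [s / total for s in shares.values() if s > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(len(shares))


# Hypothetical occupation groups "A" and "B" with tokenized adverts.
sample = [
    ("A", {"python", "team"}),
    ("A", {"sql"}),
    ("B", {"filing"}),
    ("B", {"excel", "sql"}),
]
shares = it_share_by_group(sample, {"python", "sql"})
spread = normalised_entropy(shares)
print(shares, round(spread, 3))
```

Tracking the spread value per year would show whether IT wording is moving into more occupation groups over time.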

Is discrimination wording attracted to female-worded job adverts?

Discrimination
Diffusion process into the central region where men and women are more equally represented.
Color: Red = highest percentage of female wording.
Notice the large amount of green (less wording) in 2013.

Refinements
❏ Inferential statistics
❏ From unigram to bigram
❏ Cleaner data sources
❏ Multiple languages
❏ Compare to specific surveys
❏ Generation of many dictionaries
❏ From dictionary to taxonomies
❏ From research to practice
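The step from unigrams to bigrams listed above can be illustrated with a short sketch. The tokenizer pattern and the sample sentence are assumptions for illustration only; bigrams let a dictionary distinguish phrases such as "project management" from "people management", which a unigram count of "management" cannot.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of lowercase letters.
TOKEN = re.compile(r"[a-z]+")


def ngrams(text, n):
    """Return the list of n-word sequences (n=1 unigrams, n=2 bigrams) in the text."""
    tokens = TOKEN.findall(text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


text = "Project management experience and people management skills"
unigram_counts = Counter(ngrams(text, 1))
bigram_counts = Counter(ngrams(text, 2))
print(unigram_counts["management"])
print(bigram_counts["project management"], bigram_counts["people management"])
```

Note that this simple version lets bigrams span sentence boundaries; splitting the text into sentences first would avoid that.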

References
Motivation: We can develop large scale systems without using sensitive data.
Berg, A. M., Mol, S. T., Kismihók, G., & Sclater, N. (2016). The role of a reference synthetic data generator within the field of learning analytics. Journal of Learning Analytics, 3, 107–128. http://doi.org/10.18608/jla.2016.31.7
Motivation: We need to add new xAPI profiles and be consistent to avoid issues with connecting systems.
Berg, A., Scheffel, M., Drachsler, H., Ternier, S., & Specht, M. (2016). The Dutch xAPI experience. In Proceedings of the Sixth International Conference on Learning Analytics & Knowledge - LAK ’16 (pp. 544–545). New York, New York, USA: ACM Press. http://doi.org/10.1145/2883851.2883968
Motivation: We need to add new xAPI profiles and be consistent to avoid issues with connecting systems.
Berg, A., Scheffel, M., Drachsler, H., Ternier, S., & Specht, M. (2016). Dutch Cooking with xAPI Recipes: The Good, the Bad, and the Consistent. In 2016 IEEE 16th International Conference on Advanced Learning Technologies (ICALT) (pp. 234–236). IEEE. http://doi.org/10.1109/ICALT.2016.48
Motivation: To contribute to the discussion around LA infrastructural elements, hence providing a means to consistency.
Sclater, N., Berg, A., & Webb, M. (2015). Developing an open architecture for learning analytics. Proceedings of the EUNIS 2015 Congress. ISSN 2409-1340

6. UK Labour Market Survey

8. Salary Stereotypical distribution

10. Dispersion Female words

13. Questions
