ARE DATA CURATORS EVER DATA SCIENTISTS?

LOUISE CORTI
ASSOCIATE DIRECTOR
UK DATA ARCHIVE
UNIVERSITY OF ESSEX

Online Information Conference
London, 20 NOVEMBER 2013

• “A high ranking professional with the training and
curiosity to make discoveries in the world of big data”
• Exploiting the opportunities of big data, open data
and linked data
• Possessing skills to manipulate the data and extract
insightful patterns… in multiple petabytes of data
(1 petabyte = 10^15 bytes)

UK DATA ARCHIVE: RELATIVELY SMALL DATA

• An easy-to-use, innovative and trusted one-stop-shop
for users and suppliers of social science data
resources: ESRC UK Data Service
• Data for secondary analysis, research, policy making
and teaching and learning
• Wide range of data:
  • Government and academic surveys
  • International aggregate data banks
  • Qualitative data

SCALE AND VOLUME

• @ 6,336 studies
• 1,632 GB
• 1,255,814 files
• 113,832 directories
• Average file size 1.29 MBytes
• Grows by 120 GB per year
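
To put these figures next to the "multiple petabytes" of the data scientist definition, the arithmetic can be sketched as below. This is only an illustrative check: the decimal unit convention (1 GB = 1000 MB) and the constant yearly growth are assumptions, not something the slide states, and the variable names are made up for the example.

```python
# Figures quoted on the slide.
STUDIES = 6_336
TOTAL_GB = 1_632
FILES = 1_255_814
GROWTH_GB_PER_YEAR = 120

# Assumption: decimal units (1 GB = 1000 MB, 1 PB = 10^15 bytes = 10^6 GB).
MB_PER_GB = 1000
GB_PER_PB = 1_000_000

# Average file size implied by total volume and file count.
avg_file_mb = TOTAL_GB * MB_PER_GB / FILES

# Average number of files held per study.
files_per_study = FILES / STUDIES

# Current holdings expressed as a fraction of one petabyte.
holdings_as_pb = TOTAL_GB / GB_PER_PB

# Years needed to reach one petabyte if growth stayed constant (an assumption).
years_to_one_pb = (GB_PER_PB - TOTAL_GB) / GROWTH_GB_PER_YEAR

print(f"average file size: {avg_file_mb:.2f} MB")
print(f"files per study:   {files_per_study:.0f}")
print(f"holdings:          {holdings_as_pb:.6f} PB")
print(f"years to 1 PB:     {years_to_one_pb:.0f}")
```

The implied average file size comes out close to the 1.29 MBytes quoted, and the holdings are a small fraction of a single petabyte, which is the sense in which the collection is "relatively small data".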

OUR WORKFLOW AND DATA SKILLS

[Workflow diagram. Labels: Data appraisal; Data licensing; Data analysis; User support for secondary analysis; Pre-ingest; Controlling access; DM training; Data description; Data handling & QA; Data disclosure analysis; Data transformation]

DATA SKILLS AND DOMAIN EXPERTISE

• Roles are discrete
• No one person does them all
• Require data skills / domain expertise
• Most who interact with data have postgraduate
qualifications in social science

• In our organization, the ‘data scientist’ role is appropriate to
those who:
  • transform data - use scripting to help, analyze integrity
  • manipulate, link and merge data sources - harmonized
  and added-value products, e.g. European variables on
  education, or historic census data
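
As a rough illustration of what "transform", "analyze integrity" and "harmonize" can look like in a script, here is a minimal sketch that recodes an education variable from two sources onto a shared scale and pools the results. Everything in it is hypothetical: the survey rows, variable names, codes and recode tables are invented for the example and are not the Archive's actual data or tooling.

```python
# Hypothetical extracts from two national surveys (all values invented).
SURVEY_A = [
    {"person_id": "A1", "country": "UK", "educ": 3},
    {"person_id": "A2", "country": "UK", "educ": 1},
]
SURVEY_B = [
    {"person_id": "B1", "country": "FR", "niveau": "bac"},
    {"person_id": "B2", "country": "FR", "niveau": "licence"},
]

# Assumed recode tables mapping each source's codes to a shared 3-level variable.
RECODE_A = {1: "low", 2: "medium", 3: "high"}
RECODE_B = {"brevet": "low", "bac": "medium", "licence": "high"}


def harmonize(rows, source_var, recode):
    """Return new rows with the source variable replaced by 'educ_harmonized'."""
    out = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != source_var}
        # Unmapped codes become None so they can be reviewed rather than guessed.
        new_row["educ_harmonized"] = recode.get(row[source_var])
        out.append(new_row)
    return out


def integrity_problems(rows):
    """List simple integrity problems: duplicate IDs and unmapped codes."""
    problems = []
    seen = set()
    for row in rows:
        if row["person_id"] in seen:
            problems.append(f"duplicate id {row['person_id']}")
        seen.add(row["person_id"])
        if row["educ_harmonized"] is None:
            problems.append(f"unmapped education code for {row['person_id']}")
    return problems


# Pool the two harmonized sources into one dataset and check it.
merged = harmonize(SURVEY_A, "educ", RECODE_A) + harmonize(SURVEY_B, "niveau", RECODE_B)
print(merged)
print(integrity_problems(merged))
```

The recode tables are where the domain expertise sits: deciding which national qualification maps to which shared level is a social science judgment, not a programming one.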

WHO ARE OUR ‘DATA SCIENTISTS?’

• Only ever one of them at any one time
• They are definitely data ‘geeks’ and are proud of that term
• They are highly intelligent social science postgrads / ECRs
with experience in analyzing socio-demographic data
• They are database / programming-curious and have picked
up these skills along the way, including Extract-Transform-Load skills
• They have picked up curation skills within the organization,
e.g. metadata and preservation requirements

• They work with others outside our organization

AM I A DATA SCIENTIST?

• Currently, I probably am not
• Undergrad chemist - experimental data, mass spectrometry
readings, but pre-digital days

• Postgrad and ECR social scientist – designed and
analyzed large-scale national survey data
• Apply my own research and methods skills:
  • data selection and appraisal
  • data documentation and metadata
  • user delivery, support and promotion

• I have picked up curation skills along the way
• My preferred pathway for a Data Professional

WE REALLY NEED THESE SKILLS

• For social science, much larger data are on their way
• Linking, scaling up, real-time feeds, data mining
• The Big Data Family is born - David Willetts MP
announces the ESRC Big Data Network
• Recent £64 million investment
• Phase 1: Administrative Data Research Network
• Phase 2: Business and Local Government Data Research
Centres
• Phase 3: Third Sector and Social Media Data

• Data scientists are critical to this programme

‘DATA SCIENCE’ TRAINING – DOMAIN ISSUE?

• Increased data management /
preservation training in academia
• We train a wide range – HE staff,
students, research support staff
• Why? More/better data, more users
• Other social science data archives/
data libraries starting to provide this

“To draw great insights from data, you have to know the data, know the business, and know the contextual relationships that are built into the business” (Pierson, Smart Data Collective)
