SlideShare a Scribd company logo
1 of 31
Download to read offline
Democratizing data skills to
advance data-driven discovery
Tracy K. Teal, PhD
@tracykteal, @datacarpentry
https://github.com/tracykteal/datacarpentry-presentations
Our increasing capacity to collect data is
changing how we can look at the world
Software and tools allow us to turn
data into information.
People turn information into
knowledge.
We can unleash the potential of data
by empowering people
It’s challenging to manage &
analyze this data
Researchers view the major limiting factor in
research progress as a lack of expertise in how
to handle and analyze data
http://braembl.org.au/news/braembl-community-survey-report-2013
Current unmet needs
Barone L, Williams J and Micklos D. Unmet Needs for Analyzing Biological Big Data:
A Survey of 704 NSF Principal Investigators (2017)
How do we scale data skills along with
data production?
Build local communities and capacity
Software and Data
Carpentry
Building community teaching
universal data literacy
Grassroots training effort
- Developed by practitioners for practitioners
- Identify skills and best practices needed in software
development, data management and analysis in given
domains
- Collaboratively and iteratively developed openly licensed
(CC-BY) training materials (all available on github)
- Organize and deliver two-day, intensive hands-on
workshops in fundamental data analysis skills using a pool
of volunteer helpers and instructors
Software and Data
Carpentry
• Core skills for effective research computing
• Two-day hands-on workshops
• Collaboratively developed, openly licensed
lesson materials
• Over 800 trained volunteer instructors on 6
continents
Software & Data Carpentry workshops
We know we can’t teach everything in two days,
but the goal is to teach foundational skills to
reduce the activation energy for getting started
and for people to know what’s possible.
Workshop goals
• Primary goal is increased confidence in a learners ability to
do computational work and continue to learn more.
• Lower the activation barrier for people to get started
working with data. Often people don’t know where to start.
• The skills we teach should be ones they can immediately
apply to their research.
• People should learn the things we’re teaching.
• We should see a shift in perspective in the value of skills
like scripting for better research and reproducible research.
• People should have a positive workshop experience that
empowers them to do better work with data.
Hands-on intensive workshops
• Two days
• Hands-on
• Qualified instructors
• Helpers
• Post-it notes!
• Friendly learning environment
Collaboratively developed curriculum
• Focused on data - teaches how to manage and analyze data
in an effective and reproducible way.
• Initial focus is on workshops for novices - there are no
prerequisites, and no prior knowledge computational
experience is assumed.
• Domain specific by design – currently have lessons in
ecology, in genomics developed with iPlant, in geospatial
data developed with NEON, and on APIs with rOpenSci
Curriculum
• Project and data organization
• Data cleaning and quality control
• Data analysis and visualization in R or Python
Domain dependent
• Introduction to cloud computing
• Command line
• Text mining
Curriculum
• Command line
• Software development best practices in R, Python or
MATLAB
• Github for version control
• Focused on better scientific programming
practices, for writing software or analysis
scripts
• Novice and more advanced levels
• Domain agnostic
Volunteer instructors
The Software Carpentry Foundation runs the Instructor Training program that
teaches volunteers teaching pedagogy and specific strategies for teaching
computational skills in a workshop format.
Training is a missing piece between
data collection & data-driven discovery
Training
If you want to go fast, go alone.
If you want to go far, go together.
Get Involved
 Request a workshop
 Be a helper at a workshop
 Become a partner or affiliate and train
local instructors
 Contribute to lesson development
Guiding the Carpentries
Software Carpentry Steering Committee:
Karin Lagenstrom (Norway)
Rayna Harris (UT Austin)
Susan McClatchy (Jackson Labs)
Kate Hertweck (UT Tyler)
Christina Koch (University of Wisconsin)
Mateusz Kuzak (eScience Netherlands)
Belinda Weaver (Australia)
Data Carpentry Steering Committee:
Karen Cranston (Duke / OpenTree of Life)
Hilmar Lapp (Duke)
Aleksandra Pawlik (New Zealand eScience)
Karthik Ram (rOpenSci / Berkeley Institute of Data Science Fellow)
Ethan White (University of Florida / Moore DDD Investigator)
Support
Data Carpentry
Software Carpentry
http://software-carpentry.org/scf/partners/
Jason Williams at iPlant

More Related Content

What's hot

Slides | Research data literacy and the library
Slides | Research data literacy and the librarySlides | Research data literacy and the library
Slides | Research data literacy and the libraryColleen DeLory
 
RDAP14: Learning to Curate Panel
RDAP14: Learning to Curate Panel RDAP14: Learning to Curate Panel
RDAP14: Learning to Curate Panel ASIS&T
 
Slides | Targeting the librarian’s role in research services
Slides | Targeting the librarian’s role in research servicesSlides | Targeting the librarian’s role in research services
Slides | Targeting the librarian’s role in research servicesLibrary_Connect
 
RDAP14: Data on a dime, building data services at James Madison University
RDAP14: Data on a dime, building data services at James Madison University RDAP14: Data on a dime, building data services at James Madison University
RDAP14: Data on a dime, building data services at James Madison University ASIS&T
 
SLIDES | 12 time-saving tips for research support
SLIDES | 12 time-saving tips for research supportSLIDES | 12 time-saving tips for research support
SLIDES | 12 time-saving tips for research supportLibrary_Connect
 
RDAP 16: Building the Research Data Community of Practice
RDAP 16: Building the Research Data Community of PracticeRDAP 16: Building the Research Data Community of Practice
RDAP 16: Building the Research Data Community of PracticeASIS&T
 
RDAP14: University-wide Research Data Management Policy
RDAP14: University-wide Research Data Management PolicyRDAP14: University-wide Research Data Management Policy
RDAP14: University-wide Research Data Management PolicyASIS&T
 
RDAP14: Emerging role of UC Libraries in research data management education
RDAP14: Emerging role of UC Libraries in research data management educationRDAP14: Emerging role of UC Libraries in research data management education
RDAP14: Emerging role of UC Libraries in research data management educationASIS&T
 
RDAP14 Poster: Samantha Guss Data management planning and responsible conduct...
RDAP14 Poster: Samantha Guss Data management planning and responsible conduct...RDAP14 Poster: Samantha Guss Data management planning and responsible conduct...
RDAP14 Poster: Samantha Guss Data management planning and responsible conduct...ASIS&T
 
UCLALibrary_DResSUP.pptx
UCLALibrary_DResSUP.pptxUCLALibrary_DResSUP.pptx
UCLALibrary_DResSUP.pptxrobhill123
 
RDAP14 Poster: Evaluation of research data services: What things should we ev...
RDAP14 Poster: Evaluation of research data services: What things should we ev...RDAP14 Poster: Evaluation of research data services: What things should we ev...
RDAP14 Poster: Evaluation of research data services: What things should we ev...ASIS&T
 
Building a Community for Research Data Services: CLIR/DLF E-Research Peer Net...
Building a Community for Research Data Services: CLIR/DLF E-Research Peer Net...Building a Community for Research Data Services: CLIR/DLF E-Research Peer Net...
Building a Community for Research Data Services: CLIR/DLF E-Research Peer Net...Inna Kouper
 

What's hot (20)

Slides | Research data literacy and the library
Slides | Research data literacy and the librarySlides | Research data literacy and the library
Slides | Research data literacy and the library
 
Lee "Supporting Research Data is a Group Effort"
Lee "Supporting Research Data is a Group Effort"Lee "Supporting Research Data is a Group Effort"
Lee "Supporting Research Data is a Group Effort"
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
RDAP14: Learning to Curate Panel
RDAP14: Learning to Curate Panel RDAP14: Learning to Curate Panel
RDAP14: Learning to Curate Panel
 
Slides | Targeting the librarian’s role in research services
Slides | Targeting the librarian’s role in research servicesSlides | Targeting the librarian’s role in research services
Slides | Targeting the librarian’s role in research services
 
RDAP14: Data on a dime, building data services at James Madison University
RDAP14: Data on a dime, building data services at James Madison University RDAP14: Data on a dime, building data services at James Madison University
RDAP14: Data on a dime, building data services at James Madison University
 
Zucca "Technology & Systems"
Zucca "Technology & Systems"Zucca "Technology & Systems"
Zucca "Technology & Systems"
 
Basic database search training for NHS library assistants - Barnard & Pawley
Basic database search training for NHS library assistants - Barnard & Pawley Basic database search training for NHS library assistants - Barnard & Pawley
Basic database search training for NHS library assistants - Barnard & Pawley
 
SLIDES | 12 time-saving tips for research support
SLIDES | 12 time-saving tips for research supportSLIDES | 12 time-saving tips for research support
SLIDES | 12 time-saving tips for research support
 
Strasser "Effective data management and its role in open research"
Strasser "Effective data management and its role in open research"Strasser "Effective data management and its role in open research"
Strasser "Effective data management and its role in open research"
 
RDAP 16: Building the Research Data Community of Practice
RDAP 16: Building the Research Data Community of PracticeRDAP 16: Building the Research Data Community of Practice
RDAP 16: Building the Research Data Community of Practice
 
RDAP14: University-wide Research Data Management Policy
RDAP14: University-wide Research Data Management PolicyRDAP14: University-wide Research Data Management Policy
RDAP14: University-wide Research Data Management Policy
 
Chilton "Collaborative Collection Assessment"
Chilton "Collaborative Collection Assessment"Chilton "Collaborative Collection Assessment"
Chilton "Collaborative Collection Assessment"
 
RDAP14: Emerging role of UC Libraries in research data management education
RDAP14: Emerging role of UC Libraries in research data management educationRDAP14: Emerging role of UC Libraries in research data management education
RDAP14: Emerging role of UC Libraries in research data management education
 
RDAP14 Poster: Samantha Guss Data management planning and responsible conduct...
RDAP14 Poster: Samantha Guss Data management planning and responsible conduct...RDAP14 Poster: Samantha Guss Data management planning and responsible conduct...
RDAP14 Poster: Samantha Guss Data management planning and responsible conduct...
 
UCLALibrary_DResSUP.pptx
UCLALibrary_DResSUP.pptxUCLALibrary_DResSUP.pptx
UCLALibrary_DResSUP.pptx
 
Liaison panelrdap2016
Liaison panelrdap2016Liaison panelrdap2016
Liaison panelrdap2016
 
RDAP14 Poster: Evaluation of research data services: What things should we ev...
RDAP14 Poster: Evaluation of research data services: What things should we ev...RDAP14 Poster: Evaluation of research data services: What things should we ev...
RDAP14 Poster: Evaluation of research data services: What things should we ev...
 
Building a Community for Research Data Services: CLIR/DLF E-Research Peer Net...
Building a Community for Research Data Services: CLIR/DLF E-Research Peer Net...Building a Community for Research Data Services: CLIR/DLF E-Research Peer Net...
Building a Community for Research Data Services: CLIR/DLF E-Research Peer Net...
 
Llebot "Research Data Support for Researchers: Metadata, Challenges, and Oppo...
Llebot "Research Data Support for Researchers: Metadata, Challenges, and Oppo...Llebot "Research Data Support for Researchers: Metadata, Challenges, and Oppo...
Llebot "Research Data Support for Researchers: Metadata, Challenges, and Oppo...
 

Similar to Data carpentry replicathon_2017-03-24

Data carpentry ndic-2015-05-05
Data carpentry ndic-2015-05-05Data carpentry ndic-2015-05-05
Data carpentry ndic-2015-05-05tracykteal
 
Data and Software Carpentry Science Gateways webinar 2017-05-10
Data and Software Carpentry Science Gateways webinar 2017-05-10Data and Software Carpentry Science Gateways webinar 2017-05-10
Data and Software Carpentry Science Gateways webinar 2017-05-10tracykteal
 
NAS Data Science Roundtable: Training as a Pathway to Improve Reproducibility
NAS Data Science Roundtable: Training as a Pathway to Improve ReproducibilityNAS Data Science Roundtable: Training as a Pathway to Improve Reproducibility
NAS Data Science Roundtable: Training as a Pathway to Improve ReproducibilityTracy Teal
 
Data carpentry instructor-onboarding
Data carpentry instructor-onboardingData carpentry instructor-onboarding
Data carpentry instructor-onboardingtracykteal
 
Data carpentry run-a-workshop
Data carpentry run-a-workshopData carpentry run-a-workshop
Data carpentry run-a-workshoptracykteal
 
NCME Big Data in Education
NCME Big Data  in EducationNCME Big Data  in Education
NCME Big Data in EducationPhilip Piety
 
Adopting data-driven strategies in learning analytics
Adopting data-driven strategies in learning analyticsAdopting data-driven strategies in learning analytics
Adopting data-driven strategies in learning analyticsVince Kellen, Ph.D.
 
Data Science Training and Placement
Data Science Training and PlacementData Science Training and Placement
Data Science Training and PlacementAkhilGGM
 
2-6-14 ESI Supplemental Webinar: The Data Information Literacy Project
2-6-14 ESI Supplemental Webinar: The Data Information  Literacy Project2-6-14 ESI Supplemental Webinar: The Data Information  Literacy Project
2-6-14 ESI Supplemental Webinar: The Data Information Literacy ProjectDuraSpace
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification courseKumarNaik21
 
Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)SayyedYusufali
 
Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)SayyedYusufali
 
Data science training in hydpdf converted (1)
Data science training in hydpdf  converted (1)Data science training in hydpdf  converted (1)
Data science training in hydpdf converted (1)SayyedYusufali
 
UCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference WorkshopUCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference WorkshopMike Moore
 
Developing Liaison Librarians Data-Intensive Research Engagement
Developing Liaison Librarians Data-Intensive Research EngagementDeveloping Liaison Librarians Data-Intensive Research Engagement
Developing Liaison Librarians Data-Intensive Research EngagementThe Entrepreneurial Librarian
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?DIGITALSAI1
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification courseKumarNaik21
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)SayyedYusufali
 
Data science training institute in hyderabad
Data science training institute in hyderabadData science training institute in hyderabad
Data science training institute in hyderabadVamsiNihal
 
Data science training in Hyderabad
Data science  training in HyderabadData science  training in Hyderabad
Data science training in Hyderabadsaitejavella
 

Similar to Data carpentry replicathon_2017-03-24 (20)

Data carpentry ndic-2015-05-05
Data carpentry ndic-2015-05-05Data carpentry ndic-2015-05-05
Data carpentry ndic-2015-05-05
 
Data and Software Carpentry Science Gateways webinar 2017-05-10
Data and Software Carpentry Science Gateways webinar 2017-05-10Data and Software Carpentry Science Gateways webinar 2017-05-10
Data and Software Carpentry Science Gateways webinar 2017-05-10
 
NAS Data Science Roundtable: Training as a Pathway to Improve Reproducibility
NAS Data Science Roundtable: Training as a Pathway to Improve ReproducibilityNAS Data Science Roundtable: Training as a Pathway to Improve Reproducibility
NAS Data Science Roundtable: Training as a Pathway to Improve Reproducibility
 
Data carpentry instructor-onboarding
Data carpentry instructor-onboardingData carpentry instructor-onboarding
Data carpentry instructor-onboarding
 
Data carpentry run-a-workshop
Data carpentry run-a-workshopData carpentry run-a-workshop
Data carpentry run-a-workshop
 
NCME Big Data in Education
NCME Big Data  in EducationNCME Big Data  in Education
NCME Big Data in Education
 
Adopting data-driven strategies in learning analytics
Adopting data-driven strategies in learning analyticsAdopting data-driven strategies in learning analytics
Adopting data-driven strategies in learning analytics
 
Data Science Training and Placement
Data Science Training and PlacementData Science Training and Placement
Data Science Training and Placement
 
2-6-14 ESI Supplemental Webinar: The Data Information Literacy Project
2-6-14 ESI Supplemental Webinar: The Data Information  Literacy Project2-6-14 ESI Supplemental Webinar: The Data Information  Literacy Project
2-6-14 ESI Supplemental Webinar: The Data Information Literacy Project
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification course
 
Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)
 
Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)
 
Data science training in hydpdf converted (1)
Data science training in hydpdf  converted (1)Data science training in hydpdf  converted (1)
Data science training in hydpdf converted (1)
 
UCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference WorkshopUCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference Workshop
 
Developing Liaison Librarians Data-Intensive Research Engagement
Developing Liaison Librarians Data-Intensive Research EngagementDeveloping Liaison Librarians Data-Intensive Research Engagement
Developing Liaison Librarians Data-Intensive Research Engagement
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification course
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)
 
Data science training institute in hyderabad
Data science training institute in hyderabadData science training institute in hyderabad
Data science training institute in hyderabad
 
Data science training in Hyderabad
Data science  training in HyderabadData science  training in Hyderabad
Data science training in Hyderabad
 

Recently uploaded

BBA 205 UNIT 3 INDUSTRIAL POLICY dr kanchan.pptx
BBA 205 UNIT 3 INDUSTRIAL POLICY dr kanchan.pptxBBA 205 UNIT 3 INDUSTRIAL POLICY dr kanchan.pptx
BBA 205 UNIT 3 INDUSTRIAL POLICY dr kanchan.pptxProf. Kanchan Kumari
 
Geoffrey Chaucer Works II UGC NET JRF TGT PGT MA PHD Entrance Exam II History...
Geoffrey Chaucer Works II UGC NET JRF TGT PGT MA PHD Entrance Exam II History...Geoffrey Chaucer Works II UGC NET JRF TGT PGT MA PHD Entrance Exam II History...
Geoffrey Chaucer Works II UGC NET JRF TGT PGT MA PHD Entrance Exam II History...DrVipulVKapoor
 
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxCLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxAnupam32727
 
Pastoral Poetry, Definition, Origin, Characteristics and Examples
Pastoral Poetry, Definition, Origin, Characteristics and ExamplesPastoral Poetry, Definition, Origin, Characteristics and Examples
Pastoral Poetry, Definition, Origin, Characteristics and ExamplesDrVipulVKapoor
 
Unit :1 Basics of Professional Intelligence
Unit :1 Basics of Professional IntelligenceUnit :1 Basics of Professional Intelligence
Unit :1 Basics of Professional IntelligenceDr Vijay Vishwakarma
 
How to Uninstall a Module in Odoo 17 Using Command Line
How to Uninstall a Module in Odoo 17 Using Command LineHow to Uninstall a Module in Odoo 17 Using Command Line
How to Uninstall a Module in Odoo 17 Using Command LineCeline George
 
BÀI TẬP BỔ TRỢ 4 KĨ NĂNG TIẾNG ANH LỚP 11 (CẢ NĂM) - FRIENDS GLOBAL - NĂM HỌC...
BÀI TẬP BỔ TRỢ 4 KĨ NĂNG TIẾNG ANH LỚP 11 (CẢ NĂM) - FRIENDS GLOBAL - NĂM HỌC...BÀI TẬP BỔ TRỢ 4 KĨ NĂNG TIẾNG ANH LỚP 11 (CẢ NĂM) - FRIENDS GLOBAL - NĂM HỌC...
BÀI TẬP BỔ TRỢ 4 KĨ NĂNG TIẾNG ANH LỚP 11 (CẢ NĂM) - FRIENDS GLOBAL - NĂM HỌC...Nguyen Thanh Tu Collection
 
The Emergence of Legislative Behavior in the Colombian Congress
The Emergence of Legislative Behavior in the Colombian CongressThe Emergence of Legislative Behavior in the Colombian Congress
The Emergence of Legislative Behavior in the Colombian CongressMaria Paula Aroca
 
Shark introduction Morphology and its behaviour characteristics
Shark introduction Morphology and its behaviour characteristicsShark introduction Morphology and its behaviour characteristics
Shark introduction Morphology and its behaviour characteristicsArubSultan
 
ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6Vanessa Camilleri
 
16. Discovery, function and commercial uses of different PGRS.pptx
16. Discovery, function and commercial uses of different PGRS.pptx16. Discovery, function and commercial uses of different PGRS.pptx
16. Discovery, function and commercial uses of different PGRS.pptxUmeshTimilsina1
 
Advancing Gender Equality The Crucial Role of Science and Technology 4 April ...
Advancing Gender Equality The Crucial Role of Science and Technology 4 April ...Advancing Gender Equality The Crucial Role of Science and Technology 4 April ...
Advancing Gender Equality The Crucial Role of Science and Technology 4 April ...EduSkills OECD
 
Sulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their usesSulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their usesVijayaLaxmi84
 
Transdisciplinary Pathways for Urban Resilience [Work in Progress].pptx
Transdisciplinary Pathways for Urban Resilience [Work in Progress].pptxTransdisciplinary Pathways for Urban Resilience [Work in Progress].pptx
Transdisciplinary Pathways for Urban Resilience [Work in Progress].pptxinfo924062
 
Paul Dobryden In Media Res Media Component
Paul Dobryden In Media Res Media ComponentPaul Dobryden In Media Res Media Component
Paul Dobryden In Media Res Media ComponentInMediaRes1
 
4.9.24 Social Capital and Social Exclusion.pptx
4.9.24 Social Capital and Social Exclusion.pptx4.9.24 Social Capital and Social Exclusion.pptx
4.9.24 Social Capital and Social Exclusion.pptxmary850239
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 

Recently uploaded (20)

BBA 205 UNIT 3 INDUSTRIAL POLICY dr kanchan.pptx
BBA 205 UNIT 3 INDUSTRIAL POLICY dr kanchan.pptxBBA 205 UNIT 3 INDUSTRIAL POLICY dr kanchan.pptx
BBA 205 UNIT 3 INDUSTRIAL POLICY dr kanchan.pptx
 
Geoffrey Chaucer Works II UGC NET JRF TGT PGT MA PHD Entrance Exam II History...
Geoffrey Chaucer Works II UGC NET JRF TGT PGT MA PHD Entrance Exam II History...Geoffrey Chaucer Works II UGC NET JRF TGT PGT MA PHD Entrance Exam II History...
Geoffrey Chaucer Works II UGC NET JRF TGT PGT MA PHD Entrance Exam II History...
 
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxCLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
 
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
 
Pastoral Poetry, Definition, Origin, Characteristics and Examples
Pastoral Poetry, Definition, Origin, Characteristics and ExamplesPastoral Poetry, Definition, Origin, Characteristics and Examples
Pastoral Poetry, Definition, Origin, Characteristics and Examples
 
Unit :1 Basics of Professional Intelligence
Unit :1 Basics of Professional IntelligenceUnit :1 Basics of Professional Intelligence
Unit :1 Basics of Professional Intelligence
 
How to Uninstall a Module in Odoo 17 Using Command Line
How to Uninstall a Module in Odoo 17 Using Command LineHow to Uninstall a Module in Odoo 17 Using Command Line
How to Uninstall a Module in Odoo 17 Using Command Line
 
BÀI TẬP BỔ TRỢ 4 KĨ NĂNG TIẾNG ANH LỚP 11 (CẢ NĂM) - FRIENDS GLOBAL - NĂM HỌC...
BÀI TẬP BỔ TRỢ 4 KĨ NĂNG TIẾNG ANH LỚP 11 (CẢ NĂM) - FRIENDS GLOBAL - NĂM HỌC...BÀI TẬP BỔ TRỢ 4 KĨ NĂNG TIẾNG ANH LỚP 11 (CẢ NĂM) - FRIENDS GLOBAL - NĂM HỌC...
BÀI TẬP BỔ TRỢ 4 KĨ NĂNG TIẾNG ANH LỚP 11 (CẢ NĂM) - FRIENDS GLOBAL - NĂM HỌC...
 
The Emergence of Legislative Behavior in the Colombian Congress
The Emergence of Legislative Behavior in the Colombian CongressThe Emergence of Legislative Behavior in the Colombian Congress
The Emergence of Legislative Behavior in the Colombian Congress
 
Shark introduction Morphology and its behaviour characteristics
Shark introduction Morphology and its behaviour characteristicsShark introduction Morphology and its behaviour characteristics
Shark introduction Morphology and its behaviour characteristics
 
ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6
 
16. Discovery, function and commercial uses of different PGRS.pptx
16. Discovery, function and commercial uses of different PGRS.pptx16. Discovery, function and commercial uses of different PGRS.pptx
16. Discovery, function and commercial uses of different PGRS.pptx
 
Advancing Gender Equality The Crucial Role of Science and Technology 4 April ...
Advancing Gender Equality The Crucial Role of Science and Technology 4 April ...Advancing Gender Equality The Crucial Role of Science and Technology 4 April ...
Advancing Gender Equality The Crucial Role of Science and Technology 4 April ...
 
Introduction to Research ,Need for research, Need for design of Experiments, ...
Introduction to Research ,Need for research, Need for design of Experiments, ...Introduction to Research ,Need for research, Need for design of Experiments, ...
Introduction to Research ,Need for research, Need for design of Experiments, ...
 
Sulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their usesSulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their uses
 
CARNAVAL COM MAGIA E EUFORIA _
CARNAVAL COM MAGIA E EUFORIA            _CARNAVAL COM MAGIA E EUFORIA            _
CARNAVAL COM MAGIA E EUFORIA _
 
Transdisciplinary Pathways for Urban Resilience [Work in Progress].pptx
Transdisciplinary Pathways for Urban Resilience [Work in Progress].pptxTransdisciplinary Pathways for Urban Resilience [Work in Progress].pptx
Transdisciplinary Pathways for Urban Resilience [Work in Progress].pptx
 
Paul Dobryden In Media Res Media Component
Paul Dobryden In Media Res Media ComponentPaul Dobryden In Media Res Media Component
Paul Dobryden In Media Res Media Component
 
4.9.24 Social Capital and Social Exclusion.pptx
4.9.24 Social Capital and Social Exclusion.pptx4.9.24 Social Capital and Social Exclusion.pptx
4.9.24 Social Capital and Social Exclusion.pptx
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 

Data carpentry replicathon_2017-03-24

  • 1. Democratizing data skills to advance data-driven discovery Tracy K. Teal, PhD @tracykteal, @datacarpentry https://github.com/tracykteal/datacarpentry-presentations
  • 2. Our increasing capacity to collect data is changing how we can look at the world
  • 3. Software and tools allow us to turn data into information.
  • 4. People turn information into knowledge.
  • 5. We can unleash the potential of data by empowering people
  • 6. It’s challenging to manage & analyze this data
  • 7. Researchers view the major limiting factor in research progress as a lack of expertise in how to handle and analyze data http://braembl.org.au/news/braembl-community-survey-report-2013
  • 8. Current unmet needs Barone L, Williams J and Micklos D. Unmet Needs for Analyzing Biological Big Data: A Survey of 704 NSF Principal Investigators (2017)
  • 9. How do we scale data skills along with data production?
  • 10. Build local communities and capacity
  • 11. Software and Data Carpentry Building community teaching universal data literacy
  • 12. Grassroots training effort - Developed by practitioners for practitioners - Identify skills and best practices needed in software development, data management and analysis in given domains - Collaboratively and iteratively developed openly licensed (CC-BY) training materials (all available on github) - Organize and deliver two-day, intensive hands-on workshops in fundamental data analysis skills using a pool of volunteer helpers and instructors
  • 13. Software and Data Carpentry • Core skills for effective research computing • Two-day hands-on workshops • Collaboratively developed, openly licensed lesson materials • Over 800 trained volunteer instructors on 6 continents
  • 14. Software & Data Carpentry workshops We know we can’t teach everything in two days, but the goal is to teach foundational skills to reduce the activation energy for getting started and for people to know what’s possible.
  • 15. Workshop goals • Primary goal is increased confidence in a learners ability to do computational work and continue to learn more. • Lower the activation barrier for people to get started working with data. Often people don’t know where to start. • The skills we teach should be ones they can immediately apply to their research. • People should learn the things we’re teaching. • We should see a shift in perspective in the value of skills like scripting for better research and reproducible research. • People should have a positive workshop experience that empowers them to do better work with data.
  • 16.
  • 17. Hands-on intensive workshops • Two days • Hands-on • Qualified instructors • Helpers • Post-it notes! • Friendly learning environment
  • 19. • Focused on data - teaches how to manage and analyze data in an effective and reproducible way. • Initial focus is on workshops for novices - there are no prerequisites, and no prior knowledge computational experience is assumed. • Domain specific by design – currently have lessons in ecology, in genomics developed with iPlant, in geospatial data developed with NEON, and on APIs with rOpenSci Curriculum • Project and data organization • Data cleaning and quality control • Data analysis and visualization in R or Python Domain dependent • Introduction to cloud computing • Command line • Text mining
  • 20. Curriculum • Command line • Software development best practices in R, Python or MATLAB • Github for version control • Focused on better scientific programming practices, for writing software or analysis scripts • Novice and more advanced levels • Domain agnostic
  • 21. Volunteer instructors The Software Carpentry Foundation runs the Instructor Training program that teaches volunteers teaching pedagogy and specific strategies for teaching computational skills in a workshop format.
  • 22. Training is a missing piece between data collection & data-driven discovery Training
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28. If you want to go fast, go alone. If you want to go far, go together.
  • 29. Get Involved  Request a workshop  Be a helper at a workshop  Become a partner or affiliate and train local instructors  Contribute to lesson development
  • 30. Guiding the Carpentries Software Carpentry Steering Committee: Karin Lagenstrom (Norway) Rayna Harris (UT Austin) Susan McClatchy (Jackson Labs) Kate Hertweck (UT Tyler) Christina Koch (University of Wisconsin) Mateusz Kuzak (eScience Netherlands) Belinda Weaver (Australia) Data Carpentry Steering Committee: Karen Cranston (Duke / OpenTree of Life) Hilmar Lapp (Duke) Aleksandra Pawlik (New Zealand eScience) Karthik Ram (rOpenSci / Berkeley Institute of Data Science Fellow) Ethan White (University of Florida / Moore DDD Investigator)

Editor's Notes

  1. By making data accessible and putting the data skills and the perspectives in the hands of all, we allow people to answer their own questions and capture their passion and expert knowledge.
  2. Once you get the data back though, often instead of being seen as an exciting opportunity, it’s seen as an insurmountable challenge how to manage, store, analyze or even view the data.
  3. , but for instance … Biomedical researchers in the UK had similar sentiments in a survey https://storify.com/biocrusoe/elixir-uk-node-meeting-2014-10-14 And an internal study at PLOS showed a lack of data management expertise has a dramatic impact on the ability of researchers to share their data
  4. Also we see in our workshops. Fill within hours. For instance a recent workshop at the NIH. Had 20 spots. Had a wait list of 84 people.
  5. Deliverables are lessons and workshops
  6. More than a focus on particular analysis techniques, workshops are focused on data management and what might be called data wrangling, gettting your data in to and out of analysis programs, formatting the data for analysis, conducting visualization We want to teach not only skills but affect attitudes – emphasizing importance and value of these approaches for effective and reproducible research
  7. Researchers need to be trained in how to effectively manage and use this data. Researchers don’t know how to address the questions they want to ask or aren’t even aware of the types of questions they can ask with this data. Story about genomics researcher It’s no longer the case where we’re saying ‘researchers should’. Researcher themselves want this training, feel limited by their own current data analysis skills. You likely have anecdotal evidence of this in your own community or domain
  8. Truly collaborative effort