Teaching Data Science to Undergraduate Students
Nicole Vasilevsky
Oregon Health & Science University Library
From Evidence to Scholarship Conference
Reed College
March 16, 2018
The Data Deluge
http://jooinn.com/the-big-data-deluge.html
Issues Data producers Data users
How do we ask the right research question? ✔ ✔
How do we best manage, utilize, and interpret new knowledge from all this data? ✔ ✔
How do we ensure our data is reusable and reproducible? ✔
How do we access and reuse data from other sources? ✔
Data science is an emerging interdisciplinary field
where researchers are trained to extract
knowledge from data, to make it more structured
and re-usable, and to garner new insights
https://www.quora.com/Will-learning-Tableau-help-me-become-a-better-data-scientist
Data Science is interdisciplinary
Addressing the need for data science training
• Data and Donuts: 1-2 day workshops offered to OHSU summer interns
• OHSU Library Data Science Institute: 3 day workshop offered to researchers and librarians
• Open Educational Resources: Freely available, online training materials
Student demographics
• OHSU is a biomedical university
• Medicine
• Nursing
• Public Health
• Basic science research
• etc
• OERs were funded by NIH and focus
on biomedical approaches
Open Educational Resources
NIH Big Data to Knowledge (BD2K) Program
One goal of the NIH BD2K initiative is to provide training for students and
researchers to address challenges in managing, analyzing, and interpreting big
data
A research team in the Library and the Department of Medical Informatics and
Clinical Epidemiology (DMICE) developed open educational resources
(OERs) and skills courses
Our Approach
OERs and Skills Courses connect the dots that help researchers understand how to apply
data science techniques in the context of their whole research life cycle
Modules aim to fill specific gaps in the research process
• Finding resources/data
• Introduction to Big Data
• Managing data and applying data standards
• Ethics and regulatory issues
• Methods for analysis, visualization, interpretation
• Collaboration and team science
• Sharing and Dissemination
https://github.com/OHSUBD2K/
Data and Donuts
Think like a data scientist - the Data and Donuts workshop
will provide an introduction to data science for those new
to research. Summer interns encouraged to attend!
Topics covered will include
• What is Big Data?
• Asking the right question and getting the right
answers from your data
• Finding data resources in the real world
• Data handling 101
• Ethics of data
• Communicating your science for maximal
impact
June 28 & 29 | 9 - 12 PM | Donuts!
Free Workshop!
DataAndDonuts
Interested?
Register at http://bit.ly/1sfDeXz
or email wirzj@ohsu.edu
www.ohsu.edu/bd2k
Lessons learned
Coffee
1 Teaching RDM is challenging
2 Interactive exercises
3 Games
4 Donuts are a hit
OHSU Library Data Science Institute
Structure of the institute/schedule
Day 1 Day 2 Day 3
• Introduction to Command line/GitHub
• Data Exploration and Statistics
• Data description
• Research Data Standards
• Data Sharing and Reuse
• Mixed methods: Quantitative and Qualitative research
• Analyzing textual data
• Web scraping
• Mapping and Geospatial Visualization with QGIS
• Data visualization
Lessons learned
Coffee
1 Train the trainer
2 Targeted audience
3 Be adaptable
4 Coffee!
http://sites.nationalacademies.org/deps/bmsa/deps_180066
Conclusion and Future Work
Conclusion
• Have successfully offered short trainings to undergrads, OHSU
students, researchers, librarians, and others
• A lot of lessons learned along the way
• Big demand for data science training
• Training sessions should be hands-on and interactive
Next steps
• Data and Donuts again this summer for OHSU summer interns
• Plan a future OHSU Library Data Science Institute?
• BioData Club
• Data Jamborees
• Encourage usage of our Open Educational Resources
• We are open to new collaborations
GCC/BOSC 2018, Reed College, June 25-30, 2018
https://galaxyproject.org/events/gccbosc2018/
Resources
• OHSU Library Data Science Institute: https://ohsulibrary-datascienceinstitute.github.io/
• BioData Club: https://biodata-club.github.io/
• Data Jamboree: http://www.ohsu.edu/xd/education/schools/school-of-medicine/departments/computational-biology/events/data-jamboree.cfm
• Open Educational Resources: https://github.com/OHSUBD2K/
https://github.com/OHSUBD2K/Presentations
Acknowledgements
The OHSU Library Data Science Institute was supported by NNLM PNR under the National Library of Medicine
(NLM), National Institutes of Health (NIH) cooperative agreement number UG4LM012343. Data and Donuts and the
Open Educational Resources were supported by NIH Grants 1R25EB020379-01 and 1R25GM114820-01.
Bill Hersh
Melissa Haendel
David Dorr
Shannon McWeeney
Bjorn Pederson
Jackie Wirz
Robin Champieux
Letisha Wyatt
Laura Zeigan
Ted Laderas
You can find me at:
@n_vasilevsky
vasilevs@ohsu.edu
Thanks!
https://github.com/OHSUBD2K/Presentations

More Related Content

What's hot

RDAP14: Collaboration and tension between institutions and units providing da...
RDAP14: Collaboration and tension between institutions and units providing da...RDAP14: Collaboration and tension between institutions and units providing da...
RDAP14: Collaboration and tension between institutions and units providing da...ASIS&T
 
Building the Future of Research Together
Building the Future of Research TogetherBuilding the Future of Research Together
Building the Future of Research TogetherIUPUI
 
TeachingWithData.org -- Faculty Presentation
TeachingWithData.org -- Faculty PresentationTeachingWithData.org -- Faculty Presentation
TeachingWithData.org -- Faculty PresentationICPSR
 
Figshare for institutions - Jisc Digifest 2016
Figshare for institutions - Jisc Digifest 2016Figshare for institutions - Jisc Digifest 2016
Figshare for institutions - Jisc Digifest 2016Jisc
 
Research Data Services in European Libraries: Current Offerings and Plans for...
Research Data Services in European Libraries: Current Offerings and Plans for...Research Data Services in European Libraries: Current Offerings and Plans for...
Research Data Services in European Libraries: Current Offerings and Plans for...LIBER Europe
 
RDAP14: Data on a dime, building data services at James Madison University
RDAP14: Data on a dime, building data services at James Madison University RDAP14: Data on a dime, building data services at James Madison University
RDAP14: Data on a dime, building data services at James Madison University ASIS&T
 
Capturing and Analyzing Publication, Citation and Usage Data for Contextual C...
Capturing and Analyzing Publication, Citation and Usage Data for Contextual C...Capturing and Analyzing Publication, Citation and Usage Data for Contextual C...
Capturing and Analyzing Publication, Citation and Usage Data for Contextual C...NASIG
 
RDAP14: Policy Recommendations for Institutions to Serve as Trustworthy Stewa...
RDAP14: Policy Recommendations for Institutions to Serve as Trustworthy Stewa...RDAP14: Policy Recommendations for Institutions to Serve as Trustworthy Stewa...
RDAP14: Policy Recommendations for Institutions to Serve as Trustworthy Stewa...ASIS&T
 
Research Support Services ECU Library
Research Support Services ECU LibraryResearch Support Services ECU Library
Research Support Services ECU LibraryJulia Gross
 
Getting on with it (research support at an academic library) presented at Uni...
Getting on with it (research support at an academic library) presented at Uni...Getting on with it (research support at an academic library) presented at Uni...
Getting on with it (research support at an academic library) presented at Uni...Reed Elsevier
 
RDAP14: Developing an RDM Educational Service Using the New England Collabora...
RDAP14: Developing an RDM Educational Service Using the New England Collabora...RDAP14: Developing an RDM Educational Service Using the New England Collabora...
RDAP14: Developing an RDM Educational Service Using the New England Collabora...ASIS&T
 
FAIR for the future: embracing all things data
FAIR for the future: embracing all things dataFAIR for the future: embracing all things data
FAIR for the future: embracing all things dataARDC
 
RDAP14: Comparing disciplinary repositories: tDAR vs. Open Context
RDAP14: Comparing disciplinary repositories: tDAR vs. Open ContextRDAP14: Comparing disciplinary repositories: tDAR vs. Open Context
RDAP14: Comparing disciplinary repositories: tDAR vs. Open ContextASIS&T
 

What's hot (20)

RDAP14: Collaboration and tension between institutions and units providing da...
RDAP14: Collaboration and tension between institutions and units providing da...RDAP14: Collaboration and tension between institutions and units providing da...
RDAP14: Collaboration and tension between institutions and units providing da...
 
Emerging roles and collaborations in research support for academic health lib...
Emerging roles and collaborations in research support for academic health lib...Emerging roles and collaborations in research support for academic health lib...
Emerging roles and collaborations in research support for academic health lib...
 
Building the Future of Research Together
Building the Future of Research TogetherBuilding the Future of Research Together
Building the Future of Research Together
 
Lee "Supporting Research Data is a Group Effort"
Lee "Supporting Research Data is a Group Effort"Lee "Supporting Research Data is a Group Effort"
Lee "Supporting Research Data is a Group Effort"
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
 
TeachingWithData.org -- Faculty Presentation
TeachingWithData.org -- Faculty PresentationTeachingWithData.org -- Faculty Presentation
TeachingWithData.org -- Faculty Presentation
 
Figshare for institutions - Jisc Digifest 2016
Figshare for institutions - Jisc Digifest 2016Figshare for institutions - Jisc Digifest 2016
Figshare for institutions - Jisc Digifest 2016
 
Research Data Services in European Libraries: Current Offerings and Plans for...
Research Data Services in European Libraries: Current Offerings and Plans for...Research Data Services in European Libraries: Current Offerings and Plans for...
Research Data Services in European Libraries: Current Offerings and Plans for...
 
RDAP14: Data on a dime, building data services at James Madison University
RDAP14: Data on a dime, building data services at James Madison University RDAP14: Data on a dime, building data services at James Madison University
RDAP14: Data on a dime, building data services at James Madison University
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
Data are the New Black
Data are the New BlackData are the New Black
Data are the New Black
 
Capturing and Analyzing Publication, Citation and Usage Data for Contextual C...
Capturing and Analyzing Publication, Citation and Usage Data for Contextual C...Capturing and Analyzing Publication, Citation and Usage Data for Contextual C...
Capturing and Analyzing Publication, Citation and Usage Data for Contextual C...
 
RDAP14: Policy Recommendations for Institutions to Serve as Trustworthy Stewa...
RDAP14: Policy Recommendations for Institutions to Serve as Trustworthy Stewa...RDAP14: Policy Recommendations for Institutions to Serve as Trustworthy Stewa...
RDAP14: Policy Recommendations for Institutions to Serve as Trustworthy Stewa...
 
Research Support Services ECU Library
Research Support Services ECU LibraryResearch Support Services ECU Library
Research Support Services ECU Library
 
Maxwell "Lessons Learned from Developing a Predictive Analytics Data Model"
Maxwell "Lessons Learned from Developing a Predictive Analytics Data Model"Maxwell "Lessons Learned from Developing a Predictive Analytics Data Model"
Maxwell "Lessons Learned from Developing a Predictive Analytics Data Model"
 
Getting on with it (research support at an academic library) presented at Uni...
Getting on with it (research support at an academic library) presented at Uni...Getting on with it (research support at an academic library) presented at Uni...
Getting on with it (research support at an academic library) presented at Uni...
 
Strasser "Effective data management and its role in open research"
Strasser "Effective data management and its role in open research"Strasser "Effective data management and its role in open research"
Strasser "Effective data management and its role in open research"
 
RDAP14: Developing an RDM Educational Service Using the New England Collabora...
RDAP14: Developing an RDM Educational Service Using the New England Collabora...RDAP14: Developing an RDM Educational Service Using the New England Collabora...
RDAP14: Developing an RDM Educational Service Using the New England Collabora...
 
FAIR for the future: embracing all things data
FAIR for the future: embracing all things dataFAIR for the future: embracing all things data
FAIR for the future: embracing all things data
 
RDAP14: Comparing disciplinary repositories: tDAR vs. Open Context
RDAP14: Comparing disciplinary repositories: tDAR vs. Open ContextRDAP14: Comparing disciplinary repositories: tDAR vs. Open Context
RDAP14: Comparing disciplinary repositories: tDAR vs. Open Context
 

Similar to Teaching Data Science to Undergraduate Students

Immersive informatics - research data management at Pitt iSchool and Carnegie...
Immersive informatics - research data management at Pitt iSchool and Carnegie...Immersive informatics - research data management at Pitt iSchool and Carnegie...
Immersive informatics - research data management at Pitt iSchool and Carnegie...Keith Webster
 
Research Data Management in Academic Libraries: Meeting the Challenge
Research Data Management in Academic Libraries: Meeting the ChallengeResearch Data Management in Academic Libraries: Meeting the Challenge
Research Data Management in Academic Libraries: Meeting the ChallengeSpencer Keralis
 
Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries? Robin Rice
 
Data science education resources for everyone
Data science education resources for everyoneData science education resources for everyone
Data science education resources for everyoneNicole Vasilevsky
 
Next generation data services at the Marriott Library
Next generation data services at the Marriott LibraryNext generation data services at the Marriott Library
Next generation data services at the Marriott LibraryRebekah Cummings
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Susanna-Assunta Sansone
 
Workshop 4: Open Science & Open Data for Librarians/Ina Smith
Workshop 4: Open Science & Open Data for Librarians/Ina SmithWorkshop 4: Open Science & Open Data for Librarians/Ina Smith
Workshop 4: Open Science & Open Data for Librarians/Ina SmithAfrican Open Science Platform
 
Open Access to Research Data: Challenges and Solutions
Open Access to Research Data: Challenges and SolutionsOpen Access to Research Data: Challenges and Solutions
Open Access to Research Data: Challenges and SolutionsMartin Donnelly
 
Fsci 2018 monday30_july_am6
Fsci 2018 monday30_july_am6Fsci 2018 monday30_july_am6
Fsci 2018 monday30_july_am6ARDC
 
Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...EDINA, University of Edinburgh
 
Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Robin Rice
 
Supporting the development of a national Research Data Discovery Service - A ...
Supporting the development of a national Research Data Discovery Service - A ...Supporting the development of a national Research Data Discovery Service - A ...
Supporting the development of a national Research Data Discovery Service - A ...Historic Environment Scotland
 
Data management: The new frontier for libraries
Data management: The new frontier for librariesData management: The new frontier for libraries
Data management: The new frontier for librariesLEARN Project
 
RDAP14: Learning to Curate Panel
RDAP14: Learning to Curate Panel RDAP14: Learning to Curate Panel
RDAP14: Learning to Curate Panel ASIS&T
 
Ucla july 2018 natasha simons
Ucla july 2018 natasha simonsUcla july 2018 natasha simons
Ucla july 2018 natasha simonsARDC
 
ICPSR presentation for Research Advisory
ICPSR presentation for Research AdvisoryICPSR presentation for Research Advisory
ICPSR presentation for Research AdvisoryLynda Kellam
 
Big Data Curricula at the UW eScience Institute, JSM 2013
Big Data Curricula at the UW eScience Institute, JSM 2013Big Data Curricula at the UW eScience Institute, JSM 2013
Big Data Curricula at the UW eScience Institute, JSM 2013University of Washington
 
The Challenges of Making Data Travel, by Sabina Leonelli
The Challenges of Making Data Travel, by Sabina LeonelliThe Challenges of Making Data Travel, by Sabina Leonelli
The Challenges of Making Data Travel, by Sabina LeonelliLEARN Project
 

Similar to Teaching Data Science to Undergraduate Students (20)

Immersive informatics - research data management at Pitt iSchool and Carnegie...
Immersive informatics - research data management at Pitt iSchool and Carnegie...Immersive informatics - research data management at Pitt iSchool and Carnegie...
Immersive informatics - research data management at Pitt iSchool and Carnegie...
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
Research Data Management in Academic Libraries: Meeting the Challenge
Research Data Management in Academic Libraries: Meeting the ChallengeResearch Data Management in Academic Libraries: Meeting the Challenge
Research Data Management in Academic Libraries: Meeting the Challenge
 
Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?
 
Data science education resources for everyone
Data science education resources for everyoneData science education resources for everyone
Data science education resources for everyone
 
Next generation data services at the Marriott Library
Next generation data services at the Marriott LibraryNext generation data services at the Marriott Library
Next generation data services at the Marriott Library
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014
 
Workshop 4: Open Science & Open Data for Librarians/Ina Smith
Workshop 4: Open Science & Open Data for Librarians/Ina SmithWorkshop 4: Open Science & Open Data for Librarians/Ina Smith
Workshop 4: Open Science & Open Data for Librarians/Ina Smith
 
Open Access to Research Data: Challenges and Solutions
Open Access to Research Data: Challenges and SolutionsOpen Access to Research Data: Challenges and Solutions
Open Access to Research Data: Challenges and Solutions
 
Fsci 2018 monday30_july_am6
Fsci 2018 monday30_july_am6Fsci 2018 monday30_july_am6
Fsci 2018 monday30_july_am6
 
Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...
 
Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...
 
Supporting the development of a national Research Data Discovery Service - A ...
Supporting the development of a national Research Data Discovery Service - A ...Supporting the development of a national Research Data Discovery Service - A ...
Supporting the development of a national Research Data Discovery Service - A ...
 
Data management: The new frontier for libraries
Data management: The new frontier for librariesData management: The new frontier for libraries
Data management: The new frontier for libraries
 
RDAP14: Learning to Curate Panel
RDAP14: Learning to Curate Panel RDAP14: Learning to Curate Panel
RDAP14: Learning to Curate Panel
 
Pace "How the Community Wants to Serve Its Constituents"
Pace "How the Community Wants to Serve Its Constituents"Pace "How the Community Wants to Serve Its Constituents"
Pace "How the Community Wants to Serve Its Constituents"
 
Ucla july 2018 natasha simons
Ucla july 2018 natasha simonsUcla july 2018 natasha simons
Ucla july 2018 natasha simons
 
ICPSR presentation for Research Advisory
ICPSR presentation for Research AdvisoryICPSR presentation for Research Advisory
ICPSR presentation for Research Advisory
 
Big Data Curricula at the UW eScience Institute, JSM 2013
Big Data Curricula at the UW eScience Institute, JSM 2013Big Data Curricula at the UW eScience Institute, JSM 2013
Big Data Curricula at the UW eScience Institute, JSM 2013
 
The Challenges of Making Data Travel, by Sabina Leonelli
The Challenges of Making Data Travel, by Sabina LeonelliThe Challenges of Making Data Travel, by Sabina Leonelli
The Challenges of Making Data Travel, by Sabina Leonelli
 

More from Nicole Vasilevsky

CRDC-H Draft Model Presentation to Nodes
CRDC-H Draft Model Presentation to NodesCRDC-H Draft Model Presentation to Nodes
CRDC-H Draft Model Presentation to NodesNicole Vasilevsky
 
Improving Knowledge Discovery Through Development of Big Data to Knowledge S...
Improving Knowledge Discovery Through Development of  Big Data to Knowledge S...Improving Knowledge Discovery Through Development of  Big Data to Knowledge S...
Improving Knowledge Discovery Through Development of Big Data to Knowledge S...Nicole Vasilevsky
 
Empowering patients by increasing accessibility to clinical terminology
Empowering patients by increasing accessibility to clinical terminologyEmpowering patients by increasing accessibility to clinical terminology
Empowering patients by increasing accessibility to clinical terminologyNicole Vasilevsky
 
Enhancing the Human Phenotype Ontology for Use by the Layperson
Enhancing the Human Phenotype Ontology for Use by the LaypersonEnhancing the Human Phenotype Ontology for Use by the Layperson
Enhancing the Human Phenotype Ontology for Use by the LaypersonNicole Vasilevsky
 
Enhancing the Human Phenotype Ontology for Use by the Layperson
Enhancing the Human Phenotype Ontology for Use by the LaypersonEnhancing the Human Phenotype Ontology for Use by the Layperson
Enhancing the Human Phenotype Ontology for Use by the LaypersonNicole Vasilevsky
 
Couture Curricula - BD2K Data Science Tailored to Your Needs
Couture Curricula - BD2K Data Science Tailored to Your NeedsCouture Curricula - BD2K Data Science Tailored to Your Needs
Couture Curricula - BD2K Data Science Tailored to Your NeedsNicole Vasilevsky
 
Monarch Initiative Poster - Rare Disease Symposium 2015
Monarch Initiative Poster - Rare Disease Symposium 2015Monarch Initiative Poster - Rare Disease Symposium 2015
Monarch Initiative Poster - Rare Disease Symposium 2015Nicole Vasilevsky
 
The Role of Libraries in Data Management and Curation
The Role of Libraries in Data Management and CurationThe Role of Libraries in Data Management and Curation
The Role of Libraries in Data Management and CurationNicole Vasilevsky
 
Resource Identification Initiative_RDA_March2014
Resource Identification Initiative_RDA_March2014 Resource Identification Initiative_RDA_March2014
Resource Identification Initiative_RDA_March2014 Nicole Vasilevsky
 
On the Reproducibility of Science: Unique Identification of Research Resourc...
On the Reproducibility of Science: Unique Identification of  Research Resourc...On the Reproducibility of Science: Unique Identification of  Research Resourc...
On the Reproducibility of Science: Unique Identification of Research Resourc...Nicole Vasilevsky
 
Research resources: curating the new eagle-i discovery system
Research resources: curating the new eagle-i discovery systemResearch resources: curating the new eagle-i discovery system
Research resources: curating the new eagle-i discovery systemNicole Vasilevsky
 

More from Nicole Vasilevsky (13)

CRDC-H Draft Model Presentation to Nodes
CRDC-H Draft Model Presentation to NodesCRDC-H Draft Model Presentation to Nodes
CRDC-H Draft Model Presentation to Nodes
 
An Introduction to CCDH
An Introduction to CCDHAn Introduction to CCDH
An Introduction to CCDH
 
Improving Knowledge Discovery Through Development of Big Data to Knowledge S...
Improving Knowledge Discovery Through Development of  Big Data to Knowledge S...Improving Knowledge Discovery Through Development of  Big Data to Knowledge S...
Improving Knowledge Discovery Through Development of Big Data to Knowledge S...
 
Empowering patients by increasing accessibility to clinical terminology
Empowering patients by increasing accessibility to clinical terminologyEmpowering patients by increasing accessibility to clinical terminology
Empowering patients by increasing accessibility to clinical terminology
 
Enhancing the Human Phenotype Ontology for Use by the Layperson
Enhancing the Human Phenotype Ontology for Use by the LaypersonEnhancing the Human Phenotype Ontology for Use by the Layperson
Enhancing the Human Phenotype Ontology for Use by the Layperson
 
Enhancing the Human Phenotype Ontology for Use by the Layperson
Enhancing the Human Phenotype Ontology for Use by the LaypersonEnhancing the Human Phenotype Ontology for Use by the Layperson
Enhancing the Human Phenotype Ontology for Use by the Layperson
 
Couture Curricula - BD2K Data Science Tailored to Your Needs
Couture Curricula - BD2K Data Science Tailored to Your NeedsCouture Curricula - BD2K Data Science Tailored to Your Needs
Couture Curricula - BD2K Data Science Tailored to Your Needs
 
Monarch Initiative Poster - Rare Disease Symposium 2015
Monarch Initiative Poster - Rare Disease Symposium 2015Monarch Initiative Poster - Rare Disease Symposium 2015
Monarch Initiative Poster - Rare Disease Symposium 2015
 
Acrl march2015 final
Acrl march2015 finalAcrl march2015 final
Acrl march2015 final
 
The Role of Libraries in Data Management and Curation
The Role of Libraries in Data Management and CurationThe Role of Libraries in Data Management and Curation
The Role of Libraries in Data Management and Curation
 
Resource Identification Initiative_RDA_March2014
Resource Identification Initiative_RDA_March2014 Resource Identification Initiative_RDA_March2014
Resource Identification Initiative_RDA_March2014
 
On the Reproducibility of Science: Unique Identification of Research Resourc...
On the Reproducibility of Science: Unique Identification of  Research Resourc...On the Reproducibility of Science: Unique Identification of  Research Resourc...
On the Reproducibility of Science: Unique Identification of Research Resourc...
 
Research resources: curating the new eagle-i discovery system
Research resources: curating the new eagle-i discovery systemResearch resources: curating the new eagle-i discovery system
Research resources: curating the new eagle-i discovery system
 

Recently uploaded

6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfrahulyadav957181
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
Teaching Data Science to Undergraduate Students

  • 1. Teaching Data Science to Undergraduate Students Nicole Vasilevsky Oregon Health & Science University Library From Evidence to Scholarship Conference Reed College March 16, 2018
  • 2. The Data Deluge http://jooinn.com/the-big-data-deluge.html Issues Data producers Data Users How do we ask the right research question? ✔ ✔ How do we best manage, utilize, and interpret new knowledge from all this data? ✔ ✔ How do we ensure our data is reusable and reproducible? ✔ How do we access and reuse data from other sources? ✔
  • 3. Data science is an emerging interdisciplinary field where researchers are trained to extract knowledge from data, to make it more structured and re-usable, and to garner new insights
  • 5. Data and Donuts OHSU Library Data Science Institute Open Educational Resources 1-2 day workshops offered to OHSU summer interns 3 day workshop offered to researchers and librarians Freely available, online training materials Addressing the need for data science training
  • 6. Student demographics • OHSU is a biomedical university • Medicine • Nursing • Public Health • Basic science research • etc. • OERs were funded by NIH and focus on biomedical approaches
  • 8. NIH Big Data to Knowledge (BD2K) Program One goal of the NIH BD2K initiative is to provide training for students and researchers to address challenges in managing, analyzing, and interpreting big data. A research team in the Library and the Department of Medical Informatics and Clinical Epidemiology (DMICE) developed open educational resources (OERs) and skills courses
  • 9. Our Approach OERs and Skills Courses connect the dots that help researchers understand how to apply data science techniques in the context of their whole research life cycle
  • 10. Modules aim to fill specific gaps in the research process • Finding resources/data • Introduction to Big Data • Managing data and applying data standards • Ethics and regulatory issues • Methods for analysis, visualization, interpretation • Collaboration and team science • Sharing and Dissemination https://github.com/OHSUBD2K/
  • 12. Think like a data scientist - the Data and Donuts workshop will provide an introduction to data science for those new to research. Summer interns encouraged to attend! Topics covered will include • What is Big Data? • Asking the right question and getting the right answers from your data • Finding data resources in the real world • Data handling 101 • Ethics of data • Communicating your science for maximal impact June 28 & 29 | 9 - 12 PM | Donuts! Free Workshop! DataAndDonuts Interested? Register at http://bit.ly/1sfDeXz or email wirzj@ohsu.edu www.ohsu.edu/bd2k
  • 13. Lessons learned 1. Teaching RDM is challenging 2. Interactive exercises 3. Games 4. Donuts are a hit
  • 14. OHSU Library Data Science Institute
  • 15.
  • 16. Structure of the institute/schedule Day 1 Day 2 Day 3 • Introduction to Command line/GitHub • Data Exploration and Statistics • Data description • Research Data Standards • Data Sharing and Reuse • Mixed methods: Quantitative and Qualitative research • Analyzing textual data • Web scraping • Mapping and Geospatial Visualization with QGIS Data visualization
  • 17. Lessons learned 1. Train the trainer 2. Targeted audience 3. Be adaptable 4. Coffee! http://sites.nationalacademies.org/deps/bmsa/deps_180066
  • 18. Conclusion and Future Work Conclusion • Successfully offered short trainings to undergrads, OHSU students, researchers, librarians, and others • A lot of lessons learned along the way • Big demand for data science training • Training sessions should be hands-on and interactive Next steps • Data and Donuts again this summer for OHSU summer interns • Plan a future OHSU Library Data Science Institute? • BioData Club • Data Jamborees • Encourage usage of our Open Educational Resources • We are open to new collaborations
  • 19. GCC/BOSC 2018, Reed College, June 25-30, 2018 https://galaxyproject.org/events/gccbosc2018/
  • 20. Resources • OHSU Library Data Science Institute: https://ohsulibrary-datascienceinstitute.github.io/ • BioData Club: https://biodata-club.github.io/ • Data Jamboree: http://www.ohsu.edu/xd/education/schools/school-of-medicine/departments/computational-biology/events/data-jamboree.cfm • Open Educational Resources: https://github.com/OHSUBD2K/ https://github.com/OHSUBD2K/Presentations
  • 21. Acknowledgements The OHSU Library Data Science Institute was supported by NNLM PNR under the National Library of Medicine (NLM), National Institutes of Health (NIH) cooperative agreement number UG4LM012343. Data and Donuts and the Open Educational Resources were supported by NIH Grants 1R25EB020379-01 and 1R25GM114820-01. Bill Hersh, Melissa Haendel, David Dorr, Shannon McWeeney, Bjorn Pederson, Jackie Wirz, Robin Champieux, Letisha Wyatt, Laura Zeigan, Ted Laderas
  • 22. You can find me at: @n_vasilevsky vasilevs@ohsu.edu Thanks! https://github.com/OHSUBD2K/Presentations

Editor's Notes

  1. Image credits: Big data: Josh, The Noun Project (https://thenounproject.com/jkdubb/) Finding resources, By creative outlet, The Noun Project (https://thenounproject.com/creativeoutlet/) Scale: Ralf Schmitzer, The Noun Project (https://thenounproject.com/ralfschmitzer/) Information download: By Vectors Market Analysis: By Yamini Ahluwalia, GB Share: By Anand A Nair Team: By Rockicon, The Noun Project (https://thenounproject.com/rockicon/)
  2. Add text on slide