Data science education resources for everyone

  1. Data science education resources for everyone. Nicole Vasilevsky, Jackie Wirz, Bjorn Pederson, Ted Laderas, Shannon McWeeney, William Hersh, David A. Dorr, Melissa Haendel. Oregon Health & Science University. MLA/PNC 2016.
  2. The problem. Major challenge: how to manage, analyze, and interpret the vast amounts of data being generated in biomedical research. One goal of the NIH Big Data to Knowledge (BD2K) initiative is to provide training for students and researchers to address this challenge. A research team in the Library and the Department of Medical Informatics and Clinical Epidemiology (DMICE) is developing skills courses and open educational resources (OERs).
  3. Our Approach. Skills courses and OERs connect the dots, helping researchers understand how to apply data science techniques in the context of their whole research life cycle. Skills course and OER topics are chosen to fill specific gaps.
  4. BD2K Skills Courses. Taught by BD2K faculty, post-docs, and staff. In-person format. Targeted to a variety of students.
  5. Defining The Problem: problems amenable to analytics; importance of question; team definitions; scope; when we do this wrong: methods don't match
     Data Identification And Resources: finding the right data; search methods; use of metadata
     Wrangling Data: data management; exploratory data analysis; data dictionary; as you touch data, what can go wrong?
     Methods, Tools And Analysis: visualization; matching algorithms to problems …
     Scientific Communication: reporting findings and limitations; giving an “elevator speech” on ideas of how to approach the problem; critique of a related problem
  6. Course Offerings
     Course | Length | When | Who | What
     Intro Course | Week-long course (~40 hrs) | July 2015 | Interns and undergraduates | Taught basics of data science in the context of the research life cycle
     Data After Dark | 2-evening course (4 hrs/night) | January 2016 | OHSU students, staff and faculty | Emerging data science activities/research impact
     Data and Donuts | 2-morning course (3 hrs/day) | June 2016 | OHSU summer interns | Basics of data science
     Advanced Course | 4-evening course (2 hrs/night) | May 2016 | OHSU students, staff and faculty | Hands-on data viz / data wrangling
     Data and Donuts West | 4-hour course | July 2016 | OHSU summer interns (West Campus) | Basics of data science
  7. Think like a data scientist - the Data and Donuts workshop will provide an introduction to data science for those new to research. Summer interns encouraged to attend! Topics covered will include: • What is Big Data? • Asking the right question and getting the right answers from your data • Finding data resources in the real world • Data handling 101 • Ethics of data • Communicating your science for maximal impact. June 28 & 29 | 9-12 PM | Donuts! Free Workshop! DataAndDonuts. Interested? Register at http : // 1sfDeXz or email wirzj@ohsu.edu w w w bd2 k
     FREE OHSU BD2K ADVANCED DATA AFTER DARK WORKSHOP. Hands-on! Learn by Doing! Join us for a 4-evening workshop: · Data Wrangling with Python and Pandas · Interactive visualization with R/Shiny · Supervised Learning Algorithms + Kaggle Challenge. Familiarity with R and Git is required. Bring your laptop! May 23-26th 5-7pm. Register at http:/ / 1pFVvLv Department of Medical Informatics + Clinical Epidemiology + OHSU Library. Funding: NIH 5R25EB020379. For more information, e-mail
  8. Evaluation of Skills Courses. [Figure: two charts, "Evaluation Summary from Beginner Students" and "Evaluation Summary from Advanced Students". Each shows, on a 0% to 100% axis, the percent of ratings of 6 & 7, of 3, 4 & 5, and of 1 & 2 for three statements: "The instructors clearly presented the skills to be learned"; "The instructors presented content in an organized manner"; "The instructors effectively presented concepts and techniques".]
  9. OER Modules
     01 | Biomedical Big Data Science
     02 | Introduction to Big Data in Biology and Medicine
     03 | Ethical Issues in Use of Big Data
     04 | Clinical Standards Related to Big Data
     05 | Basic Research Data Standards
     06 | Public Health and Big Data
     07 | Team Science
     08 | Secondary Use (Reuse) of Clinical Data
     09 | Publication and Peer Review
     10 | Information Retrieval
     11 | Version Control and Identifiers
     12 | Data annotation and curation
     13 | Data Tools and Landscape
     14 | Ontologies 101
     15 | Data metadata and provenance
     16 | Semantic data interoperability
     17 | Choice of Algorithms and Algorithm Dynamics
     18 | Visualization and Interpretation
     19 | Replication, Validation and the spectrum of Reproducibility
     20 | Regulatory Issues in Big Data for Genomics and Health
     21 | Hosting data dissemination and data stewardship workshops
     22 | Guidelines for Reporting, Publications, and Data Sharing
     23 | Terminology of Biomedical, Clinical, and Translational Research
     24 | Computing Concepts for Big Data
     25 | Data modeling
     26 | Semantic Web data
     27 | Context-based selection of data
     28 | Translating the Question
     29 | Implications of Provenance and Pre-processing
     30 | Data tells a story
     31 | Statistical Significance, P-hacking and Multiple-testing
     32 | Displaying Confidence and Uncertainty
  10. What is available in the modules? Module overview, online viewing, PowerPoint files, audio files, exercises, references, resources.
  11. MLA Professional Competencies For Health Sciences Librarians
      Competency #1: Understand the health sciences and health care environment and the policies, issues, and trends that impact that environment. BDK02 - Introduction To Big Data In Biology And Medicine; BDK03 - Ethical Issues In Use Of Big Data; BDK07 - Team Science
      Competency #3: Understand the principles and practices related to providing information services to meet users' needs. BDK10 - Information Retrieval; BDK22 - Guidelines For Reporting, Publications, And Data Sharing
      Competency #4: Have the ability to manage health information resources in a broad range of formats. BDK09 - Publication And Peer Review; BDK12 - Data Annotation And Curation; BDK14 - Ontologies 101; BDK15 - Data Metadata And Provenance
      Competency #5: Understand and use technology and systems to manage all forms of information. BDK10 - Information Retrieval; BDK12 - Data Annotation And Curation; BDK13 - Data and tools landscape; BDK14 - Ontologies 101; BDK26 - Introduction to Semantic Web data
      Competency #6: Understand curricular design and instruction and have the ability to teach ways to access, organize, and use information. BDK21 - Hosting Data Dissemination And Data Stewardship Workshops
      Competency #7: Understand scientific research methods and have the ability to critically examine and filter research literature from many related disciplines. BDK07 - Team Science; BDK18 - Visualization And Interpretation; BDK19 - Replication, Validation And The Spectrum Of Reproducibility
      BDK01 - Biomedical Big Data Science; BDK04 - Clinical Data And Standards Related To Big Data; BDK05 - Basic Research Data Standards
  12. Challenges. Scope: how to scope generic curricula for different levels of users. Style: how to translate diverse teaching styles into general materials. Dissemination: how to maximize dissemination while protecting intellectual property. Images: how to incorporate images and other copyrighted materials into open resources.
  13. Who are these resources for? EVERYONE! Undergraduate students, graduate students, clinicians, post-docs, librarians, staff, faculty.
  14. Help review our modules:
  15. Acknowledgements. Bill Hersh, PI; Melissa Haendel, PI; Shannon McWeeney, PI; David Dorr, PI; Ted Laderas, Instructor; Jackie Wirz, Instructor; Nicole Vasilevsky, Instructor; Bjorn Pederson, Instructional Designer. This work is supported by NIH Grants 1R25EB020379-01 and 1R25GM114820-01.
  16. You can find me at: @n_vasilevsky Thanks!

Editor's Notes

  1. I won’t read all the content on this slide; the point is just that we mapped the MLA professional competencies to the BD2K modules. For 6 of the 7 MLA professional competencies, there are BD2K modules that could help train librarians in these areas.