1. Teaching Data Science to Undergraduate Students
Nicole Vasilevsky
Oregon Health & Science University Library
From Evidence to Scholarship Conference
Reed College
March 16, 2018
3. Data science is an emerging interdisciplinary field
where researchers are trained to extract
knowledge from data, to make it more structured
and re-usable, and to garner new insights
5. Addressing the need for data science training
• Data and Donuts: 1-2 day workshops offered to OHSU summer interns
• OHSU Library Data Science Institute: 3 day workshop offered to researchers and librarians
• Open Educational Resources: Freely available, online training materials
6. Student demographics
• OHSU is a biomedical university
• Medicine
• Nursing
• Public Health
• Basic science research
• etc
• OERs were funded by NIH and focus
on biomedical approaches
8. NIH Big Data to Knowledge (BD2K) Program
One goal of the NIH BD2K initiative is to provide training for students and
researchers to address challenges in managing, analyzing, and interpreting big
data
A research team in the Library and the Department of Medical Informatics and
Clinical Epidemiology (DMICE) developed open educational resources
(OERs) and skills courses
9. Our Approach
OERs and Skills Courses connect the dots that help researchers understand how to apply
data science techniques in the context of their whole research life cycle
10. Modules aim to fill specific gaps in the research process
• Finding resources/data
• Introduction to Big Data
• Managing data and applying data standards
• Ethics and regulatory issues
• Methods for analysis, visualization, interpretation
• Collaboration and team science
• Sharing and Dissemination
https://github.com/OHSUBD2K/
12. Think like a data scientist - the Data and Donuts workshop
will provide an introduction to data science for those new
to research. Summer interns encouraged to attend!
Topics covered will include
• What is Big Data?
• Asking the right question and getting the right
answers from your data
• Finding data resources in the real world
• Data handling 101
• Ethics of data
• Communicating your science for maximal
impact
June 28 & 29 | 9 - 12 PM | Donuts!
Free Workshop!
DataAndDonuts
Interested?
Register at http://bit.ly/1sfDeXz
or email wirzj@ohsu.edu
www.ohsu.edu/bd2k
16. Structure of the institute/schedule
Day 1 Day 2 Day 3
• Introduction to Command
line/GitHub
• Data Exploration and
Statistics
• Data description
• Research Data Standards
• Data Sharing and Reuse
• Mixed methods: Quantitative
and Qualitative research
• Analyzing textual data
• Web scraping
• Mapping and Geospatial
Visualization with QGIS
• Data visualization
17. Lessons learned
1 Train the trainer
2 Targeted audience
3 Be adaptable
4 Coffee!
http://sites.nationalacademies.org/deps/bmsa/deps_180066
18. Conclusion and Future Work
Conclusion
• Have successfully offered short trainings to undergrads, OHSU
students, researchers, librarians, and others
• A lot of lessons learned along the way
• Big demand for data science training
• Training sessions should be hands-on and interactive
Next steps
• Data and Donuts again this summer for OHSU summer interns
• Plan a future OHSU Library Data Science Institute?
• BioData Club
• Data Jamborees
• Encourage usage of our Open Educational Resources
• We are open to new collaborations
19. GCC/BOSC 2018, Reed College, June 25-30, 2018
https://galaxyproject.org/events/gccbosc2018/
20. Resources
• OHSU Library Data Science Institute: https://ohsulibrary-datascienceinstitute.github.io/
• BioData Club: https://biodata-club.github.io/
• Data Jamboree: http://www.ohsu.edu/xd/education/schools/school-of-medicine/departments/computational-biology/events/data-jamboree.cfm
• Open Educational Resources: https://github.com/OHSUBD2K/
https://github.com/OHSUBD2K/Presentations
21. Acknowledgements
The OHSU Library Data Science Institute was supported by NNLM PNR under the National Library of Medicine
(NLM), National Institutes of Health (NIH) cooperative agreement number UG4LM012343. Data and Donuts and the
Open Educational Resources were supported by NIH Grants 1R25EB020379-01 and 1R25GM114820-01.
Bill Hersh
Melissa Haendel
David Dorr
Shannon McWeeney
Bjorn Pederson
Jackie Wirz
Robin Champieux
Letisha Wyatt
Laura Zeigan
Ted Laderas
22. You can find me at:
@n_vasilevsky
vasilevs@ohsu.edu
Thanks!
https://github.com/OHSUBD2K/Presentations
Editor's Notes
Image credits:
Big data: Josh, The Noun Project (https://thenounproject.com/jkdubb/)
Finding resources, By creative outlet, The Noun Project (https://thenounproject.com/creativeoutlet/)
Scale: Ralf Schmitzer, The Noun Project (https://thenounproject.com/ralfschmitzer/)
Information download: By Vectors Market
Analysis: By Yamini Ahluwalia, GB
Share: By Anand A Nair
Team: By Rockicon, The Noun Project (https://thenounproject.com/rockicon/)