What Data Science Will Mean to You -
One Person’s View
Philip E. Bourne PhD
peb6a@virginia.edu
https://www.slideshare.net/pebourne
September 28, 2022 UNC
Punchline – in 45+ Years in Academia I Have
Never Seen Anything Like It
• It is a response to the digital transformation of
society
• It is touching every discipline (aka vertical)
• We can’t keep the students out of our classes
• Cause – large amounts of digital data
• Effect – interdisciplinarity, openness, translation,
search for responsibility and more
In summary, it is disruptive and higher ed. better pay attention
My Perspective aka Biases
• Practical Science – Long-standing computational biomedical researcher
• Open Access – Co-Founder and Founding Editor-in-Chief, PLOS
Computational Biology
• Open Knowledge – First President of FORCE11
• Data are Value – Involved in FAIR
• Translation – First Associate Vice Chancellor for Innovation and
Industrial Alliances
• Funders as Lever – First Associate Director for Data Science, NIH (preprints,
data sharing, BD2K, etc.)
• Change Higher Ed – Founding Dean, School of Data Science
In My World There was a Precedent 20-30
Years Ago Which Points to What is Coming
http://www.ornl.gov/hgmis
• High throughput DNA digital data changed how
we think about biomedicine
• Spawned a new field – bioinformatics /
computational biology/ systems biology /
biomedical data science
• Spawned a multi-billion dollar industry
Is Bioinformatics Dead? PLOS Biology 2021
Big data and data science are like the Internet…
If I asked you to define them you would all say
something different, yet you use them every day…
http://vadlo.com/cartoons.php?id=357
Given these precedents about data science we
should start with a definition/framework
In the context of a new school this gets everyone on
the same page and helps in starting to build a culture
One Definition of Data Science –
The 4+1 Model (aka domains)
• Value – assuring societal
benefit
• Design – communication
of the value of data
• Systems – the means to
communicate and
convey benefit
• Analytics – models and
methods
• Practice – where
everything happens
[From Raf Alvarado]
The Data Science Interplay
• Value + Design = Openness,
responsibility
• Value + Analytics = Human
centered AI, algorithmic bias
• Value + Systems =
sustainability, access,
environmental impact
• Design + Analytics = literate
programming, visualization
• Design + Systems =
dashboards, engineering
design
• Analytics + Systems = ML
engineering
[From Raf Alvarado]
Thinking of data as a science unto itself is novel and controversial
Okay, so we have a definition to ground what we
do; now we need a set of principles to act as the
guard rails:
• Excellence
• Inclusivity
• Openness/
Transparency
• Be FAIR
Inclusivity
Openness/FAIR
Data Science would not exist if it were not for open
data and methods. It would be wrong for us to take
and not give back
https://sparcopen.org/
https://datascience.virginia.edu/policies
Openness/FAIR
https://doi.org/10.1038/sdata.2016.18
https://www.heliosopen.org/
So we have a definition of data science and we
have a set of guiding principles, where does this
take us?
Stated another way, what do we want to be
recognized for in 10 years?
https://pebourne.wordpress.com/
But wait, there are more lessons to be learned….
https://medium.com/proteinqure/welcome-into-the-fold-bbd3f3b19fdd
Google DeepMind’s AlphaFold2 makes a gigantic leap in solving
protein structures
AlphaFold2
Numerical optimization – differentiable programming
Overall gradient descent trained to win CASP
Jumper et al., 2021. Nature, 596(7873),
pp. 583-589
Transformer models using attention
Geometry invariant to
translation/rotation
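What "geometry invariant to translation/rotation" means can be illustrated with a small sketch. This is not AlphaFold2's architecture; it only shows, on hypothetical toy coordinates, that a description of a structure built from pairwise distances does not change when the whole structure is rotated and shifted. The helper names (`pairwise_distances`, `rigid_transform`, and so on) are assumptions made for this illustration.

```python
import math

def pairwise_distances(points):
    """Return the matrix of Euclidean distances between all pairs of 3D points."""
    return [[math.dist(p, q) for q in points] for p in points]

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def translate(point, shift):
    """Shift a 3D point by the vector `shift`."""
    return tuple(a + b for a, b in zip(point, shift))

def rigid_transform(points, angle, shift):
    """Apply the same rotation and translation to every point."""
    return [translate(rotate_z(p, angle), shift) for p in points]

def max_abs_difference(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

# Hypothetical toy coordinates, not real protein data.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 1.0)]
moved = rigid_transform(coords, angle=0.7, shift=(10.0, -3.0, 2.5))

# The raw coordinates change, but the distance matrix does not
# (up to floating-point rounding).
diff = max_abs_difference(pairwise_distances(coords), pairwise_distances(moved))
print("coordinates changed:", coords != moved)
print("distances unchanged:", diff < 1e-9)
```

A model that works on such invariant quantities does not have to relearn the same structure in every possible orientation, which is one plausible reading of why the slide lists this property.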
Logistics Behind the Win
● Nothing fundamentally new from an AI perspective
● Data Integration
● Collaboration not competition
● Engineering challenge beyond most labs
● Compute power beyond most labs
● Team size beyond most labs
● Worked with protein structure specialists
Downstream Implications
• Cooperation rather than competition
• Public-private partnership
• Translational possibilities are endless
• Made possible by curated open data
• Appreciate engineering
What do these lessons tell us about how we think
about our data science school?
Databases
organize data
around a project.
Data warehouses
organize the data
for an organization
Data commons
organize the data
for a scientific
discipline or field
Data
Warehouse
Data Ecosystems
How we think about our
infrastructure is important
Challenges
Fixed level of funding
Opportunities
data commons
Data commons co-locate data
with cloud computing
infrastructure and commonly
used software services, tools &
apps for managing, analyzing and
sharing data to create an
interoperable resource for the
research community.*
*Robert L. Grossman, Allison Heath, Mark Murphy, Maria Patterson and Walt Wells, A Case for Data Commons: Toward Data Science as a Service, IEEE
Computing in Science & Engineering, 2016. Source of image: The CDIS, GDC, & OCC data commons infrastructure at a University of Chicago data center.
Bonazzi VR, Bourne PE (2017) Should biomedical research be like Airbnb? PLoS Biol 15(4): e2001818.
Systems
[Adapted from Bob Grossman]
But wait, the picture is more complicated….
Data Science versus Data Engineering – How
Much Emphasis Where?
A Data Integration Poster Child
Researcher and Assistant Professor of
Medicine Dr. Thomas Hartka, also a
current online Masters in Data Science
student, is combining two disparate
data sets—electronic health records
and DMV crash data—to save lives
after motor vehicle crashes.
“I enrolled in the MSDS program to
expand my research on automotive
safety. I have already used
techniques from classes in my work.
I hope to expand my research to
real-time analytics to improve
emergency room care.”
— Dr. Thomas Hartka, UVA School
of Medicine
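Combining two disparate data sets, as described above, can be sketched as a join on a shared key. The following is a minimal illustration in Python, not Dr. Hartka's actual method; the field names (`person_key`, `speed_mph`, `injury_severity`), the toy records, and the helper `link_records` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records standing in for crash data and health records.
crash_records = [
    {"crash_id": "C1", "person_key": "P-001", "speed_mph": 45},
    {"crash_id": "C2", "person_key": "P-002", "speed_mph": 30},
    {"crash_id": "C3", "person_key": "P-999", "speed_mph": 60},
]
health_records = [
    {"person_key": "P-001", "injury_severity": 3},
    {"person_key": "P-002", "injury_severity": 1},
]

def link_records(left, right, key):
    """Inner-join two lists of dicts on an exactly matching `key` field."""
    # Index the right-hand records by key so each lookup is a dictionary access.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    linked = []
    for row in left:
        # Rows without a counterpart (here P-999) are dropped.
        for match in index.get(row[key], []):
            linked.append({**row, **match})
    return linked

linked = link_records(crash_records, health_records, "person_key")
for row in linked:
    print(row["crash_id"], row["speed_mph"], row["injury_severity"])
```

In practice, disparate data sets often lack a clean shared identifier, so the exact-match join here is only a starting assumption.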
Coming back to the question…
So we have a definition of data science and we
have a set of guiding principles, where does this
take us?
Stated another way, what do we want to be
recognized for in 10 years?
https://pebourne.wordpress.com/
Another way of thinking is alignment with the
university’s strengths….
Research ethics
committees (RECs) review
the ethical acceptability
of research involving
human participants.
Historically, the principal
emphases of RECs have
been to protect
participants from physical
harms and to provide
assurance as to
participants’ interests and
welfare.*
[The Framework] is
guided by Article 27
of the 1948 Universal
Declaration of Human
Rights. Article 27
guarantees the rights
of every individual in
the world "to share in
scientific
advancement and its
benefits" (including to
freely engage in
responsible scientific
inquiry)…*
Protect human
subject data
The right of human
subjects to benefit
from research.
*GA4GH Framework for Responsible Sharing of Genomic and Health-Related Data, see goo.gl/CTavQR
Data sharing with protections provides the evidence
so patients can benefit from advances in research.
Balance protecting human subject data
with open research that benefits
patients
[Adapted from Bob Grossman]
Value
Why Responsible Data Science?
• A defining feature
• A partnership between STEM, social
sciences and the humanities
• Where UVA has strength
https://en.wikipedia.org/wiki/Jim_Gray_(computer_scientist)
https://www.microsoft.com/en-us/research/wp-
content/uploads/2009/10/Fourth_Paradigm.pdf
https://twitter.com/aip_publishing/status/856825353645559808
Yet another way of thinking
about it – the fifth paradigm…
[Figure: integration across species (human, mouse, zebrafish) and across scales. Labels: Model Transportability, Horizontal Integration, Multi-scale Integration]
• DNA – CNV, SNP, methylation
• Gene/Protein – 3D structure, gene expression, proteomics, metabolomics
• Network – metabolic, signaling transduction, gene regulation
• Cell – hepatic, myoepithelial, erythrocyte
• Tissue – epithelial, muscle, nervous
• Organ – liver, kidney, pancreas, heart
• Body – physiologically based pharmacokinetics
• Population – GWAS, population dynamics, microbiota
From Harnessing Big Data for Systems Pharmacology 2017
https://doi.org/10.1146/annurev-pharmtox-010716-104659
Current roadblocks are more cultural than technical
The Fifth Paradigm: Integration Across Scales?
Gohlke et al. 2022
https://onlinelibrary.wiley.com/doi/10.1002/ctm2.726
Real World Evidence for Preventive Effects of Statins on
Cancer Incidence: A Transatlantic Analysis
EHR
Animal Models
Pathways
Daily Challenges
• Deciding what not to do
• Competition for the best team members (faculty and staff)
• Establishing a diverse team
• Lack of a comprehensive enterprise-wide data infrastructure
• It’s easier to conform
During my 5-year interview as dean I was asked,
“Will we need a school of data science in 10 years?
Won’t it be ubiquitous throughout the university?”
My response,
“Will we need a university in ten years? Won’t it be
one big school of data science?”
https://pebourne.wordpress.com/2022/06/29/deans-blog-
data-science-ten-years-from-now/
Questions I Leave You With ….
• Have I overstated the case for data science?
• Are we currently doing the best by our students?
• Are the models we propose the right ones?
• Where do we go from here?
Questions?
Growing the School
M.S. IN DATA SCIENCE
Residential & Online
2020
2020-
2023
UNDERGRADUATE
MINOR
2022
PH.D. PROGRAM
2023
UNDERGRADUATE
MAJOR
Building occupied
Team Size (FTEs)
5
40
60
80
120
Research
$5M
$10M
$20M
$30M
SDS Current Research Portfolio
12
7
4
3
2
3
3
Research Areas
Healthcare/Life Sciences
Technology/Software
Defense/Cybersecurity
Finance/Fintech
Energy/Environment
Education & Digital
Humanities
SDS strives to be a connector – a place where interdisciplinary
research driven by common data, methods and expertise
comes together

  • 4. In My World There was a Precedent 20-30 Years Ago Which Points to What is Coming http://www.ornl.gov/hgmis • High throughput DNA digital data changed how we think about biomedicine • Spawned a new field – bioinformatics / computational biology/ systems biology / biomedical data science • Spawned a multi-billion dollar industry Is Bioinformatics Dead? PLOS Biology 2021
  • 5. Big data and data science are like the Internet… If I asked you to define them you would all say something different, yet you use them every day… http://vadlo.com/cartoons.php?id=357
  • 6. Given these precedents about data science we should start with a definition/framework In the context of a new school this gets everyone on the same page and helps in starting to build a culture
  • 7. One Definition of Data Science – The 4+1 Model (aka domains) • Value – assuring societal benefit • Design - Communication of the value of data • Systems – the means to communicate and convey benefit • Analytics – models and methods • Practice – where everything happens [From Raf Alvarado]
  • 8. The Data Science Interplay • Value + Design = Openness, responsibility • Value + Analytics = Human centered AI, algorithmic bias • Value + Systems = sustainability, access, environmental impact • Design + Analytics = literate programming, visualization • Design + Systems = dashboards, engineering design • Analytics + Systems = ML engineering [From Raf Alvarado] Thinking of data as a science unto itself is novel and controversial
  • 9. Okay, so we have a definition to ground what we do; now we need a set of principles to act as the guard rails: • Excellence • Inclusivity • Openness/Transparency • Be FAIR
  • 11. Openness/FAIR Data Science would not exist if it were not for open data and methods. It would be wrong for us to take and not give back https://sparcopen.org/ https://datascience.virginia.edu/policies
  • 13. So we have a definition of data science and we have a set of guiding principles, where does this take us? Stated another way, what do we want to be recognized for in 10 years? https://pebourne.wordpress.com/
  • 14. But wait there are more lessons to be learned…. https://medium.com/proteinqure/welcome-into-the-fold-bbd3f3b19fdd
  • 16. Google’s DeepMind’s AlphaFold2 makes gigantic leap in solving protein structures
  • 17. AlphaFold2 • Numerical optimization – differentiable programming • Overall gradient descent trained to win CASP • Transformer models using attention • Geometry invariant to translation/rotation Jumper et al., 2021. Nature, 596 (7873), pp.583-589
  • 18. Logistics Behind the Win ● Nothing fundamentally new from an AI perspective ● Data Integration ● Collaboration not competition ● Engineering challenge beyond most labs ● Compute power beyond most labs ● Team size beyond most labs ● Worked with protein structure specialists
  • 19. Downstream Implications • Cooperation rather than competition • Public-private partnership • Translational possibilities are endless • Made possible by curated open data • Appreciate engineering
  • 20. What do these lessons tell us about how we think about our data science school?
  • 21. Databases organize data around a project. Data warehouses organize the data for an organization. Data commons organize the data for a scientific discipline or field. [Figure labels: Data Warehouse, Data Ecosystems] How we think about our infrastructure is important
  • 22. Challenges: fixed level of funding. Opportunities: data commons. Data commons co-locate data with cloud computing infrastructure and commonly used software services, tools & apps for managing, analyzing and sharing data to create an interoperable resource for the research community.* *Robert L. Grossman, Allison Heath, Mark Murphy, Maria Patterson and Walt Wells, A Case for Data Commons: Towards Data Science as a Service, IEEE Computing in Science and Engineering, 2016. Source of image: The CDIS, GDC, & OCC data commons infrastructure at a University of Chicago data center. Bonazzi VR, Bourne PE (2017) Should biomedical research be like Airbnb? PLoS Biol 15(4): e2001818. Systems [Adapted from Bob Grossman]
  • 23. But wait the picture is more complicated….
  • 24. Data Science versus Data Engineering – How Much Emphasis Where?
  • 25. A Data Integration Poster Child Researcher and Assistant Professor of Medicine Dr. Thomas Hartka, also a current online Masters in Data Science student, is combining two disparate data sets—electronic health records and DMV crash data—to save lives after motor vehicle crashes. “I enrolled in the MSDS program to expand my research on automotive safety. I have already used techniques from classes in my work. I hope to expand my research to real-time analytics to improve emergency room care.” — Dr. Thomas Hartka, UVA School of Medicine
  • 27. Coming back to the question… So we have a definition of data science and we have a set of guiding principles, where does this take us? Stated another way, what do we want to be recognized for in 10 years? https://pebourne.wordpress.com/
  • 28. Another way of thinking is alignment with the university’s strengths….
  • 29. Research ethics committees (RECs) review the ethical acceptability of research involving human participants. Historically, the principal emphases of RECs have been to protect participants from physical harms and to provide assurance as to participants’ interests and welfare.* [The Framework] is guided by Article 27 of the 1948 Universal Declaration of Human Rights. Article 27 guarantees the rights of every individual in the world "to share in scientific advancement and its benefits" (including to freely engage in responsible scientific inquiry)…* Protect human subject data. The right of human subjects to benefit from research. *GA4GH Framework for Responsible Sharing of Genomic and Health-Related Data, see goo.gl/CTavQR Data sharing with protections provides the evidence so patients can benefit from advances in research. Balance protecting human subject data with open research that benefits patients [Adapted from Bob Grossman] Value
  • 30. Why Responsible Data Science? • A defining feature • A partnership between STEM, social sciences and the humanities • Where UVA has strength
  • 32. The Fifth Paradigm: Integration Across Scales? Current roadblocks are more cultural than technical. [Figure: model transportability; horizontal integration across species (human, mouse, zebrafish); multi-scale integration from DNA, gene/protein, network, cell, tissue and organ to body and population, with example data types and models at each scale.] From Harnessing Big Data for Systems Pharmacology 2017 https://doi.org/10.1146/annurev-pharmtox-010716-104659
  • 33. Gohlke et al. 2022 https://onlinelibrary.wiley.com/doi/10.1002/ctm2.726 Real World Evidence for Preventive Effects of Statins on Cancer Incidence: A Transatlantic Analysis EHR Animal Models Pathways
  • 34. Daily Challenges • Deciding what not to do • Competition for the best team members (faculty and staff) • Establishing a diverse team • Lack of a comprehensive enterprise-wide data infrastructure • It’s easier to conform
  • 35. During my 5-year interview as dean I was asked, “Will we need a school of data science in 10 years? Won’t it be ubiquitous throughout the university?” My response, “Will we need a university in ten years? Won’t it be one big school of data science?” https://pebourne.wordpress.com/2022/06/29/deans-blog-data-science-ten-years-from-now/
  • 36. Questions I Leave You With …. • Have I overstated the case for data science? • Are we currently doing the best by our students? • Are the models we propose the right ones? • Where do we go from here?
  • 38. Growing the School M.S. IN DATA SCIENCE Residential & Online 2020 2020-2023 UNDERGRADUATE MINOR 2022 PH.D. PROGRAM 2023 UNDERGRADUATE MAJOR Building occupied Team Size (FTEs) 5 40 60 80 120 Research $5M $10M $20M $30M
  • 39. SDS Current Research Portfolio 12 7 4 3 2 3 3 Research Areas Healthcare/Life Sciences Technology/Software Defense/Cybersecurity Finance/Fintech Energy/Environment Education & Digital Humanities SDS strives to be a connector – a place where interdisciplinary research driven by common data, methods and expertise comes together
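
Slides 18 and 25 stress data integration: combining disparate data sets such as electronic health records and DMV crash data. The following is only a minimal sketch of the general idea, not the method used in that research. It assumes both sources already share a hypothetical identifier (`incident_id`), which real record linkage often cannot take for granted, and every record and field name below is synthetic and invented for illustration.

```python
from collections import defaultdict

# Synthetic, hypothetical records; all field names are assumptions
# made for this illustration and do not come from any real data set.
ehr_records = [
    {"incident_id": "A1", "injury_severity": 3},
    {"incident_id": "B2", "injury_severity": 1},
]
crash_records = [
    {"incident_id": "A1", "impact_speed_kph": 80, "seat_belt": False},
    {"incident_id": "C3", "impact_speed_kph": 30, "seat_belt": True},
]


def inner_join(left, right, key):
    """Merge every pair of records from left and right that share key."""
    # Index the right-hand records by key so each lookup is O(1).
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    # Keep only left-hand rows that have at least one match on the right.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


linked = inner_join(ehr_records, crash_records, "incident_id")
print(linked)
```

Only the incident present in both sources survives the join. Records without a counterpart ("B2" and "C3" here) are dropped, which is one reason integration across independently collected data sets is harder than it first appears.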

Editor's Notes

  1. I will introduce the concept of data science with a story that illustrates citizen engagement, the merging of unexpected data, and societal benefit