What is FAIR Data and Who Needs It?
Philip E. Bourne PhD
peb6a@virginia.edu
https://www.slideshare.net/pebourne
November 2, 2023 COGGE @ NASM
My Perspective/Bias
• Data resource developer
• Open science/scholarship advocate
• An author of the FAIR Principles
• First Chief Data Officer for NIH
• Founding dean of a school of data science
• Biologist, not a geotech person
11/02/23 2
COGGE
What is Open Data?
Open data refers to data that is freely available to
the public without any restrictions on its use or
distribution. The idea behind open data is to
promote transparency, enable research, and foster
innovation by allowing individuals and
organizations to access and use datasets without
encountering legal, technical, or financial barriers.
ChatGPT
How is Open Data Licensed?
License Types
• Public Domain Dedication and License (PDDL)
• Open Data Commons Attribution License (ODC-BY)
• Open Data Commons Open Database License (ODbL)
• Creative Commons Licenses
• MIT License and BSD Licenses
• Government-specific Licenses
Conditions
• Public access
• Attribution
• Copyright, patent, IP rights
• Share-alike
• Commercial/non-commercial
Perceptions and Reality
From the Center for Open Science Survey
https://www.cos.io/initiatives/open-scholarship-survey
Why Now?
https://en.wikipedia.org/wiki/Jim_Gray_(computer_scientist)
https://www.microsoft.com/en-us/research/wp-content/uploads/2009/10/Fourth_Paradigm.pdf
https://twitter.com/aip_publishing/status/856825353645559808
Basic Premise …..
We are at a new tipping point
Data Science
As a Driver It's Just the Beginning…
https://zenodo.org/record/6497693
45 Members
Data scientist jobs are predicted to experience 36
percent growth between 2021 and 2031, according
to the US Bureau of Labor Statistics.
The global data science platform market size was
valued at USD 64.14 billion in 2021 and is projected
to grow from USD 81.47 billion in 2022 to USD
484.17 billion by 2029, exhibiting a CAGR of 29.0%
during the forecast period.
Data science is the fastest emerging field around the
globe.
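As a quick sanity check on the market figures above, the quoted growth rate can be recomputed from the two endpoint values. This is a minimal sketch, assuming the standard compound annual growth rate formula and a seven-year period (2022 to 2029); the function name `cagr` is illustrative and not from the talk.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.29 means 29%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted on the slide, in USD billions.
market_2022 = 81.47
market_2029 = 484.17
years = 2029 - 2022  # assumption: the forecast period runs 2022 to 2029

rate = cagr(market_2022, market_2029, years)
print(f"Implied CAGR: {rate:.1%}")
```

Under these assumptions the implied rate comes out at roughly 29% per year, consistent with the figure quoted on the slide.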
Data Science –
In 45+ Years in Academia I Have Never Seen Anything Like It
• It is a response to the digital transformation of
society
• It is touching every discipline (aka vertical)
• We can’t keep the students out of our classes
• Cause – large amounts of digital data
• Effect – interdisciplinarity, openness, translation,
search for responsibility and more
In summary, it is disruptive and soon (now?) the driver of what you do
How Disruptive Will it Be?
It Starts with Digitization
Example - Photography
(Figure axes: Time; Volume, Velocity, Variety)
• Digitization: Digital camera invented by Kodak but shelved
• Deception: Megapixels & quality improve slowly; Kodak slow to react
• Disruption: Film market collapses; Kodak goes bankrupt
• Demonetization: Phones replace cameras
• Dematerialization: Instagram, Flickr become the value proposition
• Democratization: Digital media becomes bona fide form of communication
How Disruptive – Witness AlphaGo
https://www.alphagomovie.com/
1. Even the programmers were
disquieted by creating
something better than any
human
2. AlphaGo made a move that no
human Go expert nor
programmer anticipated
3. It takes a lot of resources to
defeat the world champion
Go has more possible board positions than there are atoms in the universe
Proteins have ~20**300 combinations, also more than the
number of atoms in the universe
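The protein comparison can be checked with exact integer arithmetic. This is a minimal sketch, assuming an alphabet of 20 standard amino acids, a chain length of 300 residues (the exponent used on the slide), and the commonly cited rough estimate of 10**80 atoms in the observable universe; the variable names are illustrative.

```python
AMINO_ACIDS = 20             # standard amino acid alphabet
CHAIN_LENGTH = 300           # residues, matching the slide's exponent
ATOMS_IN_UNIVERSE = 10**80   # assumption: order-of-magnitude estimate

# Python integers have arbitrary precision, so this value is exact.
sequences = AMINO_ACIDS ** CHAIN_LENGTH

digits = len(str(sequences))
print(f"20**300 has {digits} decimal digits")
print("More sequences than atoms:", sequences > ATOMS_IN_UNIVERSE)
```

Even with a much more generous estimate for the number of atoms, the gap of several hundred orders of magnitude would remain.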
Science Games….
https://medium.com/proteinqure/welcome-into-the-fold-bbd3f3b19fdd
AlphaFold2 Makes Significant Leap
Logistics Behind the Win
● Nothing fundamentally new from an AI perspective
● FAIR Data
● Collaboration not competition
● Engineering challenge beyond most labs
● Compute power beyond most labs
● Team size beyond most labs
● Worked with protein structure specialists
Downstream Implications
• Cooperation rather than competition
• Public-private partnership
• Translational possibilities are endless
• Made possible by curated open data
• Appreciate engineering
OK if data are
important, they
need to be FAIR
A FAIR Poster Child
Researcher and Assistant Professor of
Medicine Dr. Thomas Hartka, also a
current online Master's in Data Science
student, is combining two disparate
data sets—electronic health records
and DMV crash data—to save lives
after motor vehicle crashes.
“I enrolled in the MSDS program to
expand my research on automotive
safety. I have already used
techniques from classes in my work.
I hope to expand my research to
real-time analytics to improve
emergency room care.”
— Dr. Thomas Hartka, UVA School
of Medicine
Conversation Cards
• Is the disruption as profound as I indicate? If so,
  • What is in it for you?
  • How much will it cost?
• If you sustain FAIR data
  • What is in it for you?
  • How much will it cost?


Editor's Notes

  1. I will introduce the concept of data science with a story that illustrates citizen engagement, merging of unexpected data, and societal benefit.