The emerging biodiversity data ecosystem
Cynthia Parr, Katja Schulz, Jennifer Hammock (Smithsonian Institution)
Nathan Wilson, Patrick Leary (Marine Biological Laboratory)
Richard Allen (Environmental Protection Agency)
Today’s story: What is EOL; Core questions; Network analysis; Hot list development; Page richness algorithm. Conclusion: improving the health and richness of our knowledge network advances understanding.
What is EOL (http://www.eol.org)
All species
Freely accessible & reusable: open access, open source
Available from a single portal in a common format
Quality
Always growing
EOL is a content curation community (http://www.eol.org)
Content providers: databases, journals, LifeDesks, public contributions
Curating, aggregation, commenting, tagging
Core questions: Where is our knowledge about biodiversity? Where are the gaps? What are the most effective ways to fill gaps given our limited resources?
Network analysis (with Anne Bowser, University of Maryland): EOL connects hubs [network diagram; labeled nodes: EOL, GBIF, NCBI]
The GBIF hub has subnetworks
Key individuals seek out hubs [network diagram; labeled node: TOLWeb]
Implications and next steps: need more data; identify isolated projects and mechanisms for connecting them to the network; improve resilience and redundancy; distribute annotation and quality control; model data flow quantity and impact.
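One way to picture "hubs" and "isolated projects" is as properties of a graph of data-sharing links. The sketch below is a minimal illustration under stated assumptions, not the analysis behind the slides: the edge list is invented toy data (the project labels are only borrowed as names), hubs are taken to be the highest-degree nodes, and isolated projects are taken to be any connected component other than the largest one.

```python
from collections import defaultdict, deque

# Hypothetical toy links between projects; invented for illustration only
# and not a record of any real data-sharing relationship.
links = [
    ("EOL", "GBIF"), ("EOL", "NCBI"), ("EOL", "TOLWeb"),
    ("GBIF", "ProviderA"), ("GBIF", "ProviderB"),
    ("ProjectX", "ProjectY"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degree_ranking(graph):
    """Return nodes sorted by number of neighbours (highest first), ties by name."""
    return sorted(graph, key=lambda node: (-len(graph[node]), node))


def components(graph):
    """Return connected components as sets, largest first (breadth-first search)."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), set()
        while queue:
            node = queue.popleft()
            comp.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        result.append(comp)
    return sorted(result, key=len, reverse=True)


graph = build_graph(links)
hubs = degree_ranking(graph)[:2]   # the two best-connected nodes
comps = components(graph)
isolated = comps[1:]               # everything outside the main network
print("hubs:", hubs)
print("isolated groups:", isolated)
```

Degree is only the simplest notion of a hub; other centrality measures would be equally defensible choices for a real analysis.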
Viewer of Life on EOL – Kris Urie
Low % of descendants with text in arthropods
Within arthropods coverage varies ... perhaps as expected (http://synthesis.eol.org/media/treemap/)
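The "% of descendants with text" figure can be read as a simple ratio over a taxonomic tree. The sketch below is one possible reading, not the method behind the treemap: the taxonomy, the placeholder taxon names, and the `has_text` flags are all hypothetical, and coverage is assumed to count every descendant at any rank.

```python
# Hypothetical toy taxonomy (parent -> children); all names are placeholders.
children = {
    "Group": ["Class A", "Class B"],
    "Class A": ["Species A1", "Species A2", "Species A3"],
    "Class B": ["Species B1"],
}

# Hypothetical set of taxa whose pages carry at least one text object.
has_text = {"Class A", "Species A1"}


def descendants(taxon):
    """Return all descendants of a taxon (any depth), excluding the taxon itself."""
    found = []
    stack = list(children.get(taxon, []))
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(children.get(current, []))
    return found


def text_coverage(taxon):
    """Fraction of a taxon's descendants that have text; 0.0 if it has none."""
    desc = descendants(taxon)
    if not desc:
        return 0.0
    return sum(1 for t in desc if t in has_text) / len(desc)


for name in ("Group", "Class A", "Class B"):
    print(f"{name}: {text_coverage(name):.0%} of descendants have text")
```

Computing the ratio per subtree, as in the loop above, is what lets coverage be compared between groups within a larger clade.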
Developing the EOL hot list: consultation with taxonomic experts; development of criteria; assembly of critical lists; establishing targets for rich taxon pages, lesser known pages.
EOL’s hot lists
Hot List (70,000 taxa): conservation concern; invasives; model organisms; ecologically important; pests; charismatics; data availability
Red Hot List (2,800 taxa): most searched; top 100 invasives; crops (food); zoos & aquaria; high traffic; higher taxa
Taxon page richness algorithm
Richness = a (Breadth) + b (Depth) + c (Diversity), scored 0 – 1; threshold 0.4
Weights: 60%, 30%, 10%
Breadth: images, topics of text objects, references, maps, videos, sounds, conservation status
Depth: # words per text object, # words total
Diversity: sources (partners)
Summary of EOL page richness
Overall: 640,000 have content; 2% are rich; 25% have only links to literature
Hot List: 28% of 75K are rich; average richness = 0.30
Red Hot List: 56% of 3K are rich; average richness = 0.43
Strategies for improving richness: crowd-sourcing; leveraging collections; communities; mobile apps; enabling platforms; enabling journals; data mining (BHL etc.). Version 2 coming in Fall 2011!

zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 

The emerging biodiversity data ecosystem

  • 1. The emerging biodiversity data ecosystem Cynthia Parr, Katja Schulz, Jennifer Hammock Smithsonian Institution Nathan Wilson, Patrick Leary Marine Biological Laboratory Richard Allen Environmental Protection Agency
  • 2. Today’s story What is EOL Core questions Network analysis Hotlist development Page richness algorithm Conclusion: improving the health and richness of our knowledge network advances understanding
  • 3. What is EOL http://www.eol.org All species Freely accessible & reusable: open access, open source Available from a single portal in a common format Quality Always growing
  • 9. EOL is a content curation community Content providers Databases Journals LifeDesks Public contributions Curating Aggregation Commenting Tagging http://www.eol.org
  • 10. Core questions Where is our knowledge about biodiversity? Where are the gaps? What are the most effective ways to fill gaps given our limited resources?
  • 11. Network analysis with Anne Bowser, University of Maryland EOL GBIF NCBI EOL connects hubs
  • 12. The GBIF hub has subnetworks
  • 13. Key individuals seek out hubs TOLWeb
  • 14. Implications and next steps Need more data Identify isolated projects & mechanisms for connecting them to the network Improve resilience & redundancy Distribute annotation & quality control Model data flow quantity and impact
  • 15. Viewer of Life on EOL – Kris Urie
  • 16. Low % of descendants with text in Arthropods
  • 17. Within arthropods coverage varies . . . Perhaps as expected http://synthesis.eol.org/media/treemap/
  • 18. Developing the EOL hot list Consultation with taxonomic experts Development of criteria Assembly of critical lists Establishing targets for rich taxon pages, lesser known pages
  • 19. EOL’s hot lists Hot List Red Hot List 70,000 taxa Conservation concern Invasives Model organisms Ecologically important Pests Charismatics Data availability 2,800 taxa Most searched Top 100 invasives Crops (food) Zoos & aquaria High traffic Higher taxa
  • 20. Taxon page richness algorithm 60% 30% 10% Breadth: Images, topics of text objects, references, maps, videos, sounds, conservation status Depth: # words per text object, # words total Diversity: Sources (partners) + + a (Breadth) b (Depth) c (Diversity) 0 – 1, Threshold 0.4
  • 21. Summary of EOL page richness Overall Hot List 640,000 have content 2 % are rich 25 % have only links to literature 28 % of 75K are rich Average richness = 0.30 Red Hot List 56 % of 3K are rich Average richness = 0.43
  • 22. Strategies for improving richness Crowd-sourcing Leveraging Collections Communities Mobile apps Enabling platforms Enabling journals Data mining BHL etc. Version 2 Coming in Fall 2011!
  • 23. The page richness index Helps fill gaps with existing knowledge Helps prioritize funding and training so that it has maximum impact on closing true gaps Will be available via API Computing and storing richness index on EOL is a step towards storing and serving computable data
  • 24. Dynamic data summaries = new knowledge Summarize data within a partner, then across partners. For example: compute an average value for one taxon (x specimens), compare to range of values across all taxa (621,393 samples) Atlantic Cod Gadus morhua Jen Hammock (EOL) Edward van den Berge (OBIS)
  • 25.
  • 27. Richness assessment Large-scale data summaries can foster gap-filling and standing, dynamic knowledge analyses
  • 28. Thank you http://www.eol.org 160+ content partners 2000 Flickr contributors 1000s Wikipedia contributors 43,000 EOL members Funding: John D. and Catherine T. MacArthur Foundation, Alfred P. Sloan Foundation, Cornerstone Institutions, Private Donors See Demo and Version 2 sneak peek in Software Bazaar Leadership: Erick Mata, Bob Corrigan, Mark Westneat, Marie Studer, Tom Garnett, Jim Edwards, David Patterson Developers: Peter Mangiafico, Jeremy Rice, Dimitri Mozzherin, David Shorthouse, Lisa Whalley and others Biologists: Tanya Dewey, Audrey Aronowsky, Leo Shapiro
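
The page richness score on slide 20 combines three components with weights of 60% (breadth), 30% (depth) and 10% (diversity), yields a value between 0 and 1, and treats 0.4 as the threshold for a rich page. The sketch below is only an illustration of that weighted sum, not EOL's implementation. The function names, the choice of inputs, and every cap value (`MAX_IMAGES`, `MAX_TOPICS`, `MAX_WORDS`, `MAX_SOURCES`) are assumptions made for the example; the 75% factor for unvetted material comes from the speaker notes.

```python
# Weights and threshold as stated on the slide.
W_BREADTH, W_DEPTH, W_DIVERSITY = 0.6, 0.3, 0.1
RICH_THRESHOLD = 0.4

# From the speaker notes: unvetted material counts for about 75% of vetted.
UNVETTED_FACTOR = 0.75

# Hypothetical caps: the notes say inputs have maximums but do not list them.
MAX_IMAGES = 25
MAX_TOPICS = 10
MAX_WORDS = 2000
MAX_SOURCES = 5


def capped(value, maximum):
    """Scale a count to the range 0..1, saturating at `maximum`."""
    return min(value, maximum) / maximum


def richness(vetted_images, unvetted_images, topics, total_words, sources):
    """Weighted sum of breadth, depth and diversity, each in 0..1."""
    effective_images = vetted_images + UNVETTED_FACTOR * unvetted_images
    # Breadth is simplified here to images and text topics only (assumption).
    breadth = (capped(effective_images, MAX_IMAGES) + capped(topics, MAX_TOPICS)) / 2
    depth = capped(total_words, MAX_WORDS)
    diversity = capped(sources, MAX_SOURCES)
    return W_BREADTH * breadth + W_DEPTH * depth + W_DIVERSITY * diversity


def is_rich(score):
    """A page counts as rich when its score reaches the threshold."""
    return score >= RICH_THRESHOLD
```

Because each input saturates at its cap, a page with 200 images scores the same as one with 25, which matches the behaviour described in the notes. The real index also draws on references, maps, videos, sounds and conservation status for breadth, and on words per text object for depth.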

Editor's Notes

  1. The conclusion is that there is value in treating all the biodiversity information systems as part of an interconnected ecosystem. We can study the connections, and we can assess the depth of information in the network. I’ll focus on EOL’s role in the system, but I hope to make observations that will be generally useful too.
  2. Objects such as these are essentially chunks of text sorted by topic. They span biology from physiology to ecology to evolution. Each of these credits the source, and can receive comments or ratings, or can be trusted or untrusted by curators.
  3. So, the approach of EOL is rather different from that of many other sites. EOL is a giant mashup that creates pages, which are then available for curators (mostly credentialed scientists) to assess and rate, or for anybody to provide comments or tags. 160+ partner databases; 700 curators, 1000s of contributors, 46,000 members; 2.8 million pages; 600 thousand pages with Creative Commons content; over 2 million data objects and >1 million pages with links to research literature. Traffic in past year: 1.7 million unique users, 6.2 million page views.
  4. Represents about 1600 projects, and 1700 instances of data flow or hyperlinks between them. Size of the vertex, or node, reflects degree, or how many links the node has. We used the Clauset-Newman-Moore algorithm to determine which vertices grouped together, then gave each group a color code. Those nodes with a degree of 15 or higher are labeled, and their edges are shown thicker than the others. These are the hubs of this network, and they are reasonably well connected to each other. (go through and expand the acronyms)
  5. Daphne Fautin’s Hexacorallians of the World
  6. With this as a baseline, how connected and resilient is the network? Over time we want it to become more connected and resilient, both to enable discovery and recovery in case of catastrophic problems. We can also use this to develop effective mechanisms to annotate data and improve data quality. If the same data appear on different parts of the network, and someone reports an error, the repair of that data needs to propagate effectively. What are the factors that influence data flow quantity and effectiveness…
  7. Brighter green has a higher % of descendants with text; size of square is number of descendants, square-root scaled
  8. Ecologically important – keystone species, indicator species
  9. Inspired by community ecology and measures of species diversity, which of course were originally inspired by information theory, but we haven’t used those measures. Instead we put together these factors so that we could assign weights to them based on how well they capture “a rich page.” We sampled dozens of pages and had team members assess them for their gestalt “richness” based on their own criteria. Then we compared those scores to those generated by the algorithm, and iteratively changed weights until we achieved a set of weights that appeared to reflect human perception of “richness.” Note that there’s a penalty: unvetted material is only worth about 75% of vetted material. Also there are maximums for many of these input values; having 200 images may not make a page much richer than having 25 images. We reserve the right to change this to ensure that the index is as useful as possible. Like Google PageRank, we want to ensure that nobody can game the system.
  10. Also note that there is an implication that a “rich page” is a “high quality page”, which is not necessarily true but often is. As EOL goes forward with our version 2 we’ll be gathering other inputs that can tell us if a page is successful: ratings of its objects, for example.
  11. Here’s what we are already doing for the OBIS specimens, which have rich environmental data associated with them. We could add similar values from other partners, for example from GenBank, where some sequenced samples are collected from known environments, or from ecological studies that aren’t part of the specimen-based system. You could subscribe to this value and get alerts if new values come in that are outside this range. You could set up a model for this taxon and its relatives, predicting expected values; then, if new values aggregated from any of EOL’s partners violate the model, the scientist who published the model gets a notification. It could be that there’s a flaw in the data integration, or some violation of assumptions about the measurement workflow. Or it could be that there’s something we truly didn’t understand before. This is truly leveraging the scientific output of many researchers: better use of resources, more rapid advances in understanding of biological systems.
  12. Analogous to the study of ecosystems, where we seek to build an understanding of entire systems with many kinds of inputs, both biotic and abiotic
  13. In addition to the authors…
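
The alerting idea in note 11 (summarize the values already aggregated for a taxon, then flag new values that fall outside the observed range) could be sketched roughly as follows. This is a minimal illustration under stated assumptions: the function names `summarize` and `out_of_range` are hypothetical, and nothing here reflects an actual EOL or OBIS API.

```python
from statistics import mean


def summarize(values):
    """Summarize the measurements already aggregated for one taxon."""
    return {
        "n": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


def out_of_range(new_values, baseline):
    """Return the new values that fall outside the baseline's observed range."""
    return [v for v in new_values if v < baseline["min"] or v > baseline["max"]]
```

A subscriber would be notified only when `out_of_range` returns a non-empty list. As the note says, such a value may point to a flaw in data integration or to something not previously understood, so the check only flags values for review and does not judge them.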