eol.org
@eol
@cydparr
How the Encyclopedia of Life is
wrangling organismal attribute data
How EOL works
EOL
Crowds
Harvest
Third party applications
EOL Today
Key Milestones in 2013
1.1 million species pages
240+ content providers
3.3 million unique annual
visitors from 235
countries
[Bar chart: number of text objects (x-axis, 0 to 800,000) by subject of text object (y-axis). Subjects listed: Distribution, MolecularBiology, Multiple topics, TypeInformation, Habitat, ConservationStatus, Threats, Morphology, Conservation, Management, Trends, Size, Associations, Uses, TrophicStrategy, Cyclicity & Life Cycle, PopulationBiology, Reproduction, Migration, Taxonomy, LifeExpectancy, Identification, Behaviour, Ecology, Diseases]
Text mining, crowdsourcing, standardizing
see http://eol.org/info/fellows
Co-occurrence, term extraction & linked data: Thessen & Devries
EnvO habitat terms: Pafilis et al.
Altitude Specificity of Flower Coloration: Wright
Morphological impacts of extinction risk in fish: Chang
Butterfly-hostplant associations: Ferrer-Paris et al.
Species Interactions: Poelen & Mungall et al.
14 datasets containing 25k
taxa, 422k
interactions, for 3k
locations
alpha version of
ingestion, normalization,
aggregation
alpha version of web API
alpha version of data
exports
Dr. Katy Börner led
Information Visualization
MOOC
GloBI http://globalbioticinteractions.wordpress.com/
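The ingestion, normalization, and aggregation steps named on this slide can be illustrated with a minimal sketch. It is not GloBI's code: the record layout, the `TYPE_SYNONYMS` vocabulary, and the helper names `normalize` and `aggregate` are all hypothetical, and the three records are illustrative only. The idea is that differently formatted datasets are mapped to one shared form before identical interactions are counted.

```python
from collections import Counter

# Hypothetical raw records, as if ingested from two differently formatted datasets.
raw_records = [
    {"source": "Ariopsis felis ", "type": "ate", "target": "Callinectes sapidus"},
    {"source": "ariopsis felis", "type": "preys on", "target": "Callinectes sapidus"},
    {"source": "Danaus plexippus", "type": "host plant", "target": "Asclepias syriaca"},
]

# Hypothetical mapping from dataset-specific wording to one shared vocabulary.
TYPE_SYNONYMS = {"ate": "eats", "preys on": "eats", "host plant": "has host"}


def clean_name(name):
    """Trim whitespace and use 'Genus species' capitalization."""
    parts = name.strip().split()
    return " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])


def normalize(record):
    """Turn one raw record into a (source, interaction, target) triple."""
    return (
        clean_name(record["source"]),
        TYPE_SYNONYMS[record["type"]],
        clean_name(record["target"]),
    )


def aggregate(records):
    """Count how often each normalized interaction was reported."""
    return Counter(normalize(r) for r in records)


counts = aggregate(raw_records)
for (source, interaction, target), n in counts.items():
    print(f"{source} {interaction} {target}: reported {n} time(s)")
```

After normalization the first two records collapse into a single interaction reported twice, which is the kind of consolidation an aggregator needs before exposing data through an API or export.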
EOL TraitBank
Funded: Marine focus
Virtuoso triple store, re-using URIs where possible
5 datasets 128,050 data points for 20,896 taxa
Harvest and display on data tab
Downloads, fancy searching
Machine access
Uploads & harvests will be by spreadsheet
and Darwin Core Archive
Support for annotation and curation
Please contact me to be part of the private beta
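As a rough illustration of spreadsheet-based trait uploads, the sketch below reads a small table and groups the data points by taxon. The column names are real Darwin Core terms (`scientificName`, `measurementType`, `measurementValue`, `measurementUnit`), but the flat layout, the `load_traits` helper, and the sample values are assumptions for illustration; the actual TraitBank template is not shown in these slides.

```python
import csv
import io
from collections import defaultdict

# Hypothetical spreadsheet content; the values are illustrative only.
SHEET = """scientificName,measurementType,measurementValue,measurementUnit
Ursus maritimus,body mass,450,kg
Ursus maritimus,habitat,marine,
Danaus plexippus,wingspan,10,cm
"""


def load_traits(text):
    """Group (type, value, unit) data points by scientific name."""
    traits = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        unit = row["measurementUnit"].strip() or None  # categorical values have no unit
        traits[row["scientificName"].strip()].append(
            (row["measurementType"].strip(), row["measurementValue"].strip(), unit)
        )
    return dict(traits)


traits_by_taxon = load_traits(SHEET)
for taxon, points in traits_by_taxon.items():
    print(taxon, points)
```

Keeping the value and the unit in separate columns lets numeric and categorical traits share one table.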
Easy access to analyzable trait data
“Are blue organisms more common in high altitudes?”
“Does the evolution of mammalian bacula appear to be
related to the pattern of promiscuous mating?”
“What organisms should I collect to fill in gaps in genome
quality tissue collections?”
• Look for trait, download for all taxa
• Create a collection of taxa, download all data
• Use Reol: an R interface to EOL (Banbury, O’Meara)
http://reolblog.wordpress.com/
• Find more specialized data repositories
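Once two traits have been downloaded for a set of taxa, a question such as the altitude one above reduces to a simple comparison. The sketch below is only an illustration: the records, the field names, the 1,500 m cutoff, and the `share_with_color` helper are all hypothetical, and a real analysis would need far more data and a proper statistical test.

```python
# Hypothetical rows, as might result from downloading two traits for a collection of taxa.
records = [
    {"taxon": "Taxon A", "color": "blue", "altitude_m": 2600},
    {"taxon": "Taxon B", "color": "blue", "altitude_m": 1900},
    {"taxon": "Taxon C", "color": "white", "altitude_m": 2100},
    {"taxon": "Taxon D", "color": "yellow", "altitude_m": 300},
    {"taxon": "Taxon E", "color": "blue", "altitude_m": 400},
    {"taxon": "Taxon F", "color": "red", "altitude_m": 900},
    {"taxon": "Taxon G", "color": "white", "altitude_m": 1200},
]


def share_with_color(rows, color, min_alt, max_alt):
    """Fraction of taxa in [min_alt, max_alt) that have the given color."""
    band = [r for r in rows if min_alt <= r["altitude_m"] < max_alt]
    if not band:
        return None  # no taxa in this altitude band
    return sum(r["color"] == color for r in band) / len(band)


# The 1,500 m boundary between "low" and "high" is an arbitrary assumption.
low = share_with_color(records, "blue", 0, 1500)
high = share_with_color(records, "blue", 1500, 10000)
print(f"blue share below 1500 m: {low:.2f}, at or above 1500 m: {high:.2f}")
```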
But also . . .
Thanks
Funding & other contributions
Sloan Foundation
Smithsonian Institution
David Rubenstein
Marine Biological Laboratory
Harvard University
Our content partners
Thousands of individual
contributors, and hundreds of
volunteer curators
Image credits
Jenny from Taipei
Cynthia Parr
Chief Scientist @eol
@cydparr parrc@si.edu
Alexandria Archive: Sarah Kansa, Eric Kansa, 34 other zooarchaeologists
GloBI: Jorrit Poelen (lead/software), Chris Mungall
(ontologies), James Simons (biologist) and Robert
Reiz (software). Datasets shared by: Peter D.
Roopnarine, Rachel Hertog, Carlos García-
Robledo, James Simons, Jenny L. Wrast, C.
Barnes, International Council for the Exploration of
the Sea (ICES), Jose R. Ferrer Paris, Senol
Akin, Malcolm Storey (BioInfo.org.uk), Ivy E.
Baremore, Joel Sachs (SPIRE), Colt W. Cook, David A.
Blewett
Quick math
In Phenoscape
57 publications had 565,158 anatomical trait
descriptions for 2,527 kinds of organisms
≈ 224 traits/organism
In ZFIN
38,189 trait descriptions for 4,727 zebrafish genes
1.9 million species on the planet
= LOTS OF TRAITS
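The arithmetic behind this slide can be checked with a few lines. The per-organism and per-gene ratios come directly from the figures above; the final extrapolation is an assumption added here for illustration (it supposes every species were described as densely as the Phenoscape organisms), not a number from the talk.

```python
# Figures taken from the slide.
phenoscape_descriptions = 565_158
phenoscape_organisms = 2_527
zfin_descriptions = 38_189
zfin_genes = 4_727
species_on_planet = 1_900_000

# Average number of trait descriptions per organism and per gene.
per_organism = phenoscape_descriptions / phenoscape_organisms
per_gene = zfin_descriptions / zfin_genes

# Assumption for illustration only: every species described at Phenoscape density.
extrapolated = per_organism * species_on_planet

print(f"Phenoscape: {per_organism:.1f} trait descriptions per organism")
print(f"ZFIN: {per_gene:.1f} trait descriptions per gene")
print(f"Extrapolated to all species: {extrapolated:.2e} trait descriptions")
```

Even as a rough upper-bound style estimate, the result lands in the hundreds of millions, which is the point of "LOTS OF TRAITS".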
Anatolia Zooarchaeology Case Study led by
Alexandria Archive Institute
1. 14 different sites
2. 34+ zooarchaeologists
3. Decoding, cleanup, metadata documentation
4. 220,000+ specimens
5. 450 entities linked to 143 EOL taxon concepts
6. Anatomical entities linked to Uberon.org
7. Biometrics linked to measurement ontology
8. Collaborative analysis
http://opencontext.org/

 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 

How the Encyclopedia of Life is wrangling organismal attribute data

  • 1. eol.org @eol @cydparr How the Encyclopedia of Life is wrangling organismal attribute data
  • 3. EOL Today Key Milestones in 2013 1.1 million species pages 240+ content providers 3.3 million unique annual visitors from 235 countries
  • 4. [Bar chart: number of text objects (axis 0 to 800,000) by subject of text object. Subjects: Distribution, MolecularBiology, Multiple topics, TypeInformation, Habitat, ConservationStatus, Threats, Morphology, Conservation, Management, Trends, Size, Associations, Uses, TrophicStrategy, Cyclicity & Life Cycle, PopulationBiology, Reproduction, Migration, Taxonomy, LifeExpectancy, Identification, Behaviour, Ecology, Diseases]
  • 5. Text mining, crowdsourcing, standardizing (see http://eol.org/info/fellows): Co-occurrence, term extraction & linked data (Thessen & Devries); EnvO habitat terms (Pafilis et al.); Altitude Specificity of Flower Coloration (Wright); Morphological impacts of extinction risk in fish (Chang); Butterfly-hostplant associations (Ferrer-Parris et al.); Species Interactions (Poelen & Mungall et al.)
  • 6. GLoBI (http://globalbioticinteractions.wordpress.com/): 14 datasets containing 25k taxa, 422k interactions, for 3k locations; alpha version of ingestion, normalization, aggregation; alpha version of web API; alpha version of data exports; Dr. Katy Börner led Information Visualization MOOC
  • 7. EOL TraitBank Funded: Marine focus Virtuoso triple store, re-using URIs where possible 5 datasets 128,050 data points for 20,896 taxa Harvest and display on data tab Downloads, fancy searching Machine access
  • 8.
  • 9.
  • 10. Uploads & harvests will be by spreadsheet and Darwin Core Archive Support for annotation and curation Please contact me to be part of the private beta
  • 11. Easy access to analyzable trait data “Are blue organisms more common in high altitudes?” “Does the evolution of mammalian bacula appear to be related to the pattern of promiscuous mating?” “What organisms should I collect to fill in gaps in genome quality tissue collections?” • Look for trait, download for all taxa • Create a collection of taxa, download all data • Use Reol: an R interface to EOL (Banbury, O’Meara) http://reolblog.wordpress.com/ • Find more specialized data repositories
  • 12. But also . . .
  • 13. Thanks Funding & other contributions Sloan Foundation Smithsonian Institution David Rubenstein Marine Biological Laboratory Harvard University Our content partners Thousands of individual contributors, and hundreds of volunteer curators Image credits Jenny from Taipei Cynthia Parr Chief Scientist @eol @cydparr parrc@si.edu Alexandria Archive: Sarah Kansa, Eric Kansa, 34 other zooarchaeologists GLoBI: Jorrit Poelen (lead/software), Chris Mungall (ontologies), James Simons (biologist) and Robert Reiz (software). Datasets shared by: Peter D. Roopnarine, Rachel Hertog, Carlos García-Robledo, James Simons, Jenny L. Wrast, C. Barnes, International Council for the Exploration of the Sea (ICES), Jose R. Ferrer Paris, Senol Akin, Malcolm Storey (BioInfo.org.uk), Ivy E. Baremore, Joel Sachs (SPIRE), Colt W. Cook, David A. Blewett
  • 14. Quick math In Phenoscape 57 publications had 565,158 anatomical trait descriptions for 2,527 kinds of organisms = 223 traits/organism In ZFIN 38,189 trait descriptions for 4,727 genes for Zebra Fish 1.9 million species on the planet = LOTS OF TRAITS
  • 15. Anatolia Zooarchaeology Case Study led by Alexandria Archive Institute 1. 14 different sites 2. 34+ zooarchaeologists 3. Decoding, cleanup, metadata documentation 4. 220,000+ specimens 5. 450 entities linked to 143 EOL taxon concepts 6. Anatomical entities linked to Uberon.org 7. Biometrics linked to measurement ontology 8. Collaborative analysis http://opencontext.org/
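
The "Quick math" on slide 14 can be rechecked with a few lines of Python. This is only a sketch: the three input figures are the ones quoted on the slide, while the function name and the extrapolation to all species are illustrative assumptions, not an EOL or Phenoscape calculation.

```python
# Figures as quoted on the "Quick math" slide.
PHENOSCAPE_DESCRIPTIONS = 565_158   # anatomical trait descriptions
PHENOSCAPE_TAXA = 2_527             # kinds of organisms covered
SPECIES_ON_PLANET = 1_900_000       # "1.9 million species on the planet"


def descriptions_per_taxon(descriptions: int, taxa: int) -> float:
    """Average number of trait descriptions per kind of organism."""
    return descriptions / taxa


rate = descriptions_per_taxon(PHENOSCAPE_DESCRIPTIONS, PHENOSCAPE_TAXA)

# Naive extrapolation. ASSUMPTION: every species would be described as
# densely as the fishes in Phenoscape, which is purely illustrative.
projected = rate * SPECIES_ON_PLANET

print(f"{rate:.1f} trait descriptions per kind of organism")
print(f"about {projected / 1e6:.0f} million descriptions if that rate held for all species")
```

The slide's "223 traits/organism" is this ratio with the fractional part dropped; either way the point stands that the total runs to hundreds of millions of trait records.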

Editor's Notes

  1. We have a working infrastructure as well as more than 200 partners. We harvest and sort text and multimedia by topic and by species and put it on our pages. Curation + user-added content from the crowds is added to the mix. This is fed back to providers, giving them traffic, quality control on their own content, and new content for them to use. And we are already seeing spinoff products. We make it easy for developers, and everything is either public domain or CC-licensed so it can be re-used.
  2. We now have over a million pages with content, some of it in other languages such as Arabic, Spanish, and Chinese. We are getting traffic mostly from the general public, from all over the world.
  3. Most of our 5.4 million content objects are text blobs and here are the subjects of that text. Most often, our text objects are about distribution. But there are many other subjects involved including essays that include multiple subjects.
  4. Except for the first, links for that one on request
  5. In the Information Visualization MOOC (Massive Open Online Course) led by Dr. Katy Börner of Indiana University, students Twy Bethard (United States), Andrew Miles (United Kingdom), Edward Kok (Netherlands) and Mattia Della Libera (Italy) used GLoBI data to create an insightful visualization of spatial marine food webs in the Gulf of Mexico.
  6. Starting with marine data. In the most simplistic view, we’ll be storing triples. This data will be organized on a data tab, sorting out the data into the 35 or so “topics” that we currently have text chapters for, and we will also allow powerful downloading and searching capability. Finally we’ll be setting up ways for other applications to grab the data and do interesting things with it. We already have a tool for making field guides. The approach here builds on our innovations for EOL and adds some proven technology called the “semantic web” to our domain. The next step takes this chain of innovation even further.
  7. Drawing data from the literature, from online databases, and from published datasets as in Dryad, summarizing collections databases
  8. Everyone wants to know the attributes of organisms. People exploring the world find something and want to be able to search on characteristics they can see. Teachers want their students to become adept at analyzing data, and how better than to work with real numerical information about the size of organisms or their behavior or what their sensitivity is to temperature and what might happen in the face of climate change? So while scientists were saying they needed us to provide data they could analyze, we heard the same thing from our educators, too.
  9. Phenoscape is a database that is looking at anatomical traits in fishes. Looking just at 57 publications they have more than 500K descriptions for 2500 kinds of organisms. ZFIN is a model organism database for zebrafish, a common model organism for developmental biologists. In just this one species they have captured nearly 40,000 traits – just for ONE very well-studied SPECIES
  10. .
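
The notes above say that, in the most simplistic view, TraitBank will be storing triples, and slide 11 suggests looking up a trait and downloading it for all taxa. A minimal sketch of that idea follows, using plain Python tuples rather than a real triple store. Everything in it is an assumption made for illustration: the `http://example.org/` namespace, the predicate names, the taxa and the values are hypothetical and are not EOL URIs, TraitBank's schema, or real data.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# HYPOTHETICAL namespace and terms, invented for this sketch only.
EX = "http://example.org/"
OF_TAXON = EX + "ofTaxon"
MEASUREMENT_TYPE = EX + "measurementType"
MEASUREMENT_VALUE = EX + "measurementValue"
BODY_LENGTH = EX + "term/bodyLength"
HABITAT = EX + "term/habitat"

# Each measurement is a node described by several triples (made-up values).
triples = [
    Triple(EX + "measurement/1", OF_TAXON, EX + "taxon/A"),
    Triple(EX + "measurement/1", MEASUREMENT_TYPE, BODY_LENGTH),
    Triple(EX + "measurement/1", MEASUREMENT_VALUE, "12.5"),
    Triple(EX + "measurement/2", OF_TAXON, EX + "taxon/B"),
    Triple(EX + "measurement/2", MEASUREMENT_TYPE, BODY_LENGTH),
    Triple(EX + "measurement/2", MEASUREMENT_VALUE, "3.0"),
    Triple(EX + "measurement/3", OF_TAXON, EX + "taxon/A"),
    Triple(EX + "measurement/3", MEASUREMENT_TYPE, HABITAT),
    Triple(EX + "measurement/3", MEASUREMENT_VALUE, "marine"),
]


def values_for_trait(store: list[Triple], trait_uri: str) -> dict[str, str]:
    """Return {taxon URI: value} for every measurement of the given trait."""
    # Group predicate/object pairs by measurement node.
    by_subject: dict[str, dict[str, str]] = {}
    for t in store:
        by_subject.setdefault(t.subject, {})[t.predicate] = t.obj
    # Keep only the measurements whose type matches the requested trait.
    return {
        props[OF_TAXON]: props[MEASUREMENT_VALUE]
        for props in by_subject.values()
        if props.get(MEASUREMENT_TYPE) == trait_uri
    }


body_lengths = values_for_trait(triples, BODY_LENGTH)
for taxon, value in body_lengths.items():
    print(taxon, value)
```

A real triple store would answer the same question with a query language such as SPARQL; the in-memory list here only shows the shape of the data and of a "one trait, all taxa" lookup.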