Cynthia Parr          Global Content Summit
Species Pages Group   17-19 Jan 2011
http://www.eol.org
• All species known to science
• Freely accessible: open
  access, open source
• Available from a single portal
  in a common format
• Quality
• Constantly growing
• Aimed at multiple audiences
EOL Global Partners
[Map labels: Mexico, Costa Rica, Colombia, Peru, Dutch, Pan-Arab, South Africa, India, China, Australia; also shown: GBIF, ViBRANT, BHL-Global, BHL]
Aims of global partners
   Global access to knowledge about life on Earth
   To increase awareness and understanding of living
     nature through an Encyclopedia of Life that
     gathers, generates and shares knowledge in an
     open, freely accessible and trusted digital resource


Work together towards this vision and mission, sharing
expertise and knowledge as appropriate
Expand the global pool of knowledge about biodiversity and
improve access to it
Aims of this workshop
• Gather content experts from Global Partners
• Become familiar with each other’s work
• Learn how core EOL works and provide
  feedback on it
• Form the Species Pages Working Group
     Team at Smithsonian (SPG)
     Representatives from global partners
• Draft individual plans that complement each
  other towards a common goal
• Remind ourselves WHY we want to do this
What is content?
Biological information
   Names and hierarchies
   Descriptive text
   Literature
   Multimedia
   Maps
   Links to more information
… what about comments, collection annotations?
Overview of agenda

Day 1: Introductions
Day 2: Sharing
Day 3: Planning
Acknowledgements
• Funding from:
    David M. Rubenstein gift
    John D. and Catherine T. MacArthur Foundation
    Alfred P. Sloan Foundation
    Smithsonian Institution
    Marine Biological Laboratory
    Harvard University
         and other funders and donors
• All our content partners and global partners
• Volunteer curators and individual contributors via Flickr, Wikimedia,
  and members of EOL
• All of you for coming
• Claire Badgley
Overview of Content Partnering




EOL is a content curation community
[Diagram: sources (Databases; Journals; LifeDesks & Scratchpads; Public contributions) → Aggregate → Quality control, prioritization → eol.org → API → Third party apps. Community actions: Curate; Comment; Rate, Collect]
http://eol.org/content_partners
http://eol.org/info/content_partner_collections
Low hanging fruit




                    Photo credit: Stanislas PERRIN
Partner trajectory
[Chart: number of partners (y-axis, 0 to 150) by quarter (x-axis, Y1Q3 to Y4Q3)]
Long Tail in databases contributing to EOL
[Chart: number of taxa for which content is contributed to EOL (y-axis, 0 to 600,000) against partners in order of # taxa contributed to EOL (x-axis, 1 to 131)]
[Second panel: the same data viewed on a log scale (y-axis, 1 to 1,000,000)]
Content strategy
Highlights
Priorities
Richness score
Processes
Goals
http://eol.org/info/partners
Content Partner process overview
Partner creates an EOL member account
Adds a content partner
We communicate with them
They (or we) upload a resource file or set a
 URL where one can be found
They set a harvest frequency
EOL harvests at that frequency
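The harvest cycle in the steps above can be sketched roughly as follows. This is a minimal sketch, not EOL's actual code: the `Resource` record, its field names, and the helper functions are all hypothetical, and the harvest frequency is assumed to be expressed in whole days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Resource:
    """Hypothetical record for one partner resource (not EOL's real data model)."""
    partner: str
    url: str                                   # where the resource file can be found
    harvest_every_days: int                    # frequency chosen by the partner
    last_harvested: Optional[datetime] = None  # None means never harvested
    force: bool = False                        # partner asked for an immediate harvest


def is_due(resource: Resource, now: datetime) -> bool:
    """Return True when the resource should be harvested at time `now`."""
    if resource.force or resource.last_harvested is None:
        return True
    interval = timedelta(days=resource.harvest_every_days)
    return now - resource.last_harvested >= interval


def due_resources(resources: List[Resource], now: datetime) -> List[Resource]:
    """Select the resources whose harvest interval has elapsed."""
    return [r for r in resources if is_due(r, now)]
```

A scheduler would call `due_resources` periodically and harvest whatever it returns; the `force` flag stands in for a partner forcing a harvest outside the normal schedule.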
Current methods of data transfer
EOL resource document (XML) (usually they do
  the work)
Spreadsheet upload (either can do the work)
Connector (we do the work)
  Scrape web site or PDF
  Use web services
  Work from a copy of DB
Darwin Core Archive (classifications, soon)
See http://eol.org/info/cp_resource_checklist
How EOL gets content (n = 141 partners)
[Bar chart: number of partners (y-axis, 0 to 70) by transfer method: XML resource doc, Connector, LD/eLD/Scratchpad, Spreadsheet. Legend: CSV, web service, PDF, HTML, DB]
Example partner
• Pensoft has a
  process to generate
  EOL-compliant XML
  for new species
• Also sends images to
  Morphbank,
  specimens to GBIF
• They registered the
  URL at EOL
• Our script checks for
  changes once a day
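The slides do not say how the daily script decides that a registered resource has changed. One common approach, shown here purely as an assumption, is to compare a digest of the fetched document with the digest stored at the previous harvest. The function names are hypothetical.

```python
import hashlib
from typing import Optional


def content_digest(document: bytes) -> str:
    """SHA-256 hex digest of the raw resource document."""
    return hashlib.sha256(document).hexdigest()


def has_changed(document: bytes, previous_digest: Optional[str]) -> bool:
    """True if there is no stored digest yet or the document differs from it."""
    if previous_digest is None:
        return True
    return content_digest(document) != previous_digest
```

Fetching the document (for example with `urllib.request.urlopen(url).read()`) and storing the new digest after a successful harvest are left to the caller.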
EOL Schema Sources

Content type              Standards used
Taxa                      Darwin Core Archive
Attribution & licensing   Dublin Core & Darwin Core
Text objects & links      Species Profile Model (and now +)
Multimedia                Dublin Core (+ Audubon Core)
Example biological content
EOL Table of Contents                                 TDWG Species Profile Model
Physical Description › Morphology                     #Morphology
Physical Description › Size                           #Size
Ecology › Habitat                                     #Habitat
Ecology › Associations                                #Associations
Life History & Behavior › Life Expectancy             #LifeExpectancy
Evolution and Systematics › Functional Adaptations    #Evolution
Conservation › Conservation Status                    #ConservationStatus
Molecular Biology and Genetics › Genetics             #Genetics
Molecular Biology and Genetics › Genome               #MolecularBiology
Molecular Biology and Genetics › Molecular Biology    #MolecularBiology
Nucleotide Sequences                                  #MolecularBiology
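The mapping in the table above can be held as a simple lookup. The sketch below transcribes the rows as shown; the dictionary name and helper functions are hypothetical, and the `#...` values are kept as the bare fragments from the slide rather than full term URIs.

```python
from typing import Dict, List, Optional, Tuple

# (chapter, subchapter) in the EOL table of contents -> SPM term fragment.
# Rows are transcribed from the slide; None marks an entry with no subchapter.
TOC_TO_SPM: Dict[Tuple[str, Optional[str]], str] = {
    ("Physical Description", "Morphology"): "#Morphology",
    ("Physical Description", "Size"): "#Size",
    ("Ecology", "Habitat"): "#Habitat",
    ("Ecology", "Associations"): "#Associations",
    ("Life History & Behavior", "Life Expectancy"): "#LifeExpectancy",
    ("Evolution and Systematics", "Functional Adaptations"): "#Evolution",
    ("Conservation", "Conservation Status"): "#ConservationStatus",
    ("Molecular Biology and Genetics", "Genetics"): "#Genetics",
    ("Molecular Biology and Genetics", "Genome"): "#MolecularBiology",
    ("Molecular Biology and Genetics", "Molecular Biology"): "#MolecularBiology",
    ("Nucleotide Sequences", None): "#MolecularBiology",
}


def spm_term(chapter: str, subchapter: Optional[str] = None) -> Optional[str]:
    """Look up the SPM fragment for a table-of-contents entry, or None if unmapped."""
    return TOC_TO_SPM.get((chapter, subchapter))


def toc_entries_for(term: str) -> List[Tuple[str, Optional[str]]]:
    """All table-of-contents entries that share one SPM fragment."""
    return [key for key, value in TOC_TO_SPM.items() if value == term]
```

The reverse lookup makes the many-to-one rows explicit: three table-of-contents entries map to `#MolecularBiology`.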
EOL v2
[Diagram labels: SPM; DwC; infoitem description; Plinian Core; using Darwin Core Archive flat files as transport mechanism]
EOL v3?
[Diagram labels: Controlled vocabulary; Numeric values; Relations]
Partners
Can delete or replace any of their objects
Control how often we harvest, and can force a harvest
Get an automatically updating collection
Can request that we use their classification for browsing
Can change the logo and description of their project
Receive comments and curator actions immediately
Receive monthly reminders that they can get traffic statistics
Get many links back to their original web resources
Partners cannot

Publish the very first time
Decide if they are pre-vetted
Roll back a harvest
Change the objects of any other partner
Change the classifications of any other partner
http://eol.org/pages/704102




    Richness scores




Taxon page richness algorithm

Richness = a (Breadth) + b (Depth) + c (Diversity)
           a = 60%       b = 30%     c = 10%

Breadth: images, topics of text objects, references, maps, videos, sounds, conservation status
Depth: # words per text object, # words total
Diversity: sources (partners)
Score: 0 – 100, threshold 40
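The weighted sum above can be sketched as follows. The 60/30/10 weights and the threshold of 40 come from the slide; everything else is an assumption. In particular, the slide does not say how Breadth, Depth, and Diversity are themselves computed, so this sketch assumes each has already been normalised to the range 0..1, and it treats a score of 40 or more as "rich".

```python
# Weights (in percentage points) and threshold as stated on the slide.
WEIGHTS = {"breadth": 60, "depth": 30, "diversity": 10}
RICH_THRESHOLD = 40


def richness(breadth: float, depth: float, diversity: float) -> float:
    """Combine three component scores (each assumed to be in 0..1) into a 0..100 score."""
    components = {"breadth": breadth, "depth": depth, "diversity": diversity}
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    score = sum(WEIGHTS[name] * value for name, value in components.items())
    return round(score, 1)


def is_rich(score: float) -> bool:
    """Assumption: a page counts as rich when its score reaches the threshold."""
    return score >= RICH_THRESHOLD
```

Because Breadth carries 60 of the 100 points, a page with full breadth but little text still clears the threshold, while a page with long text from a single source and no media does not.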
Summary of EOL page richness
Overall
  950,000 have content
  2 % are rich
  ~22 % have only links to literature
Hot List
  30 % of 75K are rich
  Average richness = ~30
Red Hot List
  56 % of 3K are rich
  Average richness = 43
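The Hot List and Red Hot List percentages can be turned into approximate page counts. This sketch assumes "75K" and "3K" mean exactly 75,000 and 3,000; both are rounded figures on the slide, so the results are approximate. The helper name is hypothetical.

```python
def count_from_percent(total: int, percent: int) -> int:
    """Number of pages implied by a whole-number percentage of a total."""
    return total * percent // 100


hot_list_rich = count_from_percent(75_000, 30)     # 30 % of 75K
red_hot_list_rich = count_from_percent(3_000, 56)  # 56 % of 3K

print(hot_list_rich, red_hot_list_rich)  # prints: 22500 1680
```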
How richness is used
Choose images for home page “March of Life”
Allows sorting in collections (Weird life example)
Helps provide best search and API results



Any other ideas? Could we be matchmakers for
 pages needing enrichment and users?
http://synthesis.eol.org/media/treemap
Strategies for improving richness
Crowd-sourcing
  Collections
  Communities
  Mobile apps
Leveraging
  Enabling platforms
  Enabling journals
  Data mining BHL etc.
The page richness index

Helps fill gaps with existing knowledge
Helps prioritize funding and training so that it
 has maximum impact on closing true gaps
Will be available via API

Computing and storing richness index on
 EOL is a step towards storing and serving
 computable data

 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Global content summit: Overview, content partnering, richness

  • 1. Cynthia Parr Global Content Summit Species Pages Group 17-19 Jan 2011
  • 2. http://www.eol.org • All species known to science • Freely accessible: open access, open source • Available from a single portal in a common format • Quality • Constantly growing • Aimed at multiple audiences
  • 3. GBIF EOL Global Partners ViBRANT Dutch Pan- China Mexico Arab India Costa Rica Colombia Peru Australia South Africa BHL- Global BHL
  • 4. Aims of global partners Global access to knowledge about life on Earth To increase awareness and understanding of living nature through an Encyclopedia of Life that gathers, generates and shares knowledge in an open, freely accessible and trusted digital resource Work together towards this vision and mission, sharing expertise and knowledge as appropriate Expand the global pool of knowledge about biodiversity and improve access to it
  • 5. Aims of this workshop • Gather content experts from Global Partners • Become familiar with each other’s work • Learn how core EOL works and provide feedback on it • Form the Species Pages Working Group Team at Smithsonian (SPG) Representatives from global partners • Draft individual plans that complement each other towards a common goal • Remind ourselves WHY we want to do this
  • 6. What is content? Biological information Names and hierarchies Descriptive text Literature Multimedia Maps Links to more information …..what about comments, collection annotations?
  • 7. Overview of agenda Day 1: Introductions Day 2: Sharing Day 3: Planning
  • 8. Acknowledgements • Funding from: David M. Rubenstein gift, John D. and Catherine T. MacArthur Foundation, Alfred P. Sloan Foundation, Smithsonian Institution, Marine Biological Laboratory, Harvard University, and other funders and donors • All our content partners and global partners • Volunteer curators and individual contributors via Flickr, Wikimedia, and members of EOL • All of you for coming • Claire Badgley
  • 9. Overview of Content Partnering Cynthia Parr Global Content Summit Species Pages Group 17-19 Jan 2011
  • 10. EOL is a content curation community Databases Journals LifeDesks & Scratchpads Curate Public contributions Aggregate Comment Rate, Collect eol.org Quality control, prioritization API Third party apps
  • 13. Low hanging fruit Photo credit: Stanislas PERRIN
  • 14. Partner trajectory [Chart: number of partners (axis 0 to 150) by quarter, Y1Q3 to Y4Q3]
  • 15. Long Tail in databases contributing to EOL [Charts: number of taxa for which content is contributed to EOL, with partners in order of # taxa contributed to EOL; shown on a linear scale (0 to 600,000) and on a log scale (1 to 1,000,000)]
  • 18. Content Partner process overview Partner creates an EOL member account Adds a content partner We communicate with them They (or we) upload a resource file or set a URL where one can be found They set a harvest frequency EOL harvests at that frequency
  • 19. Current methods of data transfer EOL resource document (XML) (usually they do the work) Spreadsheet upload (either can do the work) Connector (we do the work) Scrape web site or PDF Use web services Work from a copy of DB Darwin Core Archive (classifications, soon) See http://eol.org/info/cp_resource_checklist
  • 20. How EOL gets content (n=141 partners) [Bar chart: number of partners (axis 0 to 70) by transfer method: XML resource doc, Connector, LD/eLD/Scratchpad, Spreadsheet; legend: CSV, web service, PDF, HTML, DB]
  • 21. Example partner • Pensoft has a process to generate EOL-compliant XML for new species • Also sends images to Morphbank, specimens to GBIF • They registered the URL at EOL • Our script checks for changes once a day
  • 22. EOL Schema Sources (Content type: Standards used) Taxa: Darwin Core Archive; Attribution & licensing: Dublin Core & Darwin Core; Text objects & links (and multimedia now): Species Profile Model + Dublin Core (+ Audubon Core)
  • 23. Example biological content (EOL Table of Contents → TDWG Species Profile Model) Physical Description › Morphology → #Morphology; Physical Description › Size → #Size; Ecology › Habitat → #Habitat; Ecology › Associations → #Associations; Life History & Behavior › Life Expectancy → #LifeExpectancy; Evolution and Systematics › Functional Adaptations → #Evolution; Conservation › Conservation Status → #ConservationStatus; Molecular Biology and Genetics › Genetics → #Genetics; Molecular Biology and Genetics › Genome → #MolecularBiology; Molecular Biology and Genetics › Molecular Biology → #MolecularBiology; Nucleotide Sequences → #MolecularBiology
  • 24. SPM DwC infoitem description Plinian Core using Darwin Core Archive flat files as transport mechanism EOL v2
  • 25. Controlled vocabulary Numeric values Relations EOL v3?
  • 26. Partners Can delete or replace any of their objects Control how often we harvest, and can force a harvest Get an automatically updating collection Can request that we use their classification for browsing Can change the logo and description of their project Receive comments and curator actions immediately Receive monthly reminders they can get traffic statistics Get many links back to their original web resources
  • 28. Partners cannot Publish the very first time Decide if they are pre-vetted Roll back a harvest Change the object of any other partners Change classifications from any other partners
  • 29. http://eol.org/pages/704102 Richness scores Cynthia Parr Global Content Summit Species Pages Group 17-19 Jan 2011
  • 30. Taxon page richness algorithm a (Breadth) + b (Depth) + c (Diversity) 60% 30% 10% Breadth: Images, topics of text objects, references, maps, videos, sounds, conservation status Depth: # words per text object, # words total Diversity: Sources (partners) 0 – 100, Threshold 40
  • 31. Summary of EOL page richness Overall: 950,000 have content; 2% are rich; ~22% have only links to literature. Hot List: 30% of 75K are rich; average richness = ~30. Red Hot List: 56% of 3K are rich; average richness = 43.
  • 32. How richness is used Choose images for home page “March of Life” Allows sorting in collections Weird life example Helps provide best search and API results Any other ideas? Could we be matchmakers for pages needing enrichment and users?
  • 34. Strategies for improving richness Crowd-sourcing Leveraging Collections Enabling platforms Communities Enabling journals Mobile apps Data mining BHL etc.
  • 35. The page richness index Helps fill gaps with existing knowledge Helps prioritize funding and training so that it has maximum impact on closing true gaps Will be available via API Computing and storing richness index on EOL is a step towards storing and serving computable data
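
The weighted sum on slide 30 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not EOL's implementation: the 60/30/10 weights, the 0 to 100 range, and the threshold of 40 come from slide 30, and the roughly 75% value of unvetted material and the idea of input maximums come from editor's note 7. How each component is scaled to the range 0 to 1, the cap of 25 images, the use of "at or above" for the threshold, and all function names are hypothetical.

```python
# Weights from slide 30: Breadth 60%, Depth 30%, Diversity 10%.
WEIGHTS = {"breadth": 0.60, "depth": 0.30, "diversity": 0.10}
RICH_THRESHOLD = 40        # slide 30: scores run 0-100, threshold 40
UNVETTED_WEIGHT = 0.75     # editor's note 7: unvetted is worth about 75% of vetted
IMAGE_CAP = 25             # hypothetical maximum; the real caps are not given


def effective_count(vetted: int, unvetted: int) -> float:
    """Count objects, discounting unvetted ones (assumed to be a simple multiplier)."""
    return vetted + UNVETTED_WEIGHT * unvetted


def capped_fraction(value: float, cap: float) -> float:
    """Scale a raw input to 0..1, saturating at the cap."""
    return min(value, cap) / cap


def richness_score(breadth: float, depth: float, diversity: float) -> float:
    """Combine three component scores (each assumed to be in 0..1) into 0..100."""
    components = {"breadth": breadth, "depth": depth, "diversity": diversity}
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return 100 * sum(WEIGHTS[name] * value for name, value in components.items())


def is_rich(score: float) -> bool:
    """Treat a page as rich when it reaches the threshold (assumed inclusive)."""
    return score >= RICH_THRESHOLD


if __name__ == "__main__":
    # A page with 200 images scores no higher on this input than one with 25.
    print(capped_fraction(200, IMAGE_CAP), capped_fraction(25, IMAGE_CAP))
    score = richness_score(breadth=0.5, depth=0.5, diversity=0.5)
    print(round(score, 1), is_rich(score))
```

Keeping the weights in one table makes it straightforward to adjust them later, which matters because the notes say the team reserves the right to change the index.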

Editor's Notes

  1. EOL is a giant mashup: it merges information created elsewhere onto its pages, where curators (mostly credentialed scientists) can trust, untrust, and rate it, and anybody can provide comments or tags. We’re partnering with over a hundred scientific databases as well as public contribution sites like Flickr and Wikipedia. Key figures: 100+ partner databases; 700 curators, 1000s of contributors, 46,000 members; 2.8 million pages; 500 thousand pages with Creative Commons content; over 2 million data objects and >1 million pages with links to research literature. Traffic in past year: 1.7 million unique users, 6.2 million page views.
  2. Low hanging fruit is mostly gone. Fellows. Smaller partners.
  3. Partners are projects and databases that we are sharing data with
  4. Why it is important to streamline: About 32 partners have managed to make their own XML resource docs, but that probably has the lowest cost per return. Connectors may be even more important: web services and DB connectors are putting content on at least ½ million pages. LDs/Scratchpads are important for small partners. Spreadsheets are popular; with the new transfer schema and flat-file archive format, the XML bar may go down and the spreadsheet bar might go up.
  5. Overview › Brief Summary; Overview › Comprehensive Description; Overview › Distribution; Physical Description › Morphology; Physical Description › Size; Physical Description › Diagnostic Description; Physical Description › Type Information; Physical Description › Look Alikes; Physical Description › Development; Ecology › Habitat; Ecology › Migration; Ecology › Dispersal; Ecology › Diseases and Parasites; Ecology › Population Biology; Ecology › General Ecology; Life History and Behavior › Behavior; Life History and Behavior › Cyclicity; Life History and Behavior › Life Cycle; Life History and Behavior › Reproduction; Life History and Behavior › Growth; Evolution and Systematics › Evolution; Evolution and Systematics › Fossil History; Evolution and Systematics › Systematics or Phylogenetics; Evolution and Systematics › Functional Adaptations; Physiology and Cell Biology › Physiology; Physiology and Cell Biology › Cell Biology; Molecular Biology and Genetics › Genetics; Conservation › Conservation Status; Conservation › Trends; Conservation › Threats; Conservation › Legislation; Conservation › Management; Relevance to Humans and Ecosystems › Benefits; Relevance to Humans and Ecosystems › Risks; Notes; Taxonomy; Education Resources; Citizen Science; Identification Resources
  6. Extension. Leveraging strengths.
  7. Inspired by community ecology and measures of species diversity, which of course were originally inspired by information theory, but we haven’t used those measures. Instead we put these factors together in a way that let us assign weights to different factors based on how well they capture “a rich page.” We sampled dozens of pages and had team members assess them for their gestalt “richness” based on their own criteria. Then we compared those scores to those generated by the algorithm, and iteratively changed weights until we achieved a set of weights that appeared to reflect human perception of “richness.” Note that there’s a penalty: unvetted material is only worth about 75% of vetted material. Also, there are maximums for many of these input values: having 200 images may not make a page much richer than having 25 images. We reserve the right to change this to ensure that the index is as useful as possible. As with Google PageRank, we want to ensure that nobody can game the system.
  8. Also note that there is an implication that a “rich page” is a “high quality page”: not necessarily true, but often it is. As EOL goes forward with our version 2, we’ll be gathering other inputs that can tell us if a page is successful, such as ratings of its objects.
  9. This treemap summarizes the 1.9 million described species that each have a page on the Encyclopedia of Life. Some of these pages have only a name so far, but about a million of them have more than that: maps, multimedia, text, or at least literature references. Each of these species potentially represents a volume in a “living library,” as each has evolved solutions to nature’s challenges, solutions that can benefit human society. For example, the genomics revolution and half of our synthetic drugs were made possible by understanding the characteristics of particular species.
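
The calibration described in note 7 (comparing algorithm scores with team members' assessments and adjusting weights until they agree) could be approximated as below. This is a sketch under assumptions: the notes say the weights were changed iteratively, and the exhaustive grid search here is a swapped-in stand-in for that manual process. The sample pages, the human scores, and every function name are hypothetical, made up purely for illustration.

```python
def weighted_score(components, weights):
    """Score one page (breadth, depth, diversity in 0..1) on a 0..100 scale."""
    return 100 * sum(w * c for w, c in zip(weights, components))


def candidate_weights(steps=10):
    """Yield every (breadth, depth, diversity) weight triple on a grid that sums to 1."""
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j
            yield (i / steps, j / steps, k / steps)


def fit_weights(pages, human_scores, steps=10):
    """Pick the grid weights whose scores are closest to the human scores."""
    def mean_abs_error(weights):
        errors = [abs(weighted_score(page, weights) - human)
                  for page, human in zip(pages, human_scores)]
        return sum(errors) / len(errors)

    return min(candidate_weights(steps), key=mean_abs_error)


# Hypothetical sample: component scores for four pages.
sample_pages = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.5, 0.5, 0.2)]
# Hypothetical human assessments, generated here from the slide-30 weights
# so that the search has a known answer to recover.
sample_human = [weighted_score(page, (0.6, 0.3, 0.1)) for page in sample_pages]

best = fit_weights(sample_pages, sample_human)
```

With real assessments the fit would not be exact, so the error at the chosen weights is worth inspecting alongside the weights themselves.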