iPlant TNRS for digital collections - iDigBio Workshop

Introduction to iPlant tools that could be useful for digital collections and live demo of TNRS. Columbus, OH; July 12, 2012

Published in: Education, Technology

  • Bringing a culture of computing to the Plant Sciences.

    1. iPlant's Taxonomic Name Resolution Service
        Naim Matasci, BIO5 / The iPlant Collaborative
        tnrs.iplantc.org
    2. What is iPlant?
    3. Empowering a New Plant Biology
    4. http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html
    5. TMU* Growth of Biological Collections (1600 – 2012)
        [Chart: number of specimens (0 to 600,000,000) by year (1600 to 2020)]
        *TMU: Totally Made Up
    6. If you can't find it, it doesn't exist
    7. Data Reuse
        • What's the correlation between leaf morphology and leaf economy (R. Walls)?
        • Evolution of pit domatia (M. Donoghue)
    8. iPlant Data Store
        • Based on iRODS
          – Metadata driven
          – Storing, Sharing and Distributing
        • Redundant (mirrors at TACC and UoA)
        • Really, really, really big (6 PB + 40 PB LTS)
        • Really, really, really fast
    9. iPlant Data Store Performance
        UC Berkeley to iDS, 100 GB: 29m15s (1 GB / 17.5 seconds)

        Source            Destination   Copy Method   Time (seconds)
        CD                Desktop PC    cp            320
        Berkeley Server   Desktop PC    scp           150
        External Drive    Desktop PC    cp            36
        USB 2.0 Flash     Desktop PC    cp            30
        iDS               Desktop PC    iget          18
        Desktop PC        Desktop PC    cp            15

        Desktop PC (UA): Mac with 7.2K Internal Hard Drive
        External Drive: USB 2.0: 5.4k Hard Drive
        Flash Drive: USB 2.0 Patriot XT
        https://pods.iplantcollaborative.org/wiki/display/start/How+fast+is+the+iPlant+Data+Store
    10. PhytoBisque features
        • Rich internet application (completely web based)
        • Draws upon features from popular large-scale photo sharing sites and high-resolution aerial imagery (Google Maps)
        • Ability to import and export 100+ image formats, movies
        • Ability to import extremely large image sets using the iPlant Data Store
        • Can display a 20K x 20K image using a standard web browser
        • Manage data sets with tags, metadata management
        • Utilizes distributed computing (connected to the iPlant execution environment)
    11. Taxonomic uncertainty
        1. Non-existent names
          • Misspellings
          • Contamination
          • Annotations
          • Morphospecies
          • Digitization issues (frame shifts, character encoding)
          • Lexical variants (digitization conventions)
        2. Synonymy
          • Nomenclatural synonyms
          • Taxonomic synonyms / concepts
        3. Misidentifications, incomplete identifications
    12. Non-existent names: Herbarium specimens
        Total specimens: 1.1 million
        Unique species names: 53,052
        Published names (legitimate & illegitimate): 44,532
        Misspelled names: 9,371 (18%)
        Specimens with misspelled names: 101,237 (9%)
        *New World plant specimens, 34 herbaria, simple match against IPNI and TROPICOS, excluding authors
    13. Taxonomic Name Resolution Service
        • Computer-assisted standardization of plant names
        • Corrects spelling errors and alternative spellings to a standard list of names
        • Converts out-of-date names to currently accepted names
    14. Future
        • More sources
          – Standard source import with DwC support
        • Better performance
        • TNRastic API
        • Integration with Global Names components
    15. Links
        • Web: http://tnrs.iplantc.org/
        • Code: https://github.com/iPlantCollaborativeOpenSource/TNRS
        • API (provisional): http://goo.gl/XnUiH
        • TNRastic API: http://goo.gl/Z7Fkc
    16. Acknowledgments
        Brad Boyle
        Brian Enquist
        Juan Antonio Raygoza Garay
        Nicole Hopkins
        Zhenyuan Lu
        Martha Narro
        Shannon Oliver
        William Piel
        Jill Yarmchuk
        Bob Magill (Missouri Botanical Garden)
        Chris Freeland (Missouri Botanical Garden)
        Chuck Miller (Missouri Botanical Garden)
        Peter Jorgensen (Missouri Botanical Garden)
        Amy Zanne (University of Missouri, St. Louis)
        Peter Stevens (Missouri Botanical Garden)
        Jay Paige (Missouri Botanical Garden)
        Bob Peet (University of North Carolina at Chapel Hill)
        Paul Morris (Harvard University)
        Alan Paton (Kew Royal Botanic Gardens and their International Plant Names Index)
        Tony Rees (Commonwealth Scientific and Industrial Research Organisation)
        Michael Giddens (www.silverbiology.com)
        Dmitry Mozzherin (Global Biodiversity Information Facility)
        David Remsen (Global Biodiversity Information Facility)
        David Patterson (Encyclopedia of Life)
        Cam Webb (Harvard University)
        Missouri Botanical Garden (Tropicos)

        Funding provided by the National Science Foundation Plant Cyberinfrastructure Program (grant #DBI-0735191).
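
The Taxonomic Name Resolution Service slide describes two steps: correcting misspellings against a standard list of names, then converting out-of-date names to accepted ones. The sketch below illustrates that idea only. It is not the TNRS algorithm or API: the reference lists, the `resolve_name` helper, and the similarity cutoff are all hypothetical, and the fuzzy matching uses Python's standard `difflib` as a stand-in.

```python
from difflib import get_close_matches

# Hypothetical reference data for illustration; not taken from TNRS sources.
ACCEPTED_NAMES = ["Quercus alba", "Quercus rubra", "Acer saccharum", "Pinus ponderosa"]
# Hypothetical synonym table: out-of-date name -> currently accepted name.
SYNONYMS = {"Acer saccharophorum": "Acer saccharum"}


def resolve_name(raw, cutoff=0.8):
    """Return (matched_name, accepted_name), or (None, None) if nothing is close enough."""
    # Collapse repeated whitespace and normalize capitalization (Genus epithet).
    cleaned = " ".join(raw.split())
    if cleaned:
        cleaned = cleaned[0].upper() + cleaned[1:].lower()

    # Step 1: fuzzy-match the submitted string against every known name.
    candidates = ACCEPTED_NAMES + list(SYNONYMS)
    matches = get_close_matches(cleaned, candidates, n=1, cutoff=cutoff)
    if not matches:
        return None, None

    # Step 2: map a matched synonym to its accepted name.
    matched = matches[0]
    return matched, SYNONYMS.get(matched, matched)


if __name__ == "__main__":
    for submitted in ["Quercus albba", "acer  saccharophorum", "Zzzz xxxx"]:
        print(submitted, "->", resolve_name(submitted))
```

Keeping the matched name separate from the accepted name lets a collection record both what the label said and what it resolves to, which matters when a correction later turns out to be wrong.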
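
The rounded figures on the Data Store performance and herbarium slides can be rechecked from the raw numbers given there. This is a small arithmetic sketch; the helper names are hypothetical, and the inputs are the values quoted on the slides.

```python
def seconds_per_gb(total_gb, minutes, seconds):
    """Average transfer time per gigabyte for a transfer of total_gb."""
    return (minutes * 60 + seconds) / total_gb


def percent(part, whole):
    """Share of part in whole, as a percentage."""
    return 100.0 * part / whole


# UC Berkeley to iDS: 100 GB in 29m15s.
transfer = seconds_per_gb(100, 29, 15)

# Herbarium specimens: misspelled names out of unique species names,
# and specimens carrying a misspelled name out of all specimens.
misspelled_names = percent(9371, 53052)
misspelled_specimens = percent(101237, 1_100_000)

print(f"{transfer:.2f} s per GB")
print(f"{misspelled_names:.1f}% of unique names misspelled")
print(f"{misspelled_specimens:.1f}% of specimens affected")
```

The results agree with the slides' rounded values of about 17.5 seconds per GB, 18%, and 9%.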