The Crop Ontology: a resource for enabling access to breeders’ data

This presentation describes an initiative for enabling access to breeders’ data through standardization of terms and protocols related to crop improvement.

Published in: Technology, Education
  • These elements are sufficient for managing phenotyping data from any field experiment; however, a sixth component is required to facilitate integration of phenotyping data across studies. This is the Ontology Management System (OMS), which identifies comparable elements (labels, variates and values) across studies.
  • Precomposition for annotation
  • Turtle (Terse RDF Triple Language) is a format for expressing data in the Resource Description Framework (RDF) data model, with a syntax similar to that of SPARQL triple patterns. RDF represents information using triples, each of which consists of a subject, a predicate and an object. Each of those items can be expressed as a web URI; an object may also be a literal value.

    1. 1. http://www.cropontology.org The Crop Ontology: a resource for enabling access to breeders’ data Elizabeth Arnaud1*, Luca Matteis1, Marie Angelique Laporte1, Herlin Espinosa2, Glenn Hyman2, Rosemary Shrestha3, Arlett Portugal4, Pierre Yves Chibon5, Medha Devare6, Akinnola Akintunde7, Jeffrey W. White8, Mark Wilkinson9, Caterina Caracciolo10, Fabrizio Celli10, Graham McLaren4 1Bioversity International, France, 2International Center for Tropical Agriculture (CIAT), Colombia, 3Genetic Resources Program (GRP), Centro Internacional de Mejoramiento de Maíz y Trigo (CIMMYT), Mexico, 4Generation Challenge Programme (GCP) c/o CIMMYT, 5UR Plant Breeding, Univ. of Wageningen, The Netherlands, 6International Maize and Wheat Improvement Center - South Asia Regional Office (CIMMYT-SARO), Nepal, 7International Black Sea University (IBSU), Georgia, 9Centro de Biotecnología y Genómica de Plantas UPM-INIA, Spain, 10Food and Agriculture Organization (FAO) of the United Nations, Office for Partnership, Italy Generation Challenge Programme Workshop, 13th January 2014, at the Plant and Animal Genomics Conference, San Diego, USA, 11-15th January 2014
    2. 2. CGIAR Crop Lead Centers Since 2008
    3. 3. The scientific context
    4. 4. The Knowledge domain: plant breeding. Understanding the relationships between plant genotype and environment, developing adaptive traits that respond to biotic and abiotic stress, promoting adequate agronomic practices for cultivation, and understanding the heritability of adaptive traits
    5. 5. Dimensions of a phenotype (diagram labels): Environmental Conditions, Cultural, Socio Economic, Light, Agronomic, Developmental, Water, Nutrients, Temperature, Physiological, Chemical, Molecular, Soil, Time. Understanding the GxE interaction and the heritability of adaptive traits
    6. 6. High Throughput Data Generation needs standardized trait concepts • Next Generation Sequencing (NGS) platforms for detailed analysis of the largest plant genomes • Phenotyping platforms measure a wide range of structural and functional plant traits at the same time as collecting meticulous metadata on the environment and experimental setup [Fiorani and Schurr, 2013] • GWAS typically focus on associations between single-nucleotide polymorphisms (SNPs) and traits.
    7. 7. Developing the Crop Ontology content as a Community of Practice
    8. 8. Harmonization and access to data • Breeders’ data are often unstructured: complex free text is used for phenotype descriptions • No semantic coherence: the same trait is given different names by scientists, and one trait is named the same way for various species but refers to different plant structures (examples shown: ‘Fruit colour’, bean pod color, rice grain or caryopsis colour, maize kernel colour) • Data and metadata are NOT interoperable and often not online
    9. 9. Integrated Breeding Platform www.integratedbreeding.net • One-stop shop for services to design and carry out breeding projects: Integrated breeding workflow • Breeders’ databases share a common schema and are being published online • IB Fieldbook is available with a standard list of traits per crop
    10. 10. Phenotype It is a composite of an entity (e.g. fruit) and an attribute (e.g. shape) with a value (e.g. round): Entity + Attribute = Trait Entity + (Attribute + Value) = Phenotype (observed) fruit + (shape + round) = fruit shape round -> round fruit is the phenotype
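The composition on slide 10 can be sketched as a small data model. This is only an illustration of the Entity + Attribute + Value pattern; the class and property names (`Trait`, `Phenotype`, `statement`, `label`) are assumptions made for this sketch and are not part of the Crop Ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trait:
    """Entity + Attribute, e.g. fruit + shape (hypothetical class for illustration)."""
    entity: str
    attribute: str

    @property
    def name(self) -> str:
        return f"{self.entity} {self.attribute}"


@dataclass(frozen=True)
class Phenotype:
    """A trait together with an observed value, e.g. (fruit shape) + round."""
    trait: Trait
    value: str

    @property
    def statement(self) -> str:
        # Fully composed form: entity, attribute, value
        return f"{self.trait.name} {self.value}"

    @property
    def label(self) -> str:
        # Short reading of the same observation: value followed by entity
        return f"{self.value} {self.trait.entity}"


fruit_shape = Trait(entity="fruit", attribute="shape")
observed = Phenotype(trait=fruit_shape, value="round")

print(fruit_shape.name)    # fruit shape
print(observed.statement)  # fruit shape round
print(observed.label)      # round fruit
```

Keeping the trait separate from the value means the same trait object can be reused for every observation, which is the point of standardizing trait names before recording phenotypes.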
    11. 11. A range of controlled vocabularies • Web 2.0 • From the controlled vocabularies, build valid semantic ontologies consumable by Web 2.0 • Best practices
    12. 12. Crop Ontology • Crop Ontology is primarily an application Ontology for fieldbooks • A visualization tool supporting community-based development of trait dictionaries and crop-specific ontologies • Compare and validate terms in common • Rosemary Shrestha, CIMMYT, CO coordinator until 2012
    13. 13. Community based development process • Domain experts (breeders, pathologists, agronomists, etc) and Data managers identify the list of concepts • For a variety evaluation project, Data Managers and breeders produce the IBFieldbook template with the traits and submit new terms • Crop ontology curators in the Crop Lead centers curate, validate, compile the list and upload on the site • The Global Crop Ontology Curator curates the crop ontology with the Crop Lead Centers’ curators • Web development expert maintains the site
    14. 14. Crop curators and associated scientists Crop Crop Lead Center Curator Scientists Barley Cassava ICARDA, Tunisia & Morocco International Institute of Tropical Agriculture (IITA), Nigeria ICRISAT-Patancheru Andhra Pradesh, India Fawzy Nawar Bakare Moshood –replaced by Afolabi Agbona Prasad Peteti Ramesh Verma Peter Kulakow Guerrero Alberto Fabio Steve Beebe; Rowland Chirwa Sam Ofodile Ousmane Boukar Fawsy Nawar Rosemary Shrestha Shiv Kumar Agrawal Rhiannon Chrichton Inge Van den Bergh Praveen Reddy Praveen Reddy Reinhard Simon Frances Nikki Borja Until 2013 Praveen Reddy Ibrahima Sissokho Tom C. Hash Isabel Vales Sorghum International Center for Tropical Agriculture (CIAT), Colombia International Institute of Tropical Agriculture (IITA), Nigeria ICARDA, Tunisia, Morocco International Maize and Wheat Improvement Center (CIMMYT) Mexico Bioversity International Montpellier, France ICRISAT-Andhra Pradesh, India id International Center for Potato (CIP), Peru International Rice Research Institute (IRRI), Philippines ICRISAT-India and Mali Wheat Yam Global CIRAD CIMMYT (see above) IITA, Nigeria Bioversity International, Montpellier Chickpea Groundnut Common beans Cowpea Lentil Maize Musa Pearl millet Pigeon pea Potato Rice Rosemary Shrestha Afolabi Agbona Harold Durufle Trushar Shah Mauleon Ramil; Ruaraidh Sackville Hamilton Trushar Shah Eva Weltzien-Rattunde, Taba Nebe Jean Francois Rami Antonio Jose Lopes Montez
    15. 15. Crop Ontology themes • General germplasm information • Phenotype and traits • Plant anatomy and development • Location and environment • Trial management and experimental design • Structural and functional genomics
    16. 16. Traits and Phenotypes
    17. 17. Crop Ontology www.cropontology.org 14 GCP crops • Banana • Cassava • Chickpea • Common beans • Cowpea • Groundnut • Maize • Pearl millet • Pigeon Pea • Potato • Rice • Sorghum • Wheat • Yam For 2014, adding • Barley • Lentil • Soybean • Sweet Potato
    18. 18. Ontology Engineering • With OBO-edit - http://oboedit.org/ • Creating multi-relationships between concepts • cross referencing with Plant Ontology and Trait Ontology
    19. 19. Trait Description
    20. 20. Crop Trait Dictionary Template simple to share with breeders • Submission: Name of submitting scientist, Institution, Language of submission, Date of submission, Bibliographic Reference, Comments • Trait (1): Crop Name, Name of Trait, Abbreviated name, Synonyms (separate by commas), Trait ID for modification (Blank for New), Description of Trait, How is this trait routinely used?, Trait Class • Method (n): Method ID, Name of Method, Describe how measured (method), Growth Stage, Field or greenhouse • Scale (n): Scale ID, Type of Measure (Continuous, Discrete or Categorical); For Continuous: units of measurement, reporting units, minimum, maximum; For Discrete: Name of scale or units of measurement; For Categorical: Name of rating scale, Class # value = meaning
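The template fields on slide 20 lend themselves to a simple completeness check before a dictionary is submitted. The sketch below is an assumption about how such a check might look; the class names, field names and validation rules are invented for illustration, and the method text, units and IDs in the example row are placeholders rather than real Crop Ontology content.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Scale:
    scale_id: str
    measure_type: str                 # "Continuous", "Discrete" or "Categorical"
    name_or_units: str                # units, or name of the rating scale
    minimum: Optional[float] = None   # Continuous only
    maximum: Optional[float] = None   # Continuous only
    classes: Dict[str, str] = field(default_factory=dict)  # Categorical: value -> meaning

    def problems(self) -> List[str]:
        """Return the missing pieces; an empty list means the row looks complete."""
        issues: List[str] = []
        if not self.name_or_units:
            issues.append("scale name or units missing")
        if self.measure_type == "Continuous":
            if self.minimum is None or self.maximum is None:
                issues.append("continuous scale needs minimum and maximum")
        elif self.measure_type == "Categorical":
            if not self.classes:
                issues.append("categorical scale needs class value = meaning pairs")
        elif self.measure_type != "Discrete":
            issues.append(f"unknown type of measure: {self.measure_type!r}")
        return issues


@dataclass
class Method:
    method_id: str
    name: str
    how_measured: str
    scales: List[Scale] = field(default_factory=list)


@dataclass
class TraitEntry:
    crop: str
    name: str
    abbreviation: str
    description: str
    trait_id: str = ""                # blank for a new trait, as in the template
    methods: List[Method] = field(default_factory=list)

    def problems(self) -> List[str]:
        issues: List[str] = []
        for method in self.methods:
            for scale in method.scales:
                issues += [f"{method.method_id}/{scale.scale_id}: {p}"
                           for p in scale.problems()]
        return issues


# Placeholder row: the method, units and IDs are illustrative, not curated values.
entry = TraitEntry(
    crop="Sorghum",
    name="Flag leaf weight",
    abbreviation="FLGWT",
    description="Weight of the flag leaf (the one just below the panicle).",
    methods=[Method("M1", "Weighing", "Illustrative method text",
                    scales=[Scale("S1", "Continuous", "g", minimum=0.0)])],
)
print(entry.problems())
```

Here the continuous scale has no maximum, so the check reports one problem for `M1/S1`; a curator could run the same check on every row before upload.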
    21. 21. Online visualization of Trait dictionaries
    22. 22. Methods & Scales for annotations • Precomposed relationships between Traits, Methods and Scales are required for annotations in phenotype databases • Ongoing discussion on revising the structure to separate the three into three namespaces
    23. 23. Methods & scales for the standard lists of the Breeders’ fieldbook Visualization & download In Crop database and Fieldbook template
    24. 24. Easy to use the site - Partners published their Trait ontologies Soybean Solanaceae France Grape Barley
    25. 25. Multilingual versions of the crop ontologies Multiple languages
    26. 26. Experimental design ontology Trial management tasks • CROP - PLANTING • SEED TREATMENT • IRRIGATION • FERTILIZER • PESTICIDE • SOIL • BIOTIC STRESS • ABIOTIC STRESS • HARVEST-YIELD Medha Devare CSISA-Nepal Coordinator, CIMMYT –SARO Design of the Fieldbook and coordination Akinnola Akintunde, International Black Sea Univ. (IBSU), Georgia Development of the ontology and fieldbook
    27. 27. Dictionary for Trial Management Concepts From Medha Devare, CSISA-Nepal Coordinator CIMMYT -SARO
    28. 28. Environmental Ontology Jeffrey W. White Research Plant Physiologist & Research Leader Arid-Land Agricultural Research Center USDA-ARS, Arizona, USA Sheryl Porter Coordinator, Computer Research Applications University of Florida, Gainesville, FL, USA
    29. 29. Environment Ontology and Trial management Ontology
    30. 30. Environmental Ontology • Improve the current list of concepts • International Consortium for Agricultural System Applications (ICASA) • Integration of a Master list of 600 variables for describing crop management and recording plant responses. • ICASA promotes the use of standards in relation to crop field research and for ecophysiological models. • One objective is the application of ICASA variables by the Agricultural Model Intercomparison and Improvement Project (AgMIP) (http://www.agmip.org/).
    31. 31. Synchronization with the Crop databases and IBWS
    32. 32. Synchronization of Crop Ontology with Integrated Breeding Workflow • Graham McLaren, Generation Challenge Programme • Rebecca Berrigan, Efficio Technology Service • Arlett Portugal, IBP Data Management Leader • Luca Matteis, CO Web Site developer, Bioversity International • Harold Durufle, CO curator, Bioversity International
    33. 33. Application Programming Interface (API) • Developed by Luca Matteis • Provide access services to 3rd party web sites or software • Support open collaboration and use of the Crop Ontology
    34. 34. Local Databases Breeders & Data Managers Breeders’ Trait Dictionaries Crop Database Data Manager Curation of the Crop Ontology Fieldbook Template Data Annotation & new terms addition Cross referencing terms with Plant Ontology &Trait Ontology Submission of new traits through the term tracker
    35. 35. IBWS - Key elements of the Logical Data Model to store phenotypic data
    36. 36. Annotation for storing phenotypic data in the IBWS • Property (Trait): CO_ID • Requires Method: CO_ID • Scale: CO_ID (continuous, discrete or categorical) • 3 namespaces • Categorical classes: Class1-value: CO_ID, Class2-value: CO_ID, Class3-value: CO_ID • A unique combination of IDs for P+M+S+C = a Standard Variable • Data Is_a_valid_value_of controlled vocabulary Term ID
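The rule on slide 36, that a unique combination of Property, Method, Scale and Class IDs defines one Standard Variable, can be sketched as a keyed registry. This is a hedged illustration only: `StandardVariableRegistry`, its methods and the IDs (`P:001`, `M:001`, `S:001`, `C:00x`) are hypothetical and do not reflect the actual IBWS schema.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# (property ID, method ID, scale ID, set of class IDs)
Key = Tuple[str, str, str, FrozenSet[str]]


class StandardVariableRegistry:
    """Maps each unique P+M+S+C combination to exactly one variable name."""

    def __init__(self) -> None:
        self._by_key: Dict[Key, str] = {}

    def register(self, name: str, property_id: str, method_id: str,
                 scale_id: str, class_ids: Iterable[str] = ()) -> Key:
        key: Key = (property_id, method_id, scale_id, frozenset(class_ids))
        existing = self._by_key.get(key)
        if existing is not None and existing != name:
            # The same combination may not be registered under a second name.
            raise ValueError(f"combination already registered as {existing!r}")
        self._by_key[key] = name
        return key

    def is_valid_value(self, key: Key, value_id: str) -> bool:
        """For a categorical scale, a stored value must be one of its class IDs."""
        if key not in self._by_key:
            raise KeyError("unknown standard variable")
        classes = key[3]
        # Assumption of this sketch: scales without classes accept any value.
        return not classes or value_id in classes


registry = StandardVariableRegistry()
colour = registry.register("GRAIN_COLOUR", "P:001", "M:001", "S:001",
                           ["C:001", "C:002", "C:003"])

print(registry.is_valid_value(colour, "C:002"))  # True
print(registry.is_valid_value(colour, "C:009"))  # False
```

Using a frozen set for the classes makes the key independent of the order in which class IDs are listed, so the same combination is always recognized as the same variable.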
    37. 37. Synchronization flow The IBWS accepts updates sent by Crop ontologies Schema from Rebecca Berrigan, Efficio LLC
    38. 38. Synchronization flow Crop ontology accepts new addition from local ontologies Schema from Rebecca Berrigan, Efficio LLC
    39. 39. The Crop Ontology web site: a Concept name server on the Cloud. Luca Matteis, Web developer, Bioversity International
    40. 40. Crop Ontology
    41. 41. API access by 3rd Party Web sites • IBP Crop Databases • IB Fieldbook • Genotype Data MS • Phenomics Ontology Driven DB (PODD) • EU-SOL Solanaceae Breeding DB, Wageningen • International cassava DB • Agtrials - CCAFS
    42. 42. Global Agricultural Trial Repository and database www.agtrials.org • Glenn Hyman, geographer, CIAT • Herlin R. Espinosa G., web developer, CIAT • Luca Matteis, Web developer, Bioversity International
    43. 43. Global Agricultural Trial Repository http://www.agtrials.org/ • To store evaluation data files described with metadata • To produce an Atlas of the trials 1,029 trials for Cassava
    44. 44. 1. Annotating Evaluation data files
    45. 45. 2. Searching evaluation data files Agtrials uses the Crop Ontology trait terms
    46. 46. 3. Display the Trial Information Access to the definition of the Trait in the Crop Ontology
    47. 47. Integration of Crop Ontology in IBP Fred Okono, IBP Project Administrator Brandon Tooke, IBP web developer Luca Matteis, CO Web developer, Bioversity International
    48. 48. Integration of Crop Ontology in IBP
    49. 49. CO Semantic Web Compliance Marie Angelique Laporte, Ontology development, RDF & SKOS conversion, Bioversity International Luca Matteis, CO Web developer, Bioversity International Mark Wilkinson, Centro de Biotecnología y Genómica de Plantas UPM-INIA, Spain
    50. 50. Linked Open Data Cloud • A term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information and knowledge • It builds upon standard Web technologies such as HTTP, RDF and URIs • Rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers (source: Wikipedia) • This enables data from different sources to be connected and queried.
    51. 51. Crop Ontology in the Linked Open Data recommended format • Conversion from OBO to RDF/SKOS with resolvable HTTP URIs • A conversion into Simple Knowledge Organization System (SKOS) is under way:
        <http://www.cropontology.org/rdf/CO_324:0000002> a skos:Concept ;
            rdfs:label "Flag leaf weight"@en ;
            dc:creator _:b1 ;
            skos:definition "Weight of the flag leaf (the one just below the panicle)." ;
            skos:inScheme co:sorghum ;
            skosxl:prefLabel [ a skosxl:Label ;
                co:acronym [ a skosxl:Label ; skosxl:literalForm "FLGWT" ] ;
                skosxl:literalForm "Flag leaf weight"@en ] .
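To make the shape of the SKOS record on slide 51 concrete, the following sketch builds a subset of it as Turtle text using only the Python standard library. It is not the conversion tool used by the Crop Ontology team; the helper names `literal` and `to_turtle` are hypothetical, and the expansion of the `co:` prefix is an assumption because the slide does not spell it out. The `dc:creator` and `skosxl:prefLabel` parts are left out to keep the example short.

```python
from typing import List, Optional, Tuple

PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    # Assumption: the slide does not state what the co: prefix expands to.
    "co": "http://www.cropontology.org/rdf/",
}


def literal(text: str, lang: Optional[str] = None) -> str:
    """Quote a string as a Turtle literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


def to_turtle(subject_uri: str, predicate_objects: List[Tuple[str, str]]) -> str:
    """Render one subject with its predicate/object pairs as a Turtle statement."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    body = " ;\n    ".join(f"{pred} {obj}" for pred, obj in predicate_objects)
    lines.append(f"<{subject_uri}> {body} .")
    return "\n".join(lines)


flag_leaf_weight = to_turtle(
    "http://www.cropontology.org/rdf/CO_324:0000002",
    [
        ("a", "skos:Concept"),
        ("rdfs:label", literal("Flag leaf weight", "en")),
        ("skos:definition",
         literal("Weight of the flag leaf (the one just below the panicle).")),
        ("skos:inScheme", "co:sorghum"),
    ],
)
print(flag_leaf_weight)
```

Each (predicate, object) pair together with the shared subject is one RDF triple; the `;` separator in Turtle simply avoids repeating the subject.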
    52. 52. Linked Open Data publishing and Aligning Crop Ontology with AGROVOC • Caterina Caracciolo, Food and Agriculture Organization (FAO), AIMES, Italy • Fabrizio Celli, Food and Agriculture Organization (FAO), AIMES, Italy • Marie Angelique Laporte, Bioversity International • Luca Matteis, Bioversity International
    53. 53. Agrovoc - Agricultural Thesaurus • 32,000 concepts organized in a hierarchy • each concept may have labels in up to 22 languages • is now available as a linked data set published, aligned (linked) with several vocabularies
    54. 54. Release of AGRIS 2.0 agris.fao.org • AGRIS bibliographic records contain rich metadata and are largely indexed by AGROVOC, FAO’s multilingual thesaurus
    55. 55. AGRIS 2.0 and Phenotypic Data • AGRIS 2.0 uses the Linked Open Data Methodology to link various sources of data in the mash up site • Proof of concept done with the Collecting mission database of Bioversity International • 3 steps: 1. The AGRIS datasets were converted to RDF, creating some 200 million triples. AGROVOC was aligned to other thesauri. 2. SPARQL endpoints, web services and APIs were discovered. 3. AGRIS RDF was interlinked, using AGROVOC LOD as a backbone, to external datasets. • Align Crop Ontology with AGROVOC in SKOS/RDF • Promote the publishing of Phenotypic data into RDF • Objective: Retrieve bibliographic references and data from phenotypic databases in the mash up site
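One way to think about aligning Crop Ontology terms with AGROVOC concepts is to propose candidate links by comparing normalized labels and leave the final decision to a curator. The slides do not say how the alignment is actually done, so the sketch below is purely an assumed approach; the function names and the `example.org` URIs and labels are placeholders, not real Crop Ontology or AGROVOC identifiers.

```python
import re
from typing import Dict, List, Tuple


def normalise(label: str) -> str:
    """Lower-case, drop punctuation and collapse whitespace so labels compare loosely."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", label.lower()).split())


def propose_matches(co_labels: Dict[str, str],
                    agrovoc_labels: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return candidate (subject, predicate, object) triples for curator review."""
    index = {normalise(label): uri for uri, label in agrovoc_labels.items()}
    triples: List[Tuple[str, str, str]] = []
    for co_uri, label in sorted(co_labels.items()):
        target = index.get(normalise(label))
        if target is not None:
            # skos:closeMatch marks a candidate; a curator could promote it later.
            triples.append((co_uri, "skos:closeMatch", target))
    return triples


# Placeholder data: these URIs and labels are invented for the example.
crop_ontology = {
    "http://example.org/co/T1": "Plant height",
    "http://example.org/co/T2": "Flag leaf weight",
}
agrovoc = {
    "http://example.org/agrovoc/A1": "plant-height",
    "http://example.org/agrovoc/A2": "Leaf area",
}

for triple in propose_matches(crop_ontology, agrovoc):
    print(triple)
```

Exact label matching will miss synonyms and translations, so in practice it would only be a first pass before manual curation.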
    56. 56. Partners collaborating on the informatics and integration formats • IBFieldbook and IBWS teams and Efficio LLC • Plant Breeding dept. of Wageningen for the Resource Description Framework (RDF) • CIAT-DAPA, for the synchronization of The Global Repository of Evaluation trials (Agtrials) of CCAFS • FAO-AIMES for the use of Linked Open Data with AGRIS 2.0
    57. 57. Partners collaborating on content engineering and looking forward to a Reference Ontology for plants • Plant Ontology, Jaiswal Lab., Oregon State University, USA • Soybase, USDA-ARS, USA • Solanaceae Genomic Network (SGN), USA • Cornell University, USA • Institut National de la Recherche Agronomique (INRA), France • Centro de Biotecnología y Genómica de Plantas UPM-INIA, Spain • POLAPGEN, Poland • Australian Plant Phenomics Data Repository
    58. 58. Any questions, please contact us. Send an email to: e.arnaud@cgiar.org h.durufle@cgiar.org l.matteis@cgiar.org helpdesk@cropontology-curationtool.org Poster #981 Plant Genomics Outreach Booth #305
