2. Reminder of the main objectives of the
WP
• To manage phenotypic data of apple and peach from genetic
material held at different locations across the EU
• To develop and utilise tools to study model traits based on
functional and structural genomics
• To manage genotypic data of apple and peach for the
development of a comprehensive platform including
gene/QTL map position, gene ontology
• To improve collaboration on apple/peach genome studies
• To speed up breeding procedures (synthesis of data to select
parents; direct identification of elite individuals by markers,
etc.).
3. Main achievements
• Structure of Phenotypic database for peach
and apple completed
– Both interface and excel template
9. Data Standardization
• Standardization of nomenclature
– Data received on Dec 13th still lack
homogeneity
• Also language problems
• Solution
– Suggested method
• Google Refine (soon to become OpenRefine): it helps to
clean up data
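As an illustration of the kind of clean-up meant here, the following is a minimal sketch loosely modelled on the key-collision ("fingerprint") clustering that OpenRefine offers. It is not the project's actual procedure; the cultivar names and the function names `fingerprint` and `cluster_variants` are assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Normalise a name into a key so that spelling variants collide."""
    # Strip accents, lowercase, and turn punctuation into spaces.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    cleaned = re.sub(r"[^a-z0-9]+", " ", ascii_name.lower())
    # Sort the unique tokens so that word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster_variants(names):
    """Group names sharing a fingerprint; keep groups with several spellings."""
    clusters = defaultdict(list)
    for name in names:
        clusters[fingerprint(name)].append(name)
    return {key: group for key, group in clusters.items() if len(set(group)) > 1}


# Hypothetical cultivar names as they might arrive from different partners.
received = [
    "Golden Delicious",
    "golden delicious ",
    "Delicious, Golden",
    "Gala",
    "Galà",
    "Fuji",
]
clusters = cluster_variants(received)
for key, group in clusters.items():
    print(key, "->", group)
```

Each cluster still needs a human decision on which spelling becomes the standard name; the sketch only finds candidates for merging.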
10. Problems to be solved
14. Genome browser and Genome mining
tool
• Based on GMOD
• Tracks include RefSeq, gene predictions and
apple/peach homologs
15. We are migrating the system to use a new technology suite:
• GMOD (Generic Model Organism Database project) - a
collection of open source software tools for creating and
managing genome-scale biological databases
• Chado - a relational database schema linked to GMOD and
capable of representing biological data such as sequence,
sequence comparisons, phenotypes, genotypes, ontologies,
publications, and phylogeny.
• Tripal - a web interface for GMOD-Chado using
community-supported tools.
• Currently under development
16. Advantages
From these new technologies we'll get:
• FLEXIBILITY - easy integration of new data (e.g. different
phenotypes; geolocalization)
• DATA GRANULARITY - data integration down to single
specimen/single gene snippet
• DATA ANALYSIS - e.g. easy coupling of genotype and
phenotype data
• INFRASTRUCTURE MAINTENANCE agreement with
Washington State University after the project's end
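To make the "coupling of genotype and phenotype data" point concrete, here is a minimal sketch under stated assumptions: the accession identifiers, marker name, trait name and values are all hypothetical, and the in-memory dictionaries stand in for whatever the database would return. It joins the two data sets on a shared accession identifier and averages a trait per marker genotype.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical genotype calls, keyed by accession identifier.
genotypes = {
    "ACC001": {"marker_A": "AA"},
    "ACC002": {"marker_A": "AB"},
    "ACC003": {"marker_A": "AA"},
}

# Hypothetical phenotype records; ACC004 has no genotype on file.
phenotypes = [
    {"accession": "ACC001", "trait": "fruit_weight_g", "value": 150.0},
    {"accession": "ACC002", "trait": "fruit_weight_g", "value": 170.0},
    {"accession": "ACC003", "trait": "fruit_weight_g", "value": 160.0},
    {"accession": "ACC004", "trait": "fruit_weight_g", "value": 200.0},
]


def mean_trait_by_genotype(genotypes, phenotypes, marker, trait):
    """Average a trait over accessions, grouped by their call at one marker."""
    values = defaultdict(list)
    for record in phenotypes:
        if record["trait"] != trait:
            continue
        call = genotypes.get(record["accession"], {}).get(marker)
        if call is None:
            continue  # skip accessions without a genotype for this marker
        values[call].append(record["value"])
    return {call: mean(vals) for call, vals in values.items()}


summary = mean_trait_by_genotype(genotypes, phenotypes, "marker_A", "fruit_weight_g")
print(summary)
```

The join only works if both data sets use the same accession identifiers, which is one reason the nomenclature standardization matters.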
17. Results
• GMOD+Tripal-based server up and running
• complete migration of dictionary, institutions, orchards and
stock data for both peach and apple datasets
• almost complete migration of phenotype data for peach
• ready to migrate apple data (as soon as we receive them...)
• knowledge base on data organization and migration on a
private wiki
• reusable, data-independent migration tools
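One way a "data-independent" migration tool can be structured is to keep the column mapping as data and pass it to a generic routine. The sketch below is an assumption about that general approach, not the project's actual tools; the source columns, target field names and the function `migrate_rows` are hypothetical and do not correspond to real Chado tables.

```python
import csv
import io


def migrate_rows(source, column_map, converters=None):
    """Yield target-schema dicts from CSV rows using a declarative column map."""
    converters = converters or {}
    for row in csv.DictReader(source):
        target = {}
        for src_col, dst_col in column_map.items():
            raw = row[src_col].strip()
            convert = converters.get(dst_col, str)
            # Empty cells become None instead of being converted.
            target[dst_col] = convert(raw) if raw else None
        yield target


# Hypothetical orchard export; a real source would be a file on disk.
orchard_csv = io.StringIO(
    "Orchard Name,Lat,Lon\n"
    "North Block,46.07,11.12\n"
    "South Block,,\n"
)

# The mapping is plain data, so the same routine serves any data set.
orchard_map = {"Orchard Name": "name", "Lat": "latitude", "Lon": "longitude"}
rows = list(
    migrate_rows(orchard_csv, orchard_map, {"latitude": float, "longitude": float})
)
for row in rows:
    print(row)
```

Migrating a different table then means writing a new mapping rather than new code.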
19. Problems/ToDo
• data are in Chado, but could use some tuning
• Tripal web front-end needs to be configured to actually show
our data
• small data inconsistencies
• integration with statistical modules (?) needs to be designed
from scratch
20. Challenges for 2013
• Genotype data:
– Restricted and public availability
– Output format
– Link to phenotype
• Re-sequencing data:
– What is the best way to make it available to partners
(public), and how to link it to the information in the
database?
21. Action Plan for 2013
WHAT                                             WHO             HOW   DEADLINE
Complete uploading data in apple phenotypic db   -               -     End of March 2013
Receive genotyping information                   WP3, WP4        -     May 2013
Receive chromosomal region (QTL)                 WP3, WP4, WP6   -     End of summer 2013
Build Genome Mining Tool for region of interest  -               -     End of summer 2013
22. Interactions between your WP and the
rest of the project
• Interactions planned with other WPs of the project:
– From your WP
• WP2, 3, 4: web interface for data input, query and download
– To your WP
• WP1 breeder interface questionnaire
• WP2 phenotype specification
• WP3 marker information, pedigree tools
• WP4 phenotype, marker data
• WP5 phenotype
• Interactions planned with the stakeholders of the project
(and what do you expect from them?):
– Breeders
• Feedback on needs for the breeder informatics tools