ELIXIR. Technical Coordinator
1. ELIXIR
Technical Coordinator
Rafael C Jimenez
2013 - 09 - 24
“Before, no one cared about standards. Now, people are very
aware of their importance, so everyone is developing their own”
ELIXIR WP12 report - Preparatory phase.
10. More data coverage
Consortiums for data exchange
Nucleotide sequences
INSDC
ENA
DDBJ
NCBI
Molecular interactions
IMEx
IntAct
InnateDB
DIP
MINT
…
Protein identifications
ProteomeXchange
PRIDE
PeptideAtlas
GPMDB
Tranche
…
Replication Federation Centralization
Proteomics data exchange and storage: the need for common
standards and public repositories. Jiménez RC, Vizcaíno JA.
Better data management
Less redundancy
Agreement on data standards and data integration
Less inconsistency
11. Schema
Interfaces
Guidelines
Ontologies Format
Identifiers
Data
Definition Representation Access
Federation
Publish
Find
Registry
DAS Clients
Clients
PSICQUIC
sources
Services
query
format
DAS
PSICQUIC
Service Oriented Architecture
Standards
PSICQUIC and PSISCORE: accessing and scoring molecular interactions.
Aranda B, Jimenez RC, et al.
Integrating biological data--the Distributed Annotation System.
Jenkinson AM, Jimenez RC, et al.
Teaching the fundamentals of biological data integration using classroom games.
Schneider MV, Jimenez RC.
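The publish/find pattern behind this slide's service oriented architecture can be illustrated with a toy in-memory registry. This is only a minimal sketch and not the DAS or PSICQUIC registry API: the `Service` and `Registry` classes, the source names, and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    """A data source exposing a standard interface (hypothetical model)."""
    name: str
    standard: str  # e.g. "DAS" or "PSICQUIC"
    url: str       # placeholder URL, not a real endpoint


@dataclass
class Registry:
    """Toy registry: sources publish themselves, clients find them."""
    _services: list = field(default_factory=list)

    def publish(self, service: Service) -> None:
        # A source announces itself to the registry.
        self._services.append(service)

    def find(self, standard: str) -> list:
        # A client asks for every source that speaks a given standard.
        return [s for s in self._services if s.standard == standard]


registry = Registry()
registry.publish(Service("source-a", "PSICQUIC", "https://example.org/a"))
registry.publish(Service("source-b", "PSICQUIC", "https://example.org/b"))
registry.publish(Service("source-c", "DAS", "https://example.org/c"))

# A client discovers sources by standard instead of hard-coding them.
psicquic_sources = [s.name for s in registry.find("PSICQUIC")]
print(psicquic_sources)
```

Because clients look sources up by standard, a new source only has to publish itself to become visible to every existing client.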
12. Components Web applications
reuse
share
BioJS
BioJS: an open source JavaScript framework for biological data visualization. Gómez J, García
LJ, Salazar GA, Villaveces J, Gore S, García A, Martín MJ, Launay G, Alcántara R, Del-Toro N,
Dumousseau M, Orchard S, Velankar S, Hermjakob H, Zong C, Ping P, Corpas M, Jiménez RC.
14. Unifying components
13.12.2018
• Represent the same type of information in different
projects using the same graphical component
1 2 3
Graphical representation
Developer
1 2 3 *
Type of representation
Website
Implementation a b c d
15. EMBL-EBI integration projects
EBI search / S4
• Summaries from
different sources
• Links to the original data
• “Just enough” just “in
time”
Biosapiens
• Comparable annotations
from different sources
• Common controlled
vocabulary
• Links to the original data
16. Improving Links Between the Human Protein Atlas
(HPA) and EMBL-EBI’s resources
DAS
S4
HP-WS
BioJS
Summary data Details
Standards
20. iAnn
Curation Centralization Distribution Integration
Registry
Input
form
Web
sites
Visualization
Modules
Web
services
Editor Web services Viewer
Open source community-driven platform
for dissemination of life science events
iAnn: an event sharing platform for the life sciences.
Jimenez RC, Albar JP, Bhak J, Blatter MC, Blicher T, Brazas MD, Brooksbank C, Budd A, De Las Rivas J, Dreyer J, van
Driel MA, Dunn MJ, Fernandes PL, van Gelder CW, Hermjakob H, Ioannidis V, Judge DP, Kahlem P, Korpelainen E,
Kraus HJ, Loveland J, Mayer C, McDowall J, Moran F, Mulder N, Nyronen T, Rother K, Salazar GA, Schneider R, Via
A, Villaveces JM, Yu P, Schneider MV, Attwood TK, Corpas M.
21. Format
Ontology
Minimum
Information
guideline
Standards to exchange announcements among bioinformatics societies
More data coverage, less redundancy, less inconsistency, better data management
Integration
Access
Exchange
Sharing
Portability
Interoperability
Annotation
Comparison
Verification
Reusability
Representation
22. Conclusions
• Be realistic about what you can achieve
• Keep it simple
• Open the project to external contributions
• Let the community decide
• Purpose-driven requirements
• Fail early
23.
24. Dasty3, a WEB framework for DAS
Integration of Cardiac Proteome
Biology and Medicine
iAnn: an event sharing
platform for the life sciences
A new reference
implementation of the PSICQUIC
web service
BioJS: an open source JavaScript framework
for biological data visualization
Best practices in bioinformatics
training for life scientists
Bioinformatics workflows and web
services for experimentalists
Proteomics data exchange and storage:
common standards and public repositories.
Teaching the fundamentals of biological
data integration using classroom games.
The IntAct molecular
interaction database in 2012
myKaryoView: a light-weight client for
visualization of genomic data
PSICQUIC and PSISCORE: accessing and
scoring molecular interactions.
MyDas, an extensible Java DAS server
Bioinformatics Training Network (BTN): a
community resource for bioinformatics trainers.
DAS writeback: a collaborative
annotation system
easyDAS: automatic
creation of DAS servers
The Protein Feature Ontology: a tool for the
unification of protein feature annotations
Dasty2, an Ajax protein DAS client
OntoDas - a tool for facilitating the
construction of complex queries to the GO.
Integrating biological data--the
Distributed Annotation System
2013
2009
2010
2011
2012
UPV
CIPF
EMBL-EBI
Publications
26. Quotes
Good project management cannot guarantee success, but poor management on significant
projects always leads to failure. A.A.Puntambekar
The biggest single problem that afflicts software developing is that of underestimating resources
required for a project. Mike Wooldridge
Before, no one cared about standards. Now, people are very aware of their importance, so
everyone is developing their own. ELIXIR WP12. Preparatory phase report
Considerations for transitioning to open development (NCI):
• Keep it simple. Make barriers to entry as low as possible, and reuse available resources.
• Let the community decide. For an open-source development ecosystem to be successful, it
needs to be truly owned
• Purpose-driven requirements. The use of standards and any concept of “certification” should
be established based on real needs and use cases.
• Don’t over-think the process, and stop waiting to do something - fail early. Don’t wait for the
code or process to be perfect. Put the applications out there and let them fail or succeed, as
determined by the community.
28. Data integration problems
Many data resources
• Many to maintain
• Databases change
• New appearing
• Some disappearing*
• Not easy to find them
Different query interfaces
data integration?
Variable results
• Formats
• Schemas
• Controlled vocabularies
• Minimum information guidelines
* Merali Z, et al. Databases in peril. Nature 2005.
• Inconsistency
• Mapping problems
• Records with not enough information
• Redundancy
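Several of the problems listed on this slide (redundancy, inconsistency, mapping problems) show up as soon as two sources are merged. The following is a minimal sketch under stated assumptions: the records, the identifiers (`X1`, `B1`, ...), the field names, and the `integrate` helper are all invented for illustration and do not come from any real database.

```python
# Hypothetical records from two sources that partly describe the same entries.
source_a = [
    {"id": "X1", "name": "Protein X", "organism": "human"},
    {"id": "X2", "name": "Protein Y", "organism": "mouse"},
]
source_b = [
    {"acc": "B1", "name": "Protein X", "organism": "Homo sapiens"},
    {"acc": "B2", "name": "Protein Z", "organism": None},
]

# Assumed identifier mapping from source B accessions to source A identifiers.
b_to_a = {"B1": "X1"}


def integrate(a_records, b_records, mapping):
    """Merge B into A by mapped identifier and report the problems found."""
    merged = {r["id"]: {"name": r["name"], "organism": r["organism"]}
              for r in a_records}
    issues = []
    for rec in b_records:
        key = mapping.get(rec["acc"])
        if key is None:
            # No known equivalent identifier: a mapping problem.
            issues.append(("mapping problem", rec["acc"]))
            continue
        target = merged.setdefault(key, {})
        for fld in ("name", "organism"):
            old, new = target.get(fld), rec[fld]
            if new is None:
                continue  # record without enough information for this field
            if old is None:
                target[fld] = new
            elif old != new:
                # Same entry, different values: an inconsistency.
                issues.append(("inconsistency", key, fld, old, new))
    return merged, issues


merged, issues = integrate(source_a, source_b, b_to_a)
print(len(merged), issues)
```

Here the redundant entry is collapsed into one record, while the differing organism labels and the unmapped accession are reported rather than silently resolved; a shared controlled vocabulary would remove the first of those two issues.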
30. Popular data integration approaches
Data centralization
Data warehousing
Dataset integration
Hyperlinks
Federated databases
View integration
...
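Two of the approaches named on this slide, data warehousing and federated databases, differ mainly in when the data is fetched. The sketch below is a toy contrast, not a description of any real system: the `Source` class, the helper functions, and the records are hypothetical.

```python
class Source:
    """A hypothetical data source holding records keyed by identifier."""

    def __init__(self, name, records):
        self.name = name
        self.records = dict(records)

    def query(self, key):
        return self.records.get(key)


def build_warehouse(sources):
    """Warehousing: copy every record into one local store up front."""
    warehouse = {}
    for source in sources:
        warehouse.update(source.records)
    return warehouse


def federated_query(sources, key):
    """Federation: ask each source at query time and collect the hits."""
    hits = []
    for source in sources:
        record = source.query(key)
        if record is not None:
            hits.append((source.name, record))
    return hits


a = Source("a", {"id1": "record one"})
b = Source("b", {"id2": "record two"})
warehouse = build_warehouse([a, b])

# Source "a" gains a record after the warehouse was built.
a.records["id3"] = "record three"

stale = warehouse.get("id3")               # the copy has not been reloaded
live = federated_query([a, b], "id3")      # the live query sees the update
print(stale, live)
```

The warehouse answers from a single local copy but goes stale until it is rebuilt, whereas the federated query always reflects the current state of each source at the cost of contacting all of them.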
31. Homogeneous vs Heterogeneous data integration
Heterogeneous data sources
Same data types
[Diagram: sources A, B, C; steps 1, 2; label "leverage"]
Teaching the fundamentals of biological data integration
using classroom games. Schneider MV, Jimenez RC.
33. Expertise to do this job
• Software project management
• Collaborative software development
• Bioinformatics in multidisciplinary fields
• Biological science
• System administration
• Writing
• Interpersonal skills
34. Achievements (last 4 years)
• Definition and implementation of standards like DAS and PSI
• Start the BioJS community open source project
• Participate as a Researcher Co-Investigator in a BBSRC grant
proposal
• Publish 17 publications (6 as first or last author)
• Write 2 book chapters
• Report more than 10 deliverables
• Supervise 5 students (3 of whom are currently collaborating with
our team)
• Participate as speaker/trainer in almost 30 bioinformatics events
• Co-organise 6 workshops about web services and visualisation
• Participate in several projects including Biosapiens, ENFIN, COPa,
IntAct, IMEx, PRIDE, Affinomics, LARK2-LEAPS, ELIXIR,
BioMedBridges and AgedBrainSysBio