2. Table of contents
• Data integration
– Data integration approaches
• Warehousing vs. Federation
• Dataset integration
– Query interfaces
• Web Services
– SOAP vs. REST
• PSICQUIC
– Registry
– Services
• REST queries
• MIQL
– PSICQUIC view
– PSISCORE
• Registry
• Client
• Workflows
– myGrid tools
– PSICQUIC workflows
• myExperiment
• Taverna
3. Popular data integration approaches
• Data centralization
• Data warehousing
• Dataset integration
• Hyperlinks
• Federated databases
• View integration
• ...
4. Warehousing vs. Federation (13.12.2018)
[Figure: data warehousing and federated databases shown side by side. Legend: Database, Query Interface (QI), User; i = integration, S = standardization.]
7. Warehousing vs. Federation
• Data warehousing
– Pull data from several resources into one resource.
– Main features:
• Data centralization
• High maintenance
• Data out of date
• Modifications (schema, format, content, …)
• Federated databases
– Data resides in different sources that share a common standard protocol and query system.
– Main features:
• Fresh data (original)
• Data redundancy
• Data inconsistency
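The trade-off between the two approaches can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the two "sources" are hypothetical in-memory dictionaries with made-up identifiers, not real databases. It shows why a warehouse can go out of date while a federated query always sees the original data.

```python
from typing import Dict, List

# Hypothetical sources (made-up identifiers): protein ID -> interaction partners.
source_a: Dict[str, List[str]] = {"P1": ["P2"]}
source_b: Dict[str, List[str]] = {"P1": ["P3"]}


def build_warehouse(sources: List[Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Warehousing: copy all data into one central resource at load time."""
    warehouse: Dict[str, List[str]] = {}
    for source in sources:
        for key, partners in source.items():
            # extend() copies the items, so later source updates are not seen.
            warehouse.setdefault(key, []).extend(partners)
    return warehouse


def federated_query(sources: List[Dict[str, List[str]]], key: str) -> List[str]:
    """Federation: ask every source at query time and merge the answers."""
    results: List[str] = []
    for source in sources:
        results.extend(source.get(key, []))
    return results


warehouse = build_warehouse([source_a, source_b])

# One source is updated after the warehouse was built.
source_b["P1"].append("P4")

stale = warehouse["P1"]                                # copy made at load time
fresh = federated_query([source_a, source_b], "P1")    # read from the sources now
```

Here `stale` no longer contains the new partner, while `fresh` does. Keeping the warehouse current would mean reloading it, which is the maintenance cost listed above.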
8. Dataset integration
[Figure labels: Curators / Annotators, Original data sources, Third party implementations, Users, QI. Legend: i = integration, S = standardization.]
Examples:
• Your own script
• Workflows
11. Web Services
• It is a piece of software that runs remotely
• It is accessible over a network (e.g. Internet)
• It is meant for machine to machine communication
• Independent from programming languages
• It can be operated following specific rules (i.e. protocol)
• There are 2 main protocols in use:
– REST
– SOAP
This introduction is intended for a non-technical audience; technical concepts are purposely simplified.
12. Web Services
Client and web server:
1. Client: "How should I invoke you?"
2. Web server: documentation, which describes the methods and variables to query the service
3. Client: make a request
4. Web server: results
13. SOAP vs. REST
REST
• Geared to simplicity
• A browser can be a client
• A request can only be as complex as a URL allows, e.g.
http://www.ebi.ac.uk/Tools/webservices/psicquic/intact/webservices/current/search/query/P99999?format=xml25
SOAP
• Based on standards
• Accessed only by software
• Allows description of complex data structures in the request and response
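Because a REST request is just a URL, a client can assemble it with a few lines of code. The sketch below rebuilds the example URL from this slide in Python. The function name `build_query_url` is made up for illustration, and the service may no longer be reachable at this address.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the example on this slide.
INTACT_BASE = (
    "http://www.ebi.ac.uk/Tools/webservices/psicquic/"
    "intact/webservices/current/search/"
)


def build_query_url(base: str, query: str, result_format: str = "xml25") -> str:
    """Assemble a REST request: the query goes in the path, the format in a parameter."""
    return f"{base}query/{quote(query)}?{urlencode({'format': result_format})}"


url = build_query_url(INTACT_BASE, "P99999")

# Sending the request would then be one more call (not executed here):
#   from urllib.request import urlopen
#   with urlopen(url) as response:
#       body = response.read().decode("utf-8")
```

Pasting the resulting `url` into a browser is equivalent, which is the sense in which a browser can act as a REST client.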
14. PSICQUIC
• Proteomics Standards Initiative Common QUery InterfaCe.
• Community effort to standardise how data is accessed and retrieved from molecular interaction databases.
• Widely implemented by independent interaction data resources.
• Based on the PSI standard formats (PSI-MI XML and MITAB)
• Not limited to protein-protein interactions; also covers e.g.
– Drug-target interactions
– Simplified pathway data
• A registry listing resources implementing PSICQUIC
• Documentation: http://psicquic.googlecode.com
18. PSICQUIC Registry
• It contains a list of the PSICQUIC services from different providers.
• It is a web service itself, and it can be accessed remotely using REST.
• Information can be found about the services, such as the URLs to use, number of interactions provided, versioning, tags, etc.
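A client could ask the registry for the list of services and then read the URL of each one. The Python sketch below rests on several assumptions that should be checked against the PSICQUIC documentation: the registry address, the `action` and `format` parameters, and a plain-text reply with one `name=url` pair per line. The sample reply is made up and uses example.org addresses.

```python
from typing import Dict
from urllib.parse import urlencode

# Assumed registry address; verify against the PSICQUIC documentation.
REGISTRY_URL = "http://www.ebi.ac.uk/Tools/webservices/psicquic/registry/registry"


def registry_request_url(action: str = "ACTIVE", result_format: str = "txt") -> str:
    """Build the REST request for the registry (parameter names are assumptions)."""
    return f"{REGISTRY_URL}?{urlencode({'action': action, 'format': result_format})}"


def parse_registry_text(text: str) -> Dict[str, str]:
    """Parse an assumed 'name=url' per-line reply into {service name: REST URL}."""
    services: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, rest_url = line.split("=", 1)
        services[name.strip()] = rest_url.strip()
    return services


# Made-up reply for illustration only; not real registry output.
sample_reply = (
    "ServiceA=http://example.org/psicquic-a/webservices/current/search/\n"
    "ServiceB=http://example.org/psicquic-b/webservices/current/search/\n"
)

request_url = registry_request_url()
services = parse_registry_text(sample_reply)
```

With the real reply in place of `sample_reply`, `services` would give the base URL to use for each provider.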
21. PSICQUIC services
• PSICQUIC services are Web Services
– SOAP
– REST
• The same methods to query several services
• Results from different sources following the same PSI-MI standards
• Results in two standard formats: PSI-MI XML or PSI-MI TAB.
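Since every service returns the same format, one parser is enough to combine results from different providers. The Python sketch below reads tab-separated PSI-MI TAB 2.5 rows, which have 15 columns. The short column keys are our own naming, and the two rows are made-up placeholders with "-" for empty fields, not real interaction data.

```python
from typing import Dict, List

# The 15 columns of PSI-MI TAB 2.5, in order (key names are our own shorthand).
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_mitab25(text: str) -> List[Dict[str, str]]:
    """Turn tab-separated MITAB 2.5 lines into one dictionary per interaction."""
    rows: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        rows.append(dict(zip(MITAB25_COLUMNS, line.split("\t"))))
    return rows


def made_up_row(id_a: str, id_b: str, source_db: str) -> str:
    """Build a placeholder row; every other field is '-' (empty)."""
    fields = ["-"] * len(MITAB25_COLUMNS)
    fields[0], fields[1], fields[12] = id_a, id_b, source_db
    return "\t".join(fields)


# Pretend these came from two different PSICQUIC services.
from_provider_1 = made_up_row("uniprotkb:P11111", "uniprotkb:P22222", "provider-1")
from_provider_2 = made_up_row("uniprotkb:P11111", "uniprotkb:P33333", "provider-2")

rows = parse_mitab25(from_provider_1 + "\n" + from_provider_2)
partners = sorted(row["id_b"] for row in rows)
```

The same `parse_mitab25` call handles both providers, so merging their results needs no per-source code.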
26. PSICQUIC view
• Simple and complex queries
• Link back to the original source for more details
• Clustering of query results across providers
• Graphical visualization of the interaction network
http://www.ebi.ac.uk/Tools/webservices/psicquic/view/
34. Introduction to Web Services at EBI
Workflow
• Workflow
– Sequence of tasks that produces a result of observable value
• Workflow management system
– Computer system to compose and execute workflows.
• Workflow components
– Input
– Service
– Output
– Shims
[Figure: Service A, Service B]
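The four components can be shown in a toy Python workflow. Everything here is hypothetical: `service_a` and `service_b` stand in for remote services and return made-up values, and the shim is assumed to be a small adapter that converts the output of one service into the input format the next one expects.

```python
from typing import Dict, List


def service_a(protein_id: str) -> Dict[str, object]:
    """Hypothetical Service A: returns a record with interaction partners."""
    return {"id": protein_id, "partners": ["P2", "P3"]}


def shim(record: Dict[str, object]) -> str:
    """Shim: adapt Service A's record to the comma-separated string Service B expects."""
    partners: List[str] = list(record["partners"])  # type: ignore[arg-type]
    return ",".join(partners)


def service_b(id_list: str) -> int:
    """Hypothetical Service B: counts the identifiers in a comma-separated list."""
    return len([item for item in id_list.split(",") if item])


def run_workflow(workflow_input: str) -> int:
    """Input -> Service A -> shim -> Service B -> output."""
    record = service_a(workflow_input)
    adapted = shim(record)
    return service_b(adapted)


output = run_workflow("P1")
```

A workflow management system takes over the role of `run_workflow`: it lets the user compose the steps and then executes them in order.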
35. myGrid tools
• Create and run workflows
• Share, discover and reuse workflows
• Discover and reuse services