myEquivalents, aka a new cross-reference service

myEquivalents is a system to manage cross-references between entities that can be identified by pairs composed of a service name (e.g., EBI's ArrayExpress, Wikipedia) and an accession (e.g., E-MEXP-2514, Barack_Obama). For those familiar with the Semantic Web, we plan to support identification of entities via URIs and the owl:sameAs property. For those who already know MIRIAM and identifiers.org, myEquivalents is more general than either, and we plan to support these services in the future.

Published in: Software

  1. myEquivalents, aka Cross Reference Service. Marco Brandizi, EBI, 14 Feb 2013. (Image source: http://stackoverflow.com/questions/13340232/pythagoras-tree-to-windy-tree)
  2. It's not rocket science... (Image source: http://www.chinapage.com/space/moon/orbiter.html)
  3. … yet... (Image source: oh, c'mon! 2001: A Space Odyssey)
  4. Rationale. References: AE, ENA. References: BioSD, ENA. References: BioSD, AE.
  5. Rationale. References: AE, ENA. References: BioSD, AE. References: BioSD, ENA.
  6. So, it's about equivalence relations
  7. Why a Centralised Service
     • Bundle 1: BioSD Samples SAMEA597705; AE Experiments E-AFMX-11 (http://www.ebi.ac.uk/arrayexpress/experiments/E-AFMX-11); AE Data E-AFMX-11 (http://www.ebi.ac.uk/arrayexpress/files/E-AFMX-11); ENA Sequences SRR034107
     • Bundle 2: http://dbpedia.org/resource/Barak_h_obama; http://en.wikipedia.org/wiki/Barack_Obama; http://www.freebase.com/view/en/barack_obama
     • Managing equivalence classes is more compact and more efficient
  8. Why a Centralised Service
     • Simplifies management: URI auto-creation; links updated independently of their consumers, and once only
     • Avoids redundancy: implicit symmetry and transitivity in the bundles; single-point storage and rendering vs one per repository
     • More efficient: a specialised service for this is potentially faster, e.g. sameas.org
     • More features can be added to the basic service: multiple access formats and paradigms (e.g., XML, RDF, SPARQL); MIRIAM integration
  9. The Model (diagram labels)
     • Entity: Service + Accession, e.g. BioSD/Samples SAMEA597705, AE/Experiments E-AFMX-11, AE/Data E-AFMX-11, ENA/Sequences SRR034107
     • Entity Mapping: Bundle (i.e., partition class)
     • Service collection: same accessions, implicit mapping
     • Repositories (BioSD, ENA, AE): each provides service
     • Service Properties: Title, Description, URI Pattern
     • Repository Properties: Title, Description, URL, Managing Organization, Logo URL
  10. API Examples (Java, Mapping)

      public interface EntityMappingManager {
        public void storeMappings ( String ... entityIds );
        public void storeMappingBundle ( String ... entityIds );
        public int deleteMappings ( String ... entityIds );
        public int deleteEntities ( String ... entityIds );
        public EntityMappingSearchResult getMappings ( Boolean wantRawResult, String ... entityIds );
        public EntityMappingSearchResult getMappingsForTarget ( Boolean wantRawResult, String targetServiceName, String entityId );
        public String getMappingsAs ( String outputFormat, Boolean wantRawResult, String ... entityIds );
        public String getMappingsForTargetAs ( String outputFormat, Boolean wantRawResult, String targetServiceName, String entityId );
        public void close ();
      }

      • Multiple access means: Programmatic API; Line Commands; REST Web Service
      • Multiple data exchange formats: Java and Java REST (Jersey used, client available); XML (the same that comes from REST, mapped via JAXB); JSON (future, maybe); RDF (future, more later)
      • Queries via service+accession or URI (in future)
  11. API Examples (Java)
  12. API Examples (Web Service)
  13. Component-based Architecture
      • Components and their topology configured/instantiated via Spring
      • Easy to build features like: Caching; Logging; Layered computations (e.g., add services in the same collection); Integration of 3rd-party systems (e.g., MIRIAM, more later)
  14. Related Work
      • myEquivalents was inspired by this
      • Does pretty much what we do, with a very similar internal model
      • But for URIs only
      • Code not available
      • Only available as SaaS, no binary to deploy
  15. Related Work
      • Pair model for URIs is a standard
      • Equivalence-based model missing
      • Dual identification mechanism missing
  16. Future: RDF, SPARQL, Semantic Web
      • Dereferenceable URIs, with RDF output
      • Keeping support for the accession-based model too
      • SPARQL, with support for both:
        • ?b a mye:Bundle; mye:has-entity ?e1, ?e2, ?e3 (equivalence class model)
        • ?entity1 owl:sameAs ?entity2 (mapping pair model)
      • and for entity containers: _:e1 mye:provided-by [ _:s1 a mye:Service dc:title 'BioSD' ]
      • Adding reasoning over service types could come easily, e.g. sample-service is-a biomaterial-service
      • To be implemented with direct translation from Java objects to SPARQL (not just export), e.g., using ARQ in Jena
      • Support for inference directly in the object model: faster than a generic reasoner
      • Support for SPARQL/UPDATE? Would allow for using an endpoint straight as back-end
      • Support for keyword-based search, as in sameas.org. Requires the addition of attributes (e.g., title, description), nothing available at the…
  17. Related Work
      • It is to manage entities that share accessions, e.g., PubMed and CiteXplore
      • So, not enough for us
      • But would be great to integrate!
  18. Future: MIRIAM and identifiers.org support (diagram labels: Services & Entities; Service Collection)
  19. Future: MIRIAM and identifiers.org support (diagram labels: Service Collection; Services; Entity)
  20. Combining MIRIAM and myEquivalents (diagram labels)
      • Uniprot P62158; http://www.ncbi.nlm.nih.gov/protein/P62158
      • MIR:001000234599080; http://www.ebi.ac.uk/citexplore/citationDetails.do?dataSource=MED&externalId=4599080
      • HubMed 4599080
      • Mappings: stored in myEquivalents, or computed by MIRIAM
      • Resources imported from MIRIAM
  21. Issues: Access Control (on-going)
      • We assume:
        • updates are managed by just a few people, within the same organisation and collaborating team
        • most of the data is publicly readable, except private entities (maybe)
      • This implies a very simple model, where users can have the roles of:
        • reader: can only read public stuff; the only role given to anonymous (i.e., un-authenticated) users
        • editor: can change everything (mappings, service descriptions, etc.)
        • admin: can administrate users and permissions
      • Though simple, it's a good base for managing provenance too
      • Authentication details:
        • all requests contain user + hash(password)
        • they travel via SSL/HTTPS and via POST
        • this makes it unnecessary to have complex mechanisms based on shared secrets (e.g., OAuth)
  22. Issues: Versioning (future?)
      • That's been ignored so far, because we're assuming one version ↔ one accession ↔ one URI and leaving the versioning fun to the repositories
      • Must be addressed later
      • Possible scenario:
        • Entities are identified by means of service + acc + version
        • New version relations are added (has-version, is-prior-version, has-next-version)
        • It is still one URI ↔ one entity at the level of a given version
        • URI pattern contains an additional placeholder for the version
        • It's up to the myEquivalents clients to either omit the version (i.e., the last version is always assumed, even upon version increase) or specify a given version (requires manual version update)
      • Possibly: keep history of all versions
  23. That's all! Thank You! Have a look at the code and the wiki (on-going work!): http://github.com/EBIBioSamples/myequivalents
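
Slides 6 to 9 describe mappings as equivalence classes ("bundles"), where symmetry and transitivity are implicit. The following is only a minimal in-memory sketch of that idea, not the actual myEquivalents code: the class `BundleStoreSketch`, its methods `storeBundle` and `equivalentsOf`, and the `service:accession` ID syntax are assumptions made for illustration, loosely modelled on `storeMappingBundle` and `getMappings` from slide 10.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of bundle-based mappings; not the myEquivalents implementation. */
class BundleStoreSketch {
    // Entity ID -> its bundle, i.e. the set of all entities equivalent to it (itself included)
    private final Map<String, Set<String>> bundleOf = new HashMap<>();

    /** Declares all given entities equivalent, merging any bundles they already belong to. */
    void storeBundle(String... entityIds) {
        Set<String> merged = new TreeSet<>();
        for (String id : entityIds) {
            merged.add(id);
            Set<String> existing = bundleOf.get(id);
            if (existing != null) {
                merged.addAll(existing);
            }
        }
        // Re-point every member to the merged bundle, so lookups work from any member
        for (String id : merged) {
            bundleOf.put(id, merged);
        }
    }

    /** Returns the bundle the entity belongs to, or an empty set if the entity is unknown. */
    Set<String> equivalentsOf(String entityId) {
        Set<String> bundle = bundleOf.get(entityId);
        if (bundle == null) {
            return Collections.emptySet();
        }
        return Collections.unmodifiableSet(bundle);
    }

    public static void main(String[] args) {
        BundleStoreSketch store = new BundleStoreSketch();
        // Assumed ID syntax: "<service>:<accession>", with the services named as on slide 9
        store.storeBundle("BioSD/Samples:SAMEA597705", "AE/Experiments:E-AFMX-11");
        store.storeBundle("AE/Experiments:E-AFMX-11", "ENA/Sequences:SRR034107");
        // The ENA entity was never linked to the BioSD one directly, yet both are in one bundle
        System.out.println(store.equivalentsOf("ENA/Sequences:SRR034107"));
    }
}
```

Storing one set per bundle, rather than one record per pair, is what makes the representation compact: a bundle of n entities needs n entries instead of n(n-1)/2 pairs.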
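
Slides 8 and 9 mention URI auto-creation from a per-service URI pattern. As a rough illustration only, the sketch below expands such a pattern into a URI. The `$id` placeholder syntax and the `UriPatternSketch.buildUri` helper are assumptions, since the slides do not show the real pattern format; the example URL is the ArrayExpress one from slide 7.

```java
/** Hypothetical sketch of URI auto-creation from a service's URI pattern. */
class UriPatternSketch {
    // Assumed placeholder syntax; the actual myEquivalents syntax is not shown in the slides
    static final String ACCESSION_PLACEHOLDER = "$id";

    /** Replaces the accession placeholder in the pattern with the given accession. */
    static String buildUri(String uriPattern, String accession) {
        // String.replace does literal (non-regex) replacement, so "$" needs no escaping
        return uriPattern.replace(ACCESSION_PLACEHOLDER, accession);
    }

    public static void main(String[] args) {
        String pattern = "http://www.ebi.ac.uk/arrayexpress/experiments/$id";
        System.out.println(buildUri(pattern, "E-AFMX-11"));
    }
}
```

With this approach, a change in a repository's URL scheme would only require updating one pattern, rather than every stored link.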
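
The role model from slide 21 (reader, editor, admin, with anonymous users treated as readers) could be checked along these lines. This is a sketch under stated assumptions, not the project's access-control code: the names `AccessControlSketch`, `Role`, `Action` and `isAllowed` are hypothetical, and letting admins also edit is an assumption the slides do not confirm.

```java
/** Hypothetical sketch of the three-role access model described on slide 21. */
class AccessControlSketch {
    enum Role { READER, EDITOR, ADMIN }

    enum Action { READ_PUBLIC, EDIT_MAPPINGS, MANAGE_USERS }

    /** Returns true if the role may perform the action; a null role means an anonymous user. */
    static boolean isAllowed(Role role, Action action) {
        // Anonymous (un-authenticated) users get the reader role
        Role effective = (role == null) ? Role.READER : role;
        switch (action) {
            case READ_PUBLIC:
                return true; // everyone can read public data
            case EDIT_MAPPINGS:
                // Assumption: admins can do whatever editors can
                return effective == Role.EDITOR || effective == Role.ADMIN;
            case MANAGE_USERS:
                return effective == Role.ADMIN;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAllowed(null, Action.READ_PUBLIC));
        System.out.println(isAllowed(Role.READER, Action.EDIT_MAPPINGS));
        System.out.println(isAllowed(Role.ADMIN, Action.MANAGE_USERS));
    }
}
```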
