The Sanger Mouse Resources Portal - A Testbed for Collaborative Data Integration

A brief overview of how we put the new Sanger mouse portal together.

This presentation was given at the International Workshop for Portals in Life Sciences (IWPLS) 14th September 2009, Edinburgh.

Transcript

  • 1. Sanger Mouse Resources Portal A Testbed for Collaborative Data Integration Darren Oakley, Vivek Iyer, Bill Skarnes
  • 2. Making a Collaborative Data Portal...
  • 3. ‘Borg’ Approach • Single group becomes sole owner/curator of portal and its data • Other groups feed their data into portal group
  • 4. burp
  • 5. Why This Works • Clearly defined centre • It provides central curation for all data
  • 6. Mouse Informatics • Genes • Mutants (ES Cells, Mice) • Phenotypes • In mouse informatics, the traditional Borg is MGI - this has worked nicely for many years: http://informatics.jax.org
  • 7. Mouse Informatics • Times are changing... • Other informatics groups are providing high volume data and want in on the portal game
  • 8. “Hand over your data, prepare to be assimilated” “No, YOU hand over your data and prepare to be assimilated” “Ahem, both of you, prepare to be assimilated!”
  • 9. “Hand over your data, prepare to be assimilated” “No, YOU hand over your data and prepare to be assimilated” “Ahem, both of you, prepare to be assimilated!” … which of you is the real Borg?
  • 10. ‘Federation’ Approach • Each group hosts its own data and exposes it via defined services • Make a ‘clever’ portal that pulls all of these resources together • No single group is totally in charge
  • 11. The Sanger Mouse Resources Portal http://www.sanger.ac.uk/mouseportal (Our Attempt at the Federation Approach...)
  • 12. Distributed Data • Currently 5 distinct, but related sets of mouse data: • Gene Information • Phenotyping • Mutant Mouse Breeding • Mutant ES Cell / Vector Production • Other DNA Resources
  • 13. Screenshot Tour
  • 14. Technologies Search Engine Portal Interface Data Services
  • 15. index searchable terms
  • 16. index searchable terms
  • 17. MartSearch / Portal • index searchable terms
  • 18. MartSearch / Portal • send user's search term to Solr • index searchable terms
  • 19. MartSearch / Portal • send user's search term to Solr • Solr returns groups of terms to query Biomarts with • index searchable terms
  • 20. MartSearch / Portal • send user's search term to Solr • Solr returns groups of terms to query Biomarts with • send asynchronous requests to each of the Biomarts for the data the user is interested in • index searchable terms
  • 21. User searches for ‘Cbx7’
  • 22. User searches for ‘Cbx7’ • Search for ‘Cbx7’
  • 23. User searches for ‘Cbx7’ • Search for ‘Cbx7’ • JSON data containing information on what to search each Biomart by...
  • 24. User searches for ‘Cbx7’ • Search for ‘Cbx7’ • JSON data containing information on what to search each Biomart by... • Search using query parameters defined by Solr response
  • 25. User searches for ‘Cbx7’ • Search for ‘Cbx7’ • JSON data containing information on what to search each Biomart by... • Search using query parameters defined by Solr response • Render search results using templates
  • 26. Extending The Portal • Put new data into a Biomart • Write JSON config file for MartSearch (defining filters to index and use) • Rebuild the index
  • 27. Advantages • Easily extensible • Data responsibility shared
  • 28. Disadvantages • Hard to avoid redundancy (sometimes needed for data linking) • Un-curated: each group can curate its own data, but there is no curation as a whole
  • 29. Disclaimer • Windows users... • If you use IE - it will eat your browser • Use Firefox/Chrome/Safari/Opera for a more pleasant internet experience • We are working on it - IE 8 gives an ok experience...
  • 30. Acknowledgments • Funding: I-DCC grant (EU FP7) • Coordination of informatic resources from high-throughput mouse ES cell mutagenesis programs • Wellcome Trust Sanger Institute • T87 - ES Cell Mutagenesis • MIG - Mouse Informatics Group
  • 31. http://www.sanger.ac.uk/mouseportal http://github.com/dazoakley/martsearch do2@sanger.ac.uk dazoakley
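
The search flow on slides 17 to 26 (look the term up in the index, fan out asynchronous queries to each Biomart, render the results) can be sketched roughly as follows. This is a minimal illustration in Python, not the MartSearch code itself: the config layout, the mart names, the field names, and the in-memory `FAKE_INDEX` / `FAKE_MARTS` stand-ins for Solr and the Biomarts are all assumptions made for the example, and the records are placeholders rather than real portal data.

```python
import asyncio

# Hypothetical per-mart config: which indexed field drives which mart filter.
# (The slides only say a JSON config defines the "filters to index and use".)
MART_CONFIG = {
    "gene_info": {"index_field": "marker_symbol", "filter": "marker_symbol"},
    "phenotyping": {"index_field": "colony_prefix", "filter": "colony_prefix"},
}

# Stand-in for the Solr index: search term -> groups of terms per indexed field.
FAKE_INDEX = {
    "cbx7": {"marker_symbol": ["Cbx7"], "colony_prefix": ["EXAMPLE-1"]},
}

# Stand-in for the marts, keyed by (mart name, filter name, filter value).
FAKE_MARTS = {
    ("gene_info", "marker_symbol", "Cbx7"): [
        {"symbol": "Cbx7", "note": "placeholder gene record"},
    ],
    ("phenotyping", "colony_prefix", "EXAMPLE-1"): [
        {"colony": "EXAMPLE-1", "note": "placeholder phenotype record"},
    ],
}


def search_index(term):
    """Look the user's term up in the index (a real portal would query Solr)."""
    return FAKE_INDEX.get(term.lower(), {})


async def query_mart(name, filter_name, values):
    """Query one mart (a real portal would send an HTTP request here)."""
    await asyncio.sleep(0)  # yield control, as a network call would
    rows = []
    for value in values:
        rows.extend(FAKE_MARTS.get((name, filter_name, value), []))
    return name, rows


async def search(term):
    """Fan out one concurrent query per configured mart that has terms to search by."""
    grouped = search_index(term)
    tasks = [
        query_mart(name, cfg["filter"], grouped[cfg["index_field"]])
        for name, cfg in MART_CONFIG.items()
        if cfg["index_field"] in grouped
    ]
    results = await asyncio.gather(*tasks)  # results keep the order of `tasks`
    return dict(results)


def render(results):
    """Very small stand-in for the template rendering step."""
    return "\n".join(f"{name}: {len(rows)} result(s)" for name, rows in results.items())


if __name__ == "__main__":
    print(render(asyncio.run(search("Cbx7"))))
```

Under these assumptions, adding a new data source mirrors slide 26: load the data into a mart, add one entry to the config, and rebuild the index so the new field's terms become searchable.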