Outlining some ideas... - I-DCC Kick Off Meeting


This was a presentation I gave at the I-DCC (http://www.i-dcc.org) project kick-off meeting in April 2009. The gist of the talk was how we could go about building a collaborative data portal (one of the goals of the project), along with a look at some early prototype work that I had done.


  1. Outlining some ideas... Darren Oakley, WTSI (do2@sanger.ac.uk)
  2. Ideas for what? WP4: Web Portal. How we can all work together.
  3. WP4: Web Portal
  4. WP4 objectives: Create a site to display current repository information. Create DAS tracks to display this information in its genomic context. Create a Biomart. The Biomart will serve DAS tracks, provide query web services, and link to other Biomarts (including EnsMart), greatly enhancing the search capability and future utility of the repository.
  5. The idea... De-centralize the data. Everyone who wants in on the portal: use Biomart! Standardized web services and DAS out of the box. This makes the data open to all. We promise not to take over the world.
  6. The idea... Two interfaces. Damian: new MartView interface (advanced search). Us: Google-like search (simple search, “MartSearch”).
  7. The idea... Turn the portal into a Biomart mashup! “In web development, a mashup is a Web application that combines data from one or more sources into a single integrated tool. The term Mashup implies easy, fast integration, frequently done by access to open APIs and data sources to produce results that were not the original reason for producing the raw source data” - Wikipedia
  8. Implementation: 100% JavaScript-driven user interface. The user goes to the portal and enters a search term; this gets fired against a cloud of Biomarts and returns a coherent response. No complex controller logic (it shouldn’t need any).
  9. JavaScript?!? Aaargh!! The old days: browser incompatibilities, clunky performance. Now: JavaScript is fast (Chrome, Firefox 3.1, Safari 4, IE 8), and libraries take care of the cross-browser issues.
  10. Obligatory architecture drawings
  11. Plan A (diagram): HTTP request → MartSearch → Martservice XML query → Biomart-based federation.
  12. Plan A, problems (same diagram): You can only federate across 2 marts. Search times can vary greatly with federation.
  13. Plan B (diagram): HTTP request → MartSearch → Martservice XML query to each mart; perform federation on the fly.
  14. Plan B, problems (same diagram): Searching on more than one attribute requires many XML requests per mart. No way to page results. No way of doing OR queries. No way of doing loose text queries.
  15. Plan C (diagram): HTTP request → MartSearch. Step 0: index the searchable fields from the Biomarts. Step 1: send the query to a Lucene-based search index and retrieve a paged list of genes and linking IDs. Step 2: Martservice XML query to each mart.
  16. Plan C, pros and cons (same diagram). Pros: FAST search results. Can do loose text and OR queries. Pagination. Solr takes care of the federation for you. Cons: One more software stack to accommodate. Need to rebuild the index after a mart rebuild.
  17. Demo: http://www.i-dcc.org/dev/martsearch/
  18. Home
  19. Search
  20. Refined searches
  21. Fast, flexible searching. Customizable: add and remove data sources from the display; restrict the data coming back from a source. Extensible: adding in new data sources should be easy; custom templates for every data source. Open: anyone can access the data and index (via services); anyone can get the code.
  22. How it works... Apache Solr (http://lucene.apache.org/solr): enterprise-grade search server built upon Lucene; web-service driven; represents each search object as a document.
  23. Document XML
  24. How it works... jQuery (http://jquery.com), jQuery UI (http://jqueryui.com), EJS (http://embeddedjs.com), ActiveRecord.js (http://activerecordjs.org), Jamal (http://jamal-mvc.com)
  25. Moving forward... Make (and/or integrate) more marts: MGI, Komp-DCC, Eurexpress, GXD, EuroPhenome. Portal branding, design, colour, layout. How to represent the data: dictated by the type of user... Who are our users and what do they want from us?!?!?
  26. Get the code! http://github.com/dazoakley/martsearch/
  27. Working together
  28. Typical scenario. Each group says: “I’ll take this task - will send you the results when it’s ready.” If we’re (very) lucky, we get something sort of coherent in the end.
  29. We can be better than this!
  30. What we should do... Open source code on a public repository: Github, Google Code, Sourceforge, or even one of our own, as long as it’s public. Shared bug tracking / support and wiki: Github (wiki) + Lighthouse (bug tracking), Google Code / Sourceforge, or host an instance of Redmine or Trac.
  31. Get the code! http://github.com/dazoakley/martsearch/
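
The Plan C flow from slides 15 and 16 (query the search index first, then send one Martservice XML query to each mart using the linking IDs) might look roughly like the sketch below. This is not the MartSearch code: the Solr endpoint, the mart URL, and the mart description fields (`dataset`, `joinFilter`, `indexField`, `attributes`) are all hypothetical names chosen for illustration. The functions only build the requests, so the sketch runs without a Solr or Biomart server.

```javascript
// Sketch of the Plan C flow. Every endpoint, dataset name and field name
// here is a hypothetical placeholder, not taken from the MartSearch code.

const SOLR_SELECT_URL = "http://localhost:8983/solr/select"; // assumed endpoint

// Step 1: build the URL for a paged query against the Solr index.
// q, start, rows and wt are standard Solr query parameters.
function buildSolrUrl(term, page, pageSize) {
  const params = new URLSearchParams({
    q: term,
    start: String(page * pageSize),
    rows: String(pageSize),
    wt: "json",
  });
  return SOLR_SELECT_URL + "?" + params.toString();
}

// Escape a value so it is safe inside a double-quoted XML attribute.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Step 2: build the Martservice query XML that asks one mart for the
// rows matching the linking IDs returned by the index.
function buildMartQuery(mart, ids) {
  const attributes = mart.attributes
    .map((name) => `<Attribute name="${escapeXml(name)}"/>`)
    .join("");
  return (
    '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' +
    '<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="1">' +
    `<Dataset name="${escapeXml(mart.dataset)}" interface="default">` +
    `<Filter name="${escapeXml(mart.joinFilter)}" value="${escapeXml(ids.join(","))}"/>` +
    attributes +
    "</Dataset></Query>"
  );
}

// Produce one request description per mart. Marts for which the index
// returned no linking IDs are skipped.
function planMartRequests(docs, marts) {
  return marts
    .map((mart) => ({
      mart,
      ids: docs
        .map((doc) => doc[mart.indexField])
        .filter((id) => id !== undefined && id !== null),
    }))
    .filter(({ ids }) => ids.length > 0)
    .map(({ mart, ids }) => ({
      url: mart.url,
      body: "query=" + encodeURIComponent(buildMartQuery(mart, ids)),
    }));
}
```

Each entry returned by `planMartRequests` could then be POSTed from the browser (for example with jQuery) and the response rendered with a per-source template. Keeping request building separate from the network calls is a design choice of this sketch, made so that it can be tested offline.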