Outlining some ideas... - I-DCC Kick Off Meeting


This was a presentation I gave at the I-DCC (http://www.i-dcc.org) project kick-off meeting in April 2009. The gist of the talk was how we can go about making a collaborative data portal (one of the goals of the project) and showing some early prototype work that I had done.


Transcript

  • 1. Outlining some ideas... Darren Oakley - WTSI do2@sanger.ac.uk
  • 2. Ideas for what? WP4 - Web Portal; how we can all work together
  • 3. WP4 Web Portal
  • 4. WP4 objectives: Create a site to display current repository information. Create DAS-tracks, to display this information in its genomic context. Create a Biomart. The Biomart will serve DAS-tracks, provide query web-services, and link to other Biomarts (including EnsMart), greatly enhancing the search capability and future utility of the repository.
  • 5. The idea... De-centralize the data. Everyone who wants in on the portal: use Biomart! Standardized Web services and DAS out of the box. This makes the data open to all. We promise not to take over the world.
  • 6. The idea... 2 interfaces. Damian: new MartView interface (advanced search). Us: Google-like search (simple search - “MartSearch”).
  • 7. The idea... Turn the portal into a Biomart mashup! “In web development, a mashup is a Web application that combines data from one or more sources into a single integrated tool. The term Mashup implies easy, fast integration, frequently done by access to open APIs and data sources to produce results that were not the original reason for producing the raw source data” - Wikipedia
  • 8. Implementation: 100% Javascript driven user interface. The user goes to the portal and enters a search term; this gets fired against a cloud of Biomarts and returns a coherent response. No complex controller logic (it shouldn’t need any).
  • 9. Javascript?!? Aaargh!! The old days... Browser incompatibilities, clunky performance. Now... Javascript is fast! (Chrome, Firefox 3.1, Safari 4, IE 8.) Libraries take care of the cross-browser issues.
  • 10. Obligatory Architecture Drawings
  • 11. Plan A: HTTP request -> MartSearch -> Martservice XML query -> Biomart based federation
  • 12. Plan A: HTTP request -> MartSearch -> Martservice XML query -> Biomart based federation. Problems: you can only federate across 2 marts; search times can vary greatly with federation.
  • 13. Plan B: HTTP request -> MartSearch -> Martservice XML query to each mart, perform federation on the fly
  • 14. Plan B: HTTP request -> MartSearch -> Martservice XML query to each mart, perform federation on the fly. Problems: searching on more than one attribute requires many XML requests per mart; no way to page results; no way of doing OR queries; no way of doing loose text queries.
  • 15. Plan C: HTTP request -> MartSearch. Step 0: index the searchable fields from the Biomarts. Step 1: send query to Lucene based search index and retrieve paged list of genes and linking IDs. Step 2: Martservice XML query to each mart.
  • 16. Plan C: HTTP request -> MartSearch. Step 0: index the searchable fields from the Biomarts. Step 1: send query to Lucene based search index and retrieve paged list of genes and linking IDs. Step 2: Martservice XML query to each mart. Pros: FAST search results; can do loose text and OR queries; pagination; Solr takes care of the federation for you. Cons: one more software stack to accommodate; need to re-build index after mart rebuild.
  • 17. Demo http://www.i-dcc.org/dev/martsearch/
  • 18. Home
  • 19. Search
  • 20. Refined searches
  • 21. Fast, flexible searching. Customizable: add and remove data sources from display; restrict the data coming back from a source. Extensible: adding in new data sources should be easy; custom templates for every data source. Open: anyone can access the data and index (via services); anyone can get the code.
  • 22. How it works... Apache Solr (http://lucene.apache.org/solr): enterprise grade search server built upon Lucene; web service driven; represents each search object as a document.
  • 23. Document XML
  • 24. How it works... jQuery (http://jquery.com), jQuery UI (http://jqueryui.com), EJS (http://embeddedjs.com), ActiveRecord.js (http://activerecordjs.org), Jamal (http://jamal-mvc.com)
  • 25. Moving forward... Make (and/or integrate) more marts: MGI, Komp-DCC, Eurexpress, GXD, EuroPhenome. Portal branding, design, colour, layout. How to represent the data: dictated by the type of user... Who are our users and what do they want from us?!?!?
  • 26. Get the code! http://github.com/dazoakley/martsearch/
  • 27. Working together
  • 28. Typical scenario: Each group says... “I’ll take this task - will send you the results when it’s ready.” If we’re (very) lucky, we get something sort of coherent in the end.
  • 29. We can be better than this!
  • 30. What we should do... Open source code on a public repository: Github, Google Code, Sourceforge, or even one of our own - as long as it’s public. Shared bug tracking / support and wiki: Github (wiki) + Lighthouse (bug tracking), Google Code / Sourceforge, or host an instance of Redmine or Trac.
  • 31. Get the code! http://github.com/dazoakley/martsearch/
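
The Plan C flow from slides 15 and 16 can be sketched roughly as below. This is an illustration only, not code from the MartSearch repository: the Solr URL, the dataset, filter and attribute names, and both function names are hypothetical. It assumes a Solr index queried over HTTP with the standard `q`, `start`, `rows` and `wt` parameters, and the usual Biomart Martservice XML query layout.

```javascript
// Hypothetical Solr endpoint (assumption, not the real I-DCC index).
const SOLR_URL = "http://localhost:8983/solr/select";

// Step 1: build the URL for a paged search against the Solr index.
// `page` is 1-based; Solr's `start` parameter is a 0-based row offset.
function buildSolrUrl(term, page, rowsPerPage) {
  const params = new URLSearchParams({
    q: term,
    start: String((page - 1) * rowsPerPage),
    rows: String(rowsPerPage),
    wt: "json",
  });
  return `${SOLR_URL}?${params.toString()}`;
}

// Escape characters that are not allowed inside XML attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Step 2: build a Martservice XML query for one mart, filtered on the
// linking IDs that came back from the index. All names are placeholders.
function buildMartQuery(dataset, filterName, ids, attributes) {
  const attrs = attributes
    .map((name) => `<Attribute name="${escapeXml(name)}"/>`)
    .join("");
  return (
    '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' +
    '<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="1">' +
    `<Dataset name="${escapeXml(dataset)}" interface="default">` +
    `<Filter name="${escapeXml(filterName)}" value="${escapeXml(ids.join(","))}"/>` +
    attrs +
    "</Dataset></Query>"
  );
}

// Example: page 2 of a search, then one query for a hypothetical mart.
console.log(buildSolrUrl("cbx1", 2, 10));
console.log(buildMartQuery("example_dataset", "gene_id", ["ID1", "ID2"], ["symbol"]));
```

In a setup like this, the XML string would typically be sent to each mart's Martservice URL as its `query` parameter, one request per mart, with the index handling paging and loose text matching beforehand.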