Building a Distributed Data Portal

Slides from a presentation I gave at SciBarCamb 2011 (9th April, 2011) in Cambridge (UK).

It covers some of my recent work and thinking on setting up a data portal built on distributed web services, which allows easy data sharing and reduces the effort of data maintenance.

Published in: Technology

Transcript

  • 1. Building a Distributed Data Portal, Darren Oakley, SciBarCamb 2011
  • 2. Background
      • mouse informatics @ Sanger Institute
      • work with lots of other groups
          • need to share, integrate and represent lots of datatypes
          • both OUR and OTHER people's data
  • 3.
      • gene information (IDs, location, GO etc.)
      • related human diseases (OMIM, GWAS)
      • expression
      • phenotyping
      • mutant mouse breeding
      • mutant ES cells, vectors
  • 4. that's a lot of stuff...
  • 5. we can do this one of two ways...
  • 6. ‘Borg’ Approach
      • single group becomes sole owner/curator of portal and its data
      • other groups feed their data into portal group
  • 7. burp
  • 8. Pros
      • clearly defined centre to the universe
      • provides central curation to all data
  • 9. Cons
      • huge effort to curate and maintain large and diverse dataset
      • hold / maintain your own db of everything
      • integrating totally new / different data becomes a challenge
      • single group becomes effective ‘owner’
      • can stifle innovation and new ideas
  • 10. what happens when more than one group tries to do this?
  • 11.
      • “Hand over your data, prepare to be assimilated”
      • “No, YOU hand over your data and prepare to be assimilated”
      • “Ahem, both of you, prepare to be assimilated!”
  • 12.
      • “Hand over your data, prepare to be assimilated”
      • “No, YOU hand over your data and prepare to be assimilated”
      • “Ahem, both of you, prepare to be assimilated!”
      • … which of you is the real Borg?
  • 13. ‘Federation’ Approach
      • each group hosts their own data and exposes it via defined services
      • make a ‘clever’ portal that pulls these resources together
      • no single group is totally in charge
  • 14.
      • Use data for a more specialized purpose
      • Build own portal competitor
  • 15. The Tech
      • search engine
      • data sources
      • web service
  • 16. MartSearch / Portal
  • 17. MartSearch / Portal
  • 18. MartSearch / Portal
      • index searchable terms
  • 19. MartSearch / Portal
      • index searchable terms
  • 20. MartSearch / Portal
      • index searchable terms
  • 21. MartSearch / Portal
      • index searchable terms
      • send user's search term to Solr
  • 22. MartSearch / Portal
      • index searchable terms
      • send user's search term to Solr
      • Solr returns groups of terms to query data sources with
  • 23. MartSearch / Portal
      • index searchable terms
      • send user's search term to Solr
      • Solr returns groups of terms to query data sources with
      • send asynchronous requests to each of the data sources for the data the user is interested in
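
The last step in the slide above, sending asynchronous requests to each data source, can be sketched roughly as follows. This is a minimal illustration in Python, not the MartSearch code itself: the data source names, the shape of the index reply, and the `query_source` stub (which stands in for a real HTTP call) are all assumptions made for the example. A source that fails returns `None` so that one unreachable service does not take the whole page down.

```python
import asyncio

# Hypothetical reply from the search index: for each data source, the field
# to query it by and the terms to send. All names and values are illustrative.
INDEX_REPLY = {
    "phenotyping": {"key": "gene_id", "terms": ["ID:0001", "ID:0002"]},
    "expression": {"key": "symbol", "terms": ["GeneA"]},
    "broken_source": {"key": "symbol", "terms": ["GeneA"]},
}


async def query_source(name, spec):
    """Stand-in for an HTTP call to one remote data source (an assumption)."""
    await asyncio.sleep(0.01)  # simulated network latency
    if name == "broken_source":
        raise ConnectionError(f"{name} is unreachable")
    return {"searched_by": spec["key"], "terms": list(spec["terms"])}


async def fan_out(index_reply, fetch=query_source):
    """Query every data source concurrently; a failed source yields None."""
    names = list(index_reply)
    replies = await asyncio.gather(
        *(fetch(name, index_reply[name]) for name in names),
        return_exceptions=True,  # collect errors instead of aborting the batch
    )
    return {
        name: None if isinstance(reply, Exception) else reply
        for name, reply in zip(names, replies)
    }


if __name__ == "__main__":
    for source, data in asyncio.run(fan_out(INDEX_REPLY)).items():
        print(source, "->", data)
```

Because the requests run concurrently, the total wait is roughly that of the slowest data source rather than the sum of all of them.
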
  • 24. User searches for ‘diabetes’
  • 25. User searches for ‘diabetes’
      • Search for ‘diabetes’
  • 26. User searches for ‘diabetes’
      • Search for ‘diabetes’
      • JSON data containing information on what to search each datasource by...
  • 27. User searches for ‘diabetes’
      • Search for ‘diabetes’
      • JSON data containing information on what to search each datasource by...
      • Search using query parameters defined by Solr response
  • 28. User searches for ‘diabetes’
      • Search for ‘diabetes’
      • JSON data containing information on what to search each datasource by...
      • Search using query parameters defined by Solr response
      • Render search results using templates
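
A rough sketch of the middle and last steps above: turning the index's JSON reply into per-datasource query terms, then rendering returned records through a template. The outer `response` / `docs` layout follows Solr's standard JSON reply, but the field names, the `SOURCE_KEYS` mapping, the placeholder values and the row template are assumptions for illustration only.

```python
import json
from string import Template

# Illustrative Solr-style reply; field names and values are placeholders.
SOLR_REPLY = json.loads("""
{"response": {"numFound": 3, "docs": [
  {"gene_id": "ID:0001", "symbol": "GeneA"},
  {"gene_id": "ID:0002", "symbol": "GeneB"},
  {"gene_id": "ID:0001", "symbol": "GeneA"}
]}}
""")

# Hypothetical mapping: which index field each data source is searched by.
SOURCE_KEYS = {"phenotyping": "gene_id", "expression": "symbol"}


def group_terms(solr_reply, source_keys):
    """Collect, per data source, the distinct terms to query it with."""
    docs = solr_reply["response"]["docs"]
    grouped = {}
    for source, field in source_keys.items():
        values = [doc[field] for doc in docs if field in doc]
        # dict.fromkeys removes duplicates while keeping first-seen order.
        grouped[source] = {"key": field, "terms": list(dict.fromkeys(values))}
    return grouped


# Hypothetical per-record template; a real portal would likely use HTML.
RESULT_ROW = Template("[$source] $symbol: $summary")


def render(source, records):
    """Render one line per record returned by a data source."""
    return "\n".join(
        RESULT_ROW.substitute(source=source, **record) for record in records
    )


if __name__ == "__main__":
    print(group_terms(SOLR_REPLY, SOURCE_KEYS))
    print(render("expression", [{"symbol": "GeneA", "summary": "placeholder"}]))
```
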
  • 29. Pros
      • easily extendable
      • data curation done by primary data producers / handlers
      • YOU don't have to keep / maintain copies of everything
  • 30. Cons
      • hard to avoid some data redundancy
      • need common linking terms
      • un-curated as a whole
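
To illustrate why common linking terms matter, here is a small sketch of joining records from independent data sources on a shared identifier. The source names, the `gene_id` field and the record contents are made up for the example; records that lack the linking term cannot be joined and are skipped.

```python
def merge_by_key(sources, key):
    """Group records from several data sources under a shared linking term."""
    merged = {}
    for source_name, records in sources.items():
        for record in records:
            if key not in record:
                continue  # no linking term, so this record cannot be joined
            per_term = merged.setdefault(record[key], {})
            per_term.setdefault(source_name, []).append(record)
    return merged


if __name__ == "__main__":
    # Placeholder records from two hypothetical data sources.
    example = {
        "expression": [{"gene_id": "ID:0001", "tissue": "placeholder"}],
        "phenotyping": [
            {"gene_id": "ID:0001", "test": "placeholder"},
            {"note": "record without a linking term"},
        ],
    }
    print(merge_by_key(example, "gene_id"))
```
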
  • 31. Extending the Portal
      • set up or find a new datasource to add
          • other web service
          • another biomart
      • write a simple config/adaptor to talk to it
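
What a "simple config/adaptor" might look like, as a hedged sketch: a small config object per data source plus a registry the portal can iterate over. The class name, its fields, the `example.org` URL and the comma-separated query format are all hypothetical; the actual MartSearch configuration format is not shown in the slides.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class DataSourceConfig:
    """Hypothetical description of one data source the portal can query."""
    name: str
    base_url: str
    query_param: str  # name of the URL parameter carrying the search terms
    joined_by: str    # linking term shared with the search index

    def build_url(self, terms):
        """Build the request URL for a list of linking terms."""
        query = urlencode({self.query_param: ",".join(terms)})
        return f"{self.base_url}?{query}"


REGISTRY = {}


def register(config):
    """Add a data source to the portal, refusing duplicate names."""
    if config.name in REGISTRY:
        raise ValueError(f"data source {config.name!r} is already registered")
    REGISTRY[config.name] = config
    return config


# Adding a new source is then a single declaration (placeholder URL).
register(DataSourceConfig(
    name="expression",
    base_url="https://example.org/expression/search",
    query_param="symbols",
    joined_by="symbol",
))

if __name__ == "__main__":
    print(REGISTRY["expression"].build_url(["GeneA", "GeneB"]))
```

Keeping each source behind a uniform `build_url` interface means the rest of the portal does not need to change when a source is added.
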
  • 32.
      • www.knockoutmouse.org/martsearch
      • github.com/i-dcc/martsearch
      • @dazoakley
