Kallio bosc2010 chipster-cloud
Published in: Technology
Transcript

  • 1. Connecting Chipster genome browser to the cloud
       Aleksi Kallio, CSC – IT Center for Science, Finland
  • 2. Architecture of Chipster platform
       [Diagram labels: Authentication service, Management service, Message broker, File broker, Clients, Brokers, Computing services]
      • Loosely coupled, independent components
      • Message oriented communications
      • Flexible, scalable, robust
      • In other words, very cloud like
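To illustrate what "loosely coupled" and "message oriented" mean on this slide, here is a minimal in-process sketch. It is not Chipster code: the `MessageBroker` class, the topic names `jobs` and `results`, and the message fields are all assumptions made for the example. The point it tries to show is that the client and the computing services know only topic names, never each other.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class MessageBroker:
    """Toy in-process broker (hypothetical): routes messages by topic name only."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to every subscriber of the topic; return the delivery count."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


def make_compute_service(broker: MessageBroker, name: str) -> None:
    """Register a stand-in computing service that sums the values of each job."""

    def handle_job(message: dict) -> None:
        result = {
            "job_id": message["job_id"],
            "worker": name,
            "output": sum(message["values"]),
        }
        broker.publish("results", result)

    broker.subscribe("jobs", handle_job)


broker = MessageBroker()
received: List[dict] = []                      # the "client" just collects results
broker.subscribe("results", received.append)

# Adding a second service needs no change on the client side.
make_compute_service(broker, "compute-1")
make_compute_service(broker, "compute-2")

broker.publish("jobs", {"job_id": 1, "values": [1, 2, 3]})
```

In this toy version every subscriber receives every job (topic-style fan-out); a real deployment would more likely use a queue so that each job goes to exactly one computing service.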
  • 3. Chipster in the cloud
      • 1) Deploying compute nodes in the cloud
          - Easy, because the architecture is already loosely coupled and based on message passing
      • 2) Running large parallel jobs in the cloud
          - The architecture allows this easily
          - Cloud-compatible tools can be integrated quickly
      • 3) Using the cloud as a back end for interactive visualisations
          - Maybe not so obvious
          - So let's dig into this further...
  • 4. Background: Chipster Genome Browser
      • Interactive Swing-based GUI
      • Shows reads and analysis results in genomic context
      • Interactive zooming from chromosome level down to nucleotide level
      • Ensembl annotations for genes and transcripts
      • Integrated with the rest of Chipster
      • Parallel and distributed to some extent
  • 5. Basic idea
      • Preprocess data with Hadoop / MapReduce
      • Generate power-of-two summaries for the data, like in Google Earth
          - Doubles the data size
      • The current genome browser samples data to produce summaries
      • Now summaries can be read directly
          - Accurate results, significantly fewer disk seeks
      • Distribute data to scale to massive datasets
          - Use messaging to query independent data providers
      • Aggregate results in the visualiser as (and if) they arrive
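The power-of-two summaries on this slide can be sketched as a pyramid in which each level halves the number of bins of the level below. The code is a minimal single-machine illustration, not the Hadoop job from the talk; `build_summary_pyramid`, `query`, and the per-bin coverage input are assumed names and data. Storing every level costs n + n/2 + n/4 + ... < 2n values, which matches the "doubles the data size" remark.

```python
from typing import List, Tuple


def build_summary_pyramid(coverage: List[int]) -> List[List[int]]:
    """Return levels where levels[k][i] is the sum over 2**k consecutive raw bins."""
    levels = [list(coverage)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Pair up neighbours; an odd trailing bin is carried up on its own.
        levels.append([sum(prev[i:i + 2]) for i in range(0, len(prev), 2)])
    return levels


def query(levels: List[List[int]], start: int, end: int,
          max_bins: int) -> Tuple[int, List[int]]:
    """Return the finest level that covers raw bins [start, end) in at most max_bins bins."""
    for k, level in enumerate(levels):
        step = 2 ** k
        first = start // step
        last = (end + step - 1) // step  # ceiling division
        if last - first <= max_bins:
            return k, level[first:last]
    raise ValueError("max_bins must be at least 1 and the range must lie inside the data")


raw = [3, 1, 4, 1, 5, 9, 2, 6]            # made-up per-bin read coverage
pyramid = build_summary_pyramid(raw)
stored = sum(len(level) for level in pyramid)

zoomed_out = query(pyramid, 0, 8, max_bins=4)   # whole range, coarse level
zoomed_in = query(pyramid, 2, 6, max_bins=4)    # narrow range, raw level
```

A zoomed-out view reads a short precomputed list instead of sampling the raw data, which is where the accuracy and the reduction in disk seeks would come from.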
  • 6. Work in progress...
      • Genome browser up and running
      • Hadoop-based data processing at a very early stage
      • Currently trying to get it to scale well
  • 7. What's the point?
      • Besides items (e.g., reads), the visualiser can receive “superitems” (e.g., summaries of reads)
          - Superitems summarise coverage, quality, SNPs etc. of the original reads
      • All kinds of advanced information can be generated in the preprocessing step
          - Such as features that combine a large number of genomes
          - Generators should be pluggable
      • We spend resources on the server side to improve user experience on the client side
          - On the server side, CPU, memory and disk space are required
          - But only for a short time (like in large batch jobs)
          - Cheap commodity servers can be used
          - And the experiment itself has already been expensive
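One way to read "superitems" and "generators should be pluggable" is as a registry of small summary functions that the preprocessing step applies to each window of reads. The sketch below is an assumption-laden illustration, not the Chipster design: the `Read` fields, the `register` decorator, and the two generators are hypothetical, and reads are binned by start position only, which is a simplification of real coverage.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Read:
    start: int       # 0-based leftmost position on the chromosome
    length: int
    quality: float   # mean base quality of the read (hypothetical field)


# A summary generator maps the reads of one window to a single value.
SummaryGenerator = Callable[[List[Read]], float]
GENERATORS: Dict[str, SummaryGenerator] = {}


def register(name: str) -> Callable[[SummaryGenerator], SummaryGenerator]:
    """Decorator that plugs a new generator into the preprocessing step."""
    def decorator(fn: SummaryGenerator) -> SummaryGenerator:
        GENERATORS[name] = fn
        return fn
    return decorator


@register("read_count")
def read_count(reads: List[Read]) -> float:
    return float(len(reads))


@register("mean_quality")
def mean_quality(reads: List[Read]) -> float:
    return sum(r.quality for r in reads) / len(reads) if reads else 0.0


def make_superitems(reads: List[Read], window: int) -> List[dict]:
    """Group reads by start // window and run every registered generator per group."""
    windows: Dict[int, List[Read]] = {}
    for read in reads:
        windows.setdefault(read.start // window, []).append(read)
    return [
        {"start": idx * window, "end": (idx + 1) * window,
         **{name: gen(group) for name, gen in GENERATORS.items()}}
        for idx, group in sorted(windows.items())
    ]


reads = [Read(0, 36, 30.0), Read(10, 36, 20.0), Read(120, 36, 40.0)]
superitems = make_superitems(reads, window=100)
```

Adding another summary, for example a SNP count, would then mean registering one more function without touching `make_superitems`.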
  • 8. Summary
      • Use cheap server resources to enable a better user experience
      • Goal: to make data analysis quicker (and more fun)
      • Tackle server-side unreliability on the client side
      • Future development
          - If this works out, it could also be used in other Chipster visualisers
          - Integrating HBase queries into interactive visualisations
          - Optimising data summarising for visual truthfulness
      • For more info: aleksi.kallio@csc.fi,
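"Tackle server-side unreliability on the client side" could look like the following sketch: the client queries several independent data providers in parallel, keeps whatever arrives before a deadline, and notes which providers failed or were too slow. Threads stand in for the messaging layer here, and `query_providers`, the provider functions, the node names, and the region tuple are all hypothetical.

```python
import concurrent.futures as cf
import time
from typing import Callable, Dict, List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), assumed format


def query_providers(providers: Dict[str, Callable[[Region], list]],
                    region: Region,
                    timeout_s: float) -> Tuple[Dict[str, list], List[str]]:
    """Ask every provider for the region; return (answers, names that gave no answer)."""
    answered: Dict[str, list] = {}
    pool = cf.ThreadPoolExecutor(max_workers=len(providers))
    futures = {pool.submit(fn, region): name for name, fn in providers.items()}
    try:
        for future in cf.as_completed(futures, timeout=timeout_s):
            try:
                answered[futures[future]] = future.result()
            except Exception:
                pass  # a failing provider is simply left out of the picture
    except cf.TimeoutError:
        pass  # deadline reached: go on with what has arrived so far
    finally:
        pool.shutdown(wait=False)  # do not block the client on slow providers
    missing = sorted(set(providers) - set(answered))
    return answered, missing


def fast_node(region: Region) -> list:
    return [("summary", region[1], region[2], 12)]


def broken_node(region: Region) -> list:
    raise ConnectionError("node down")


def slow_node(region: Region) -> list:
    time.sleep(1.0)
    return [("summary", region[1], region[2], 7)]


answered, missing = query_providers(
    {"node-a": fast_node, "node-b": broken_node, "node-c": slow_node},
    ("chr1", 0, 1000),
    timeout_s=0.2,
)
```

The visualiser would draw the answers it has and could retry or grey out the regions belonging to the missing providers, so a single unreliable cheap server does not stall the whole view.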
