Hw09 Protein Alignment

Cloudera, Inc.

Hadoop World 2009, New York, Oct 2, 2009
Sequence Alignment and Hadoop
Paul Brown, Associate
Booz Allen Hamilton Inc., 134 National Business Parkway, Annapolis Junction, MD 20701
Tel (301) 543-4665, [email_address]

The Impact of Hadoop

Biological Information (Paul Brown, 9/21/09: Need to verify all these.)

Biology + Computer Science = Bioinformatics
http://bioinformatics.ubc.ca/about/what_is_bioinformatics
A Y N A R N A N R N Y A Y N N R N A A N R N

Bioinformatics: The Pain

Why Hadoop:

So What? Querying a database of sequences for similar sequences

So What? Comparing sequences in bulk
Reconstructed Sequence

What if the dataset of sequences doesn't fit on one machine?
Sequence: A B C D
Pair: AB AC AD BC BD CD
Diagram: Input data → MapReduce (Pre Join Data) → Pre Joined Data → MapReduce (Alignment Algorithm 1, Alignment Algorithm 2, ..., Alignment Algorithm N) → Alignment Results
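
The pre-join step on this slide turns the sequences A, B, C, D into the six unordered pairs AB through CD. A minimal in-memory sketch of that pairing is shown here. It is not the speaker's MapReduce job: the function name `all_pairs` is hypothetical, and a real pre-join would be distributed across the cluster rather than run on one machine.

```python
from itertools import combinations


def all_pairs(sequence_ids):
    """Return every unordered pair of sequence IDs, in input order.

    Stand-in for the slide's "Pre Join Data" step (an assumption about
    what that step produces, based on the A B C D example).
    """
    return ["".join(pair) for pair in combinations(sequence_ids, 2)]


pairs = all_pairs(["A", "B", "C", "D"])
print(pairs)  # ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']
```

The number of pairs grows as n(n-1)/2, which is one plausible reason to materialize the pairs once and then feed the same pre-joined data to several alignment algorithms.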

So What? Analyzing really big sequences

Demonstration Implementation: Smith-Waterman Alignment

Smith-Waterman Algorithm

     -   A   Y   N   A   N   A   N   A
-    0   0   0   0   0   0   0   0   0
A    0   2   1   0   2   1   2   1   2
N    0   1   1   3   2   4   3   4   3
A    0   2   1   2   5   4   6   5   6
N    0   1   1   3   4   7   6   8   7
A    0   2   1   2   5   6   9   8  10
N    0   1   1   3   4   7   8  11  10
R    0   0   0   2   3   6   7  10  10
A    0   2   1   1   4   5   8   9  12
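
The scoring matrix above can be reproduced with a short sketch of the Smith-Waterman recurrence. The slide does not state its scoring parameters; the values used here (match +2, mismatch -1, linear gap -1) are an assumption inferred from the numbers in the matrix, and the function name `smith_waterman_matrix` is hypothetical. This fills the matrix only and does not perform the traceback that recovers the alignment itself.

```python
def smith_waterman_matrix(rows, cols, match=2, mismatch=-1, gap=-1):
    """Fill the Smith-Waterman local-alignment score matrix.

    rows/cols are the two sequences; the scoring values are assumed,
    chosen because they are consistent with the slide's matrix.
    """
    # Row 0 and column 0 stay at zero (the "-" row and column).
    h = [[0] * (len(cols) + 1) for _ in range(len(rows) + 1)]
    for i in range(1, len(rows) + 1):
        for j in range(1, len(cols) + 1):
            same = rows[i - 1] == cols[j - 1]
            diag = h[i - 1][j - 1] + (match if same else mismatch)
            up = h[i - 1][j] + gap      # gap in the column sequence
            left = h[i][j - 1] + gap    # gap in the row sequence
            # Local alignment: scores never drop below zero.
            h[i][j] = max(0, diag, up, left)
    return h


matrix = smith_waterman_matrix("ANANANRA", "AYNANANA")
best = max(max(row) for row in matrix)
for row in matrix:
    print(" ".join(f"{score:2d}" for score in row))
```

Under these assumed parameters the highest score is 12, in the bottom-right cell, matching the slide.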

Hadoop and EC2 Implementation

Ready for What’s Next…
