Twister4Azure: Iterative MapReduce for Azure Cloud. Thilina Gunarathne, Judy Qiu, Geoffrey Fox ({tgunarat, xqiu, gcf}@indiana.edu). CCA 2011, April 12 – 13, 2011
MapReduceRoles for Azure. Familiar MapReduce programming model. Built using highly available and scalable Azure cloud services. Co-exists with the eventual consistency and high latency of cloud services. Decentralized control; no single point of failure. Supports dynamically scaling the compute resources up and down. MapReduce fault tolerance.
MapReduceRoles for Azure
Twister for Azure
In-Memory Caching of static data
Cache-aware hybrid scheduling using Queues as well as a bulletin board (special table)
Performance – KMeans Clustering. Charts: performance with and without data caching; speedup gained using the data cache; increasing number of iterations; scaling speedup.
Performance Comparisons: BLAST Sequence Search; Smith-Waterman Sequence Alignment; Cap3 Sequence Assembly

Twister4Azure - Iterative MapReduce for Azure Cloud

  • 1. Twister4Azure: Iterative MapReduce for Azure Cloud. Thilina Gunarathne, Judy Qiu, Geoffrey Fox ({tgunarat, xqiu, gcf}@indiana.edu). CCA 2011, April 12 – 13, 2011
  • 2. MapReduceRoles for Azure. Familiar MapReduce programming model. Built using highly available and scalable Azure cloud services. Co-exists with the eventual consistency and high latency of cloud services. Decentralized control; no single point of failure. Supports dynamically scaling the compute resources up and down. MapReduce fault tolerance.
  • 4.
  • 5. In-Memory Caching of static data
  • 6.
  • 7. Performance – KMeans Clustering. Charts: performance with and without data caching; speedup gained using the data cache; increasing number of iterations; scaling speedup.
  • 8. Performance Comparisons: BLAST Sequence Search; Smith-Waterman Sequence Alignment; Cap3 Sequence Assembly
  • 9. Conclusion. Twister4Azure enables users to easily and efficiently perform large-scale iterative data analysis and scientific computations on the Azure cloud. It uses a novel hybrid scheduling mechanism to support caching of static data across iterations. It uses cloud infrastructure services effectively to deliver robust and efficient applications. http://salsahpc.indiana.edu/twister4azure
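The iterative model summarized in these slides (map and reduce followed by a merge step that decides whether to run another iteration, as described in the editor's notes) can be illustrated with a small KMeans loop. This is a minimal single-process sketch, not Twister4Azure's actual API: the function names `map_task`, `reduce_task`, `merge_task`, and `run_kmeans`, the convergence tolerance, and the use of a plain Python list as the "cached" static data are all assumptions made for illustration.

```python
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])))

def map_task(partition, centroids):
    """Map: emit (centroid index, (coordinate sums, count)) for one data partition."""
    partial = {}
    for point in partition:
        idx = nearest(point, centroids)
        sums, count = partial.get(idx, ([0.0] * len(point), 0))
        partial[idx] = ([s + p for s, p in zip(sums, point)], count + 1)
    return list(partial.items())

def reduce_task(idx, values):
    """Reduce: combine the partial sums for one centroid into its new position."""
    dims = len(values[0][0])
    total = [sum(v[0][d] for v in values) for d in range(dims)]
    count = sum(v[1] for v in values)
    return idx, [t / count for t in total]

def merge_task(old, reduced, tolerance):
    """Merge: assemble the new centroids and decide whether to iterate again."""
    new = list(old)
    for idx, centroid in reduced:
        new[idx] = centroid
    shift = max(sum((a - b) ** 2 for a, b in zip(o, n)) ** 0.5
                for o, n in zip(old, new))
    return new, shift > tolerance

def run_kmeans(partitions, centroids, tolerance=1e-6, max_iterations=20):
    """Hypothetical driver: `partitions` stands in for static data cached in memory."""
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        grouped = defaultdict(list)
        for partition in partitions:  # static data is reused, not reloaded
            for idx, value in map_task(partition, centroids):
                grouped[idx].append(value)
        reduced = [reduce_task(idx, values) for idx, values in grouped.items()]
        centroids, keep_going = merge_task(centroids, reduced, tolerance)
        if not keep_going:
            break
    return centroids, iteration
```

Only the centroids change between iterations; the data partitions stay fixed, which is why keeping them in memory across iterations can pay off.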

Editor's Notes

  1. We use Azure Queues for scheduling, Tables to store metadata and monitoring data, and Blobs for input, output, and intermediate data storage.
  2. In our current work, we extend MR4Azure to support iterative MapReduce computations. We added a merge step to the programming model, in which the computation decides whether to start a new iteration. We also support in-memory caching of static data between iterations, and we developed a hybrid scheduling strategy to perform cache-aware scheduling.
  3. We don't have a master node with global knowledge of the cached data. Hence, in each iteration, tasks are posted to the bulletin board, which workers check first to identify any tasks that require a data product they already have in cache. If they find none, they fall back to the queue.
  4. KMeans iterative MapReduce performance. First pair of charts (16 Azure Small instances, 6 iterations, 8 to 48 million 20-D data points): left, performance with and without data caching; right, speedup obtained from using the data cache. Second pair of charts: left, scaling speedup with increasing number of instances (Azure Small) and data for 10 iterations; right, increasing number of iterations using 16 million data points with caching on 16 Azure Small instances.
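The cache-aware scheduling in note 3 can be sketched as follows. This is an in-memory illustration only, under stated assumptions: the bulletin board is a plain list and the queue is a `collections.deque`, standing in for the Azure Table and Azure Queue; the `Worker` class, `run_iteration`, and the `claimed` set used to avoid running a task twice are hypothetical names and mechanisms, not the project's actual code or the Azure SDK.

```python
from collections import deque

class Worker:
    """Hypothetical worker that remembers which data products it holds in memory."""

    def __init__(self, name):
        self.name = name
        self.cache = set()  # ids of data products cached by this worker

    def next_task(self, board, queue, claimed):
        # 1. Check the bulletin board for an unclaimed task whose data is cached here.
        for task_id, data_id in board:
            if task_id not in claimed and data_id in self.cache:
                claimed.add(task_id)
                return task_id, data_id, "cache"
        # 2. Otherwise fall back to the queue, skipping tasks already claimed.
        while queue:
            task_id, data_id = queue.popleft()
            if task_id not in claimed:
                claimed.add(task_id)
                self.cache.add(data_id)  # loading the data populates the cache
                return task_id, data_id, "queue"
        return None

def run_iteration(workers, tasks):
    """Let workers take turns picking tasks until none remain; no master is involved."""
    board = list(tasks)
    queue = deque(tasks)
    claimed = set()
    log = []
    progress = True
    while progress:
        progress = False
        for worker in workers:
            picked = worker.next_task(board, queue, claimed)
            if picked:
                log.append((worker.name, picked[0], picked[2]))
                progress = True
    return log
```

In this sketch the first iteration is served entirely from the queue because no worker has anything cached; in later iterations each worker finds its own data on the bulletin board, so the placement decision is made by the workers themselves rather than by a central scheduler.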