STARFISH: A SELF-TUNING SYSTEM FOR BIGDATA ANALYTICS 
SEMINAR BY 
Y.SAI PRAMODA 
10191A0511
CONTENTS 
• Introduction to Big data 
• Hadoop 
• Tuning problems 
• Starfish Architecture 
• Usage of Starfish 
• Conclusion
INTRODUCTION TO BIG DATA 
• Big data is the term for data sets so large and complex 
that they become difficult to process using traditional data 
management tools or processing applications 
• What are the tools of Big data? 
• Features of Big data analytics
BIG DATA PRACTITIONERS 
• Data analysts 
Report generation, data mining, ad optimization 
• Computational scientists 
Computational biology, economics, journalism 
• Statisticians and machine-learning researchers 
• Systems researchers, developers, and testers 
Distributed systems, networking, security, …
Practitioners want a MAD system: HADOOP 
Hadoop is already MAD as it is! 
• Magnetism: "Attracts" or welcomes all sources of data, 
regardless of structure, values, etc. 
• Agility: Adaptive; stays in sync with rapid data 
evolution and modification 
• Depth: Goes beyond typical analytics to support 
complex operations such as statistical analysis 
and machine learning
MADDER 
• Data-lifecycle Awareness: Do more than just queries; 
optimize the movement, storage, and processing of big data 
• Elasticity: Dynamically adjust resource usage 
to workload and user requirements 
• Robustness: Provide storage and querying 
services even in the event of some failures
Tuning Challenges 
• Heavy use of programming languages for 
MapReduce programs 
• Data loaded/accessed as opaque files 
• Large space of tuning choices 
• Elasticity is wonderful, but hard to achieve 
• Terabyte-scale data cycles.
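To give a feel for the "large space of tuning choices" above, here is a minimal sketch that counts the combinations in a small grid of job-level settings. The parameter names follow Hadoop 2.x naming as far as we know; the candidate values are purely illustrative assumptions, not recommendations from the Starfish authors.

```python
from itertools import product
from math import prod

# Illustrative (assumed) candidate values for four job-level parameters.
search_space = {
    "mapreduce.job.reduces": [1, 8, 32, 128],
    "mapreduce.task.io.sort.mb": [100, 200, 400],
    "mapreduce.map.output.compress": [False, True],
    "mapreduce.reduce.shuffle.parallelcopies": [5, 10, 20],
}


def count_configurations(space):
    """Number of distinct configurations in the grid."""
    return prod(len(values) for values in space.values())


def enumerate_configurations(space):
    """Yield every configuration in the grid as a dict."""
    names = list(space)
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


total = count_configurations(search_space)
print(f"{total} configurations from only {len(search_space)} parameters")
```

Each extra parameter multiplies the count, so trying every combination by hand quickly becomes impractical.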
Tuning Problems 
• Job-level MapReduce configuration 
• Cluster sizing 
• Workload management 
• Data layout tuning 
• Workflow optimization (diagram: a workflow of jobs J1, J2, J3, J4)
Starfish’s Core Approach to Tuning 
• Profiler: Collects concise summaries of execution 
• What-if Engine: Estimates impact of hypothetical changes 
on execution 
• Optimizers: Search through space of tuning choices 
(job, workflow, workload, data layout, cluster)
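The interplay of the three components above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Starfish's actual implementation: the `JobProfile` fields, the wave-based cost model in `what_if`, and the exhaustive search in `optimize` are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class JobProfile:
    """Concise summary of one observed run (all fields are assumptions)."""
    input_mb: float           # total input size
    map_mb_per_sec: float     # per-task map throughput
    reduce_mb_per_sec: float  # per-task reduce throughput
    shuffle_ratio: float      # map output size / map input size
    task_overhead_sec: float  # fixed startup cost per task


def what_if(profile, num_maps, num_reduces, slots):
    """Estimate runtime in seconds for a hypothetical configuration.

    Toy model: tasks run in waves of at most `slots` parallel tasks.
    """
    map_waves = -(-num_maps // slots)        # ceiling division
    reduce_waves = -(-num_reduces // slots)
    map_task = (profile.input_mb / num_maps) / profile.map_mb_per_sec
    shuffle_mb = profile.input_mb * profile.shuffle_ratio
    reduce_task = (shuffle_mb / num_reduces) / profile.reduce_mb_per_sec
    return (map_waves * (map_task + profile.task_overhead_sec)
            + reduce_waves * (reduce_task + profile.task_overhead_sec))


def optimize(profile, slots, map_choices, reduce_choices):
    """Search the candidate space and return the best (maps, reduces)."""
    best = min(product(map_choices, reduce_choices),
               key=lambda c: what_if(profile, c[0], c[1], slots))
    return best, what_if(profile, best[0], best[1], slots)


profile = JobProfile(input_mb=1000, map_mb_per_sec=10, reduce_mb_per_sec=5,
                     shuffle_ratio=0.5, task_overhead_sec=1)
best, estimate = optimize(profile, slots=10,
                          map_choices=[5, 10, 20], reduce_choices=[5, 10, 20])
print(best, estimate)
```

The point of the split is that the job runs once to build the profile; every other candidate is evaluated by the model rather than by rerunning the job.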
THE STARFISH PHILOSOPHY 
• Goal: A high-performance MAD system 
• Build on Hadoop’s strengths 
• How can users get good performance 
automatically?
STARFISH ARCHITECTURE
VISUALIZE WITH STARFISH 
• See how MapReduce apps are working 
• Understand Bottlenecks in Hadoop 
• Find Misconfigured Hadoop Parameters 
• Learn to develop MapReduce apps
OPTIMIZE WITH STARFISH 
• Tune Hadoop easily 
• Find optimal parameter settings for 
MapReduce applications
STRATEGIZE WITH STARFISH 
• Make intelligent resource allocation choices for 
Hadoop. 
• Find Instances for Workloads. 
• Meet time and cost budgets with ease.
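As a rough illustration of choosing resources under time and cost budgets, the sketch below picks the cheapest cluster plan that meets a deadline. The instance catalogue, prices, speeds, and the linear-scaling runtime estimate are all assumptions made for the example; a real what-if estimate would come from profiles rather than from this simple formula.

```python
from math import ceil

# Hypothetical catalogue: price in cents per node-hour, relative speed per node.
INSTANCE_TYPES = {
    "small": {"cents_per_hour": 10, "speed": 1.0},
    "large": {"cents_per_hour": 40, "speed": 3.0},
}


def estimate_hours(work_units, instance, nodes):
    """Assumed runtime model: work divides evenly across nodes."""
    return work_units / (INSTANCE_TYPES[instance]["speed"] * nodes)


def cheapest_plan(work_units, deadline_hours, budget_cents, max_nodes=20):
    """Return the cheapest (instance, nodes) plan within both budgets, or None."""
    best = None
    for name, spec in INSTANCE_TYPES.items():
        for nodes in range(1, max_nodes + 1):
            hours = estimate_hours(work_units, name, nodes)
            # Assume billing per started hour, per node.
            cost = ceil(hours) * nodes * spec["cents_per_hour"]
            if hours > deadline_hours or cost > budget_cents:
                continue
            if best is None or cost < best["cost_cents"]:
                best = {"instance": name, "nodes": nodes,
                        "hours": hours, "cost_cents": cost}
    return best


plan = cheapest_plan(work_units=60, deadline_hours=4, budget_cents=1000)
print(plan)
```

If no plan satisfies both constraints the function returns `None`, which signals that either the deadline or the budget has to be relaxed.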
STEPS TO USE STARFISH 
• First step: collect profiling data from your 
Hadoop cluster. 
• Second step: import the profiling data into the profile 
store. 
• Third step: fire up the graphical or command-line 
interface to invoke the visualize, optimize, and strategize 
features.
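The three steps above can be mimicked with a small stand-in. This sketch does not use Starfish's real tools or file formats: the function names, the JSON layout, and the directory-based "profile store" are hypothetical, chosen only to show the collect, import, and analyze flow.

```python
import json
import tempfile
from pathlib import Path


def collect_profile(job_id, phase_seconds):
    """Step 1 (stand-in): package timings observed for one job."""
    return {"job_id": job_id, "phase_seconds": dict(phase_seconds)}


def import_profile(store_dir, profile):
    """Step 2 (stand-in): write the profile into a directory-based store."""
    path = Path(store_dir) / f"{profile['job_id']}.json"
    path.write_text(json.dumps(profile))
    return path


def phase_shares(store_dir, job_id):
    """Step 3 (stand-in): rank phases by their share of total time."""
    profile = json.loads((Path(store_dir) / f"{job_id}.json").read_text())
    total = sum(profile["phase_seconds"].values())
    shares = {p: s / total for p, s in profile["phase_seconds"].items()}
    return sorted(shares.items(), key=lambda kv: kv[1], reverse=True)


with tempfile.TemporaryDirectory() as store:
    observed = {"map": 30, "shuffle": 50, "reduce": 20}  # made-up timings
    import_profile(store, collect_profile("job_001", observed))
    ranked = phase_shares(store, "job_001")

bottleneck = ranked[0][0]
print("largest share of runtime:", bottleneck)
```

Keeping collection separate from analysis means a profile gathered once can be revisited later without touching the cluster again.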
CONCLUSION 
• Hadoop is now a viable competitor to existing 
systems for big data analytics. 
• Starfish fills a different void by enabling Hadoop 
users and applications to get good performance 
automatically throughout the data lifecycle in analytics.
REFERENCES 
• Herodotou, Herodotos, et al. "Starfish: A self-tuning 
system for big data analytics." Proc. of the Fifth CIDR 
Conf. 2011. 
• Dong, Fei. Extending Starfish to Support the Growing 
Hadoop Ecosystem. Diss. Duke University, 2012. 
• Herodotou, Herodotos, Fei Dong, and Shivnath Babu. 
"MapReduce programming and cost-based 
optimization? Crossing this chasm with Starfish." 
Proceedings of the VLDB Endowment 4.12 (2011). 
• http://www.cs.duke.edu/starfish/ 
• http://www.youtube.com/watch?v=Upxe2dzE1uk

Editor's Notes

  1. Profiler: collects summaries of jobs and collects information on a per-task basis. What-if Engine: answers questions after the Profiler has run. Optimizers: enumerate and search through the decision space to satisfy the requirements.