Big Data Certification

Slides from a STL HUG presentation on Big Data certifications. Material dated 1/3/2018

Confidential and Proprietary to Daugherty Business Solutions
Big Data Certifications
Jan 2018
Adam Doyle
• Co-Organizer, St. Louis HUG
• Big Data Community Lead, Daugherty Business Solutions
• Formerly Lead Big Data developer at Mercy
• Speaker at local and national Big Data conferences
Adam Riggs
• Sr. Recruiter at Daugherty Business
• 6 years of technical recruiting experience specializing in application development and Big Data.
• I’ve hired candidates for most of the local Fortune 500 companies, including Wells Fargo, Express Scripts, MasterCard, Charter, and Anheuser-Busch.
Introduction
• Why get certified?
• What certifications are available?
• What is the certification test like?
• Now what?
Agenda
• For consulting services organizations, there are reasons to get certified
– Partnering with vendors
– Discounts
– Publicity
• For companies, there are also reasons to get certified
– Publicity
– Recognition
Why get certified?
Why get certified?
Potential increase of 7-9% with Hadoop certification in.
The big data market is still immature. Few companies have defined their big data strategy, and hiring has been slow due to the lack of qualified candidates. As the market matures, the value of a certification will increase, in my opinion.
Big Data Supply and Demand
Certification Demand
• What do I want out of the certification?
• How does my current employer value the credential, and do they understand the business case?
• Does this credential complement my experience or add depth to my skills?
• Do the opportunities I’m seeking require this credential?
• What is my plan to continue my education after this certification?
Evaluating Certifications

Recommended

Application Architectures with Hadoop - Big Data TechCon SF 2014
Application Architectures with Hadoop - Big Data TechCon SF 2014Application Architectures with Hadoop - Big Data TechCon SF 2014
Application Architectures with Hadoop - Big Data TechCon SF 2014hadooparchbook
 
Architecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with HadoopArchitecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with HadoopDataWorks Summit
 
Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...
Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...
Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...Simplilearn
 
Fraud Detection using Hadoop
Fraud Detection using HadoopFraud Detection using Hadoop
Fraud Detection using Hadoophadooparchbook
 
Putting Apache Drill into Production
Putting Apache Drill into ProductionPutting Apache Drill into Production
Putting Apache Drill into ProductionMapR Technologies
 
Architecting applications with Hadoop - Fraud Detection
Architecting applications with Hadoop - Fraud DetectionArchitecting applications with Hadoop - Fraud Detection
Architecting applications with Hadoop - Fraud Detectionhadooparchbook
 

More Related Content

What's hot

Spark SQL versus Apache Drill: Different Tools with Different Rules
Spark SQL versus Apache Drill: Different Tools with Different RulesSpark SQL versus Apache Drill: Different Tools with Different Rules
Spark SQL versus Apache Drill: Different Tools with Different RulesDataWorks Summit/Hadoop Summit
 
Architectural considerations for Hadoop Applications
Architectural considerations for Hadoop ApplicationsArchitectural considerations for Hadoop Applications
Architectural considerations for Hadoop Applicationshadooparchbook
 
An introduction to apache drill presentation
An introduction to apache drill presentationAn introduction to apache drill presentation
An introduction to apache drill presentationMapR Technologies
 
Application architectures with Hadoop – Big Data TechCon 2014
Application architectures with Hadoop – Big Data TechCon 2014Application architectures with Hadoop – Big Data TechCon 2014
Application architectures with Hadoop – Big Data TechCon 2014hadooparchbook
 
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...The Hive
 
Hadoop2 new and noteworthy SNIA conf
Hadoop2 new and noteworthy SNIA confHadoop2 new and noteworthy SNIA conf
Hadoop2 new and noteworthy SNIA confSujee Maniyam
 
Introduction to Data Analyst Training
Introduction to Data Analyst TrainingIntroduction to Data Analyst Training
Introduction to Data Analyst TrainingCloudera, Inc.
 
Application Architectures with Hadoop - UK Hadoop User Group
Application Architectures with Hadoop - UK Hadoop User GroupApplication Architectures with Hadoop - UK Hadoop User Group
Application Architectures with Hadoop - UK Hadoop User Grouphadooparchbook
 
Strata NY 2014 - Architectural considerations for Hadoop applications tutorial
Strata NY 2014 - Architectural considerations for Hadoop applications tutorialStrata NY 2014 - Architectural considerations for Hadoop applications tutorial
Strata NY 2014 - Architectural considerations for Hadoop applications tutorialhadooparchbook
 
Yarn by default (Spark on YARN)
Yarn by default (Spark on YARN)Yarn by default (Spark on YARN)
Yarn by default (Spark on YARN)Ferran Galí Reniu
 
Application Architectures with Hadoop
Application Architectures with HadoopApplication Architectures with Hadoop
Application Architectures with Hadoophadooparchbook
 
a Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resourcesa Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application ResourcesDataWorks Summit
 
Hadoop Application Architectures tutorial - Strata London
Hadoop Application Architectures tutorial - Strata LondonHadoop Application Architectures tutorial - Strata London
Hadoop Application Architectures tutorial - Strata Londonhadooparchbook
 
Pivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry ServicePivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry ServicePlatform CF
 
Application Architectures with Hadoop
Application Architectures with HadoopApplication Architectures with Hadoop
Application Architectures with Hadoophadooparchbook
 
Improving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceImproving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceDataWorks Summit
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera, Inc.
 
Architecting application with Hadoop - using clickstream analytics as an example
Architecting application with Hadoop - using clickstream analytics as an exampleArchitecting application with Hadoop - using clickstream analytics as an example
Architecting application with Hadoop - using clickstream analytics as an examplehadooparchbook
 

What's hot (20)

Spark SQL versus Apache Drill: Different Tools with Different Rules
Spark SQL versus Apache Drill: Different Tools with Different RulesSpark SQL versus Apache Drill: Different Tools with Different Rules
Spark SQL versus Apache Drill: Different Tools with Different Rules
 
Architectural considerations for Hadoop Applications
Architectural considerations for Hadoop ApplicationsArchitectural considerations for Hadoop Applications
Architectural considerations for Hadoop Applications
 
An introduction to apache drill presentation
An introduction to apache drill presentationAn introduction to apache drill presentation
An introduction to apache drill presentation
 
Application architectures with Hadoop – Big Data TechCon 2014
Application architectures with Hadoop – Big Data TechCon 2014Application architectures with Hadoop – Big Data TechCon 2014
Application architectures with Hadoop – Big Data TechCon 2014
 
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
 
Hadoop2 new and noteworthy SNIA conf
Hadoop2 new and noteworthy SNIA confHadoop2 new and noteworthy SNIA conf
Hadoop2 new and noteworthy SNIA conf
 
Introduction to Data Analyst Training
Introduction to Data Analyst TrainingIntroduction to Data Analyst Training
Introduction to Data Analyst Training
 
Application Architectures with Hadoop - UK Hadoop User Group
Application Architectures with Hadoop - UK Hadoop User GroupApplication Architectures with Hadoop - UK Hadoop User Group
Application Architectures with Hadoop - UK Hadoop User Group
 
Strata NY 2014 - Architectural considerations for Hadoop applications tutorial
Strata NY 2014 - Architectural considerations for Hadoop applications tutorialStrata NY 2014 - Architectural considerations for Hadoop applications tutorial
Strata NY 2014 - Architectural considerations for Hadoop applications tutorial
 
Yarn by default (Spark on YARN)
Yarn by default (Spark on YARN)Yarn by default (Spark on YARN)
Yarn by default (Spark on YARN)
 
Application Architectures with Hadoop
Application Architectures with HadoopApplication Architectures with Hadoop
Application Architectures with Hadoop
 
a Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resourcesa Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resources
 
MPP vs Hadoop
MPP vs HadoopMPP vs Hadoop
MPP vs Hadoop
 
Hadoop Application Architectures tutorial - Strata London
Hadoop Application Architectures tutorial - Strata LondonHadoop Application Architectures tutorial - Strata London
Hadoop Application Architectures tutorial - Strata London
 
Pivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry ServicePivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry Service
 
Apache Hadoop 3
Apache Hadoop 3Apache Hadoop 3
Apache Hadoop 3
 
Application Architectures with Hadoop
Application Architectures with HadoopApplication Architectures with Hadoop
Application Architectures with Hadoop
 
Improving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceImproving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of Service
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for Hadoop
 
Architecting application with Hadoop - using clickstream analytics as an example
Architecting application with Hadoop - using clickstream analytics as an exampleArchitecting application with Hadoop - using clickstream analytics as an example
Architecting application with Hadoop - using clickstream analytics as an example
 

Similar to Big Data Certification

HadoopIntroduction.pptx
HadoopIntroduction.pptxHadoopIntroduction.pptx
HadoopIntroduction.pptxBalasundaramSr
 
HadoopIntroduction.pptx
HadoopIntroduction.pptxHadoopIntroduction.pptx
HadoopIntroduction.pptxBalasundaramSr
 
Hadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsHadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsDataWorks Summit/Hadoop Summit
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsDataWorks Summit/Hadoop Summit
 
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...VMworld
 
HDPCD Spark using Python (pyspark)
HDPCD Spark using Python (pyspark)HDPCD Spark using Python (pyspark)
HDPCD Spark using Python (pyspark)Durga Gadiraju
 
What Is Hadoop | Hadoop Tutorial For Beginners | Edureka
What Is Hadoop | Hadoop Tutorial For Beginners | EdurekaWhat Is Hadoop | Hadoop Tutorial For Beginners | Edureka
What Is Hadoop | Hadoop Tutorial For Beginners | EdurekaEdureka!
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsDataWorks Summit/Hadoop Summit
 
Apache Spark Workshop at Hadoop Summit
Apache Spark Workshop at Hadoop SummitApache Spark Workshop at Hadoop Summit
Apache Spark Workshop at Hadoop SummitSaptak Sen
 
How to obtain the Cloudera Data Engineer Certification
How to obtain the Cloudera Data Engineer CertificationHow to obtain the Cloudera Data Engineer Certification
How to obtain the Cloudera Data Engineer Certificationelephantscale
 
Building Hadoop Data Applications with Kite
Building Hadoop Data Applications with KiteBuilding Hadoop Data Applications with Kite
Building Hadoop Data Applications with Kitehuguk
 
Hd insight essentials quick view
Hd insight essentials quick viewHd insight essentials quick view
Hd insight essentials quick viewRajesh Nadipalli
 
HdInsight essentials Hadoop on Microsoft Platform
HdInsight essentials Hadoop on Microsoft PlatformHdInsight essentials Hadoop on Microsoft Platform
HdInsight essentials Hadoop on Microsoft Platformnvvrajesh
 
Hd insight essentials quick view
Hd insight essentials quick viewHd insight essentials quick view
Hd insight essentials quick viewRajesh Nadipalli
 
Hadoop and Mapreduce Certification
Hadoop and Mapreduce CertificationHadoop and Mapreduce Certification
Hadoop and Mapreduce CertificationVskills
 
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | EdurekaPig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | EdurekaEdureka!
 

Similar to Big Data Certification (20)

Robin_Hadoop
Robin_HadoopRobin_Hadoop
Robin_Hadoop
 
HadoopIntroduction.pptx
HadoopIntroduction.pptxHadoopIntroduction.pptx
HadoopIntroduction.pptx
 
HadoopIntroduction.pptx
HadoopIntroduction.pptxHadoopIntroduction.pptx
HadoopIntroduction.pptx
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
Hadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsHadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the Experts
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the experts
 
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
 
hadoop exp
hadoop exphadoop exp
hadoop exp
 
Resume_VipinKP
Resume_VipinKPResume_VipinKP
Resume_VipinKP
 
HDPCD Spark using Python (pyspark)
HDPCD Spark using Python (pyspark)HDPCD Spark using Python (pyspark)
HDPCD Spark using Python (pyspark)
 
What Is Hadoop | Hadoop Tutorial For Beginners | Edureka
What Is Hadoop | Hadoop Tutorial For Beginners | EdurekaWhat Is Hadoop | Hadoop Tutorial For Beginners | Edureka
What Is Hadoop | Hadoop Tutorial For Beginners | Edureka
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the Experts
 
Apache Spark Workshop at Hadoop Summit
Apache Spark Workshop at Hadoop SummitApache Spark Workshop at Hadoop Summit
Apache Spark Workshop at Hadoop Summit
 
How to obtain the Cloudera Data Engineer Certification
How to obtain the Cloudera Data Engineer CertificationHow to obtain the Cloudera Data Engineer Certification
How to obtain the Cloudera Data Engineer Certification
 
Building Hadoop Data Applications with Kite
Building Hadoop Data Applications with KiteBuilding Hadoop Data Applications with Kite
Building Hadoop Data Applications with Kite
 
Hd insight essentials quick view
Hd insight essentials quick viewHd insight essentials quick view
Hd insight essentials quick view
 
HdInsight essentials Hadoop on Microsoft Platform
HdInsight essentials Hadoop on Microsoft PlatformHdInsight essentials Hadoop on Microsoft Platform
HdInsight essentials Hadoop on Microsoft Platform
 
Hd insight essentials quick view
Hd insight essentials quick viewHd insight essentials quick view
Hd insight essentials quick view
 
Hadoop and Mapreduce Certification
Hadoop and Mapreduce CertificationHadoop and Mapreduce Certification
Hadoop and Mapreduce Certification
 
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | EdurekaPig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
 

More from Adam Doyle

Data Engineering Roles
Data Engineering RolesData Engineering Roles
Data Engineering RolesAdam Doyle
 
Managed Cluster Services
Managed Cluster ServicesManaged Cluster Services
Managed Cluster ServicesAdam Doyle
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architectureAdam Doyle
 
Great Expectations Presentation
Great Expectations PresentationGreat Expectations Presentation
Great Expectations PresentationAdam Doyle
 
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflowMay 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflowAdam Doyle
 
Automate your data flows with Apache NIFI
Automate your data flows with Apache NIFIAutomate your data flows with Apache NIFI
Automate your data flows with Apache NIFIAdam Doyle
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAAdam Doyle
 
Localized Hadoop Development
Localized Hadoop DevelopmentLocalized Hadoop Development
Localized Hadoop DevelopmentAdam Doyle
 
The new big data
The new big dataThe new big data
The new big dataAdam Doyle
 
Feature store Overview St. Louis Big Data IDEA Meetup aug 2020
Feature store Overview   St. Louis Big Data IDEA Meetup aug 2020Feature store Overview   St. Louis Big Data IDEA Meetup aug 2020
Feature store Overview St. Louis Big Data IDEA Meetup aug 2020Adam Doyle
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleAdam Doyle
 
Operationalizing Data Science St. Louis Big Data IDEA
Operationalizing Data Science St. Louis Big Data IDEAOperationalizing Data Science St. Louis Big Data IDEA
Operationalizing Data Science St. Louis Big Data IDEAAdam Doyle
 
Retooling on the Modern Data and Analytics Tech Stack
Retooling on the Modern Data and Analytics Tech StackRetooling on the Modern Data and Analytics Tech Stack
Retooling on the Modern Data and Analytics Tech StackAdam Doyle
 
Stl meetup cloudera platform - january 2020
Stl meetup   cloudera platform  - january 2020Stl meetup   cloudera platform  - january 2020
Stl meetup cloudera platform - january 2020Adam Doyle
 
How stlrda does data
How stlrda does dataHow stlrda does data
How stlrda does dataAdam Doyle
 
Tailoring machine learning practices to support prescriptive analytics
Tailoring machine learning practices to support prescriptive analyticsTailoring machine learning practices to support prescriptive analytics
Tailoring machine learning practices to support prescriptive analyticsAdam Doyle
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingAdam Doyle
 
Big Data IDEA 101 2019
Big Data IDEA 101 2019Big Data IDEA 101 2019
Big Data IDEA 101 2019Adam Doyle
 
Data Engineering and the Data Science Lifecycle
Data Engineering and the Data Science LifecycleData Engineering and the Data Science Lifecycle
Data Engineering and the Data Science LifecycleAdam Doyle
 

More from Adam Doyle (20)

ML Ops.pptx
ML Ops.pptxML Ops.pptx
ML Ops.pptx
 
Data Engineering Roles
Data Engineering RolesData Engineering Roles
Data Engineering Roles
 
Managed Cluster Services
Managed Cluster ServicesManaged Cluster Services
Managed Cluster Services
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
Great Expectations Presentation
Great Expectations PresentationGreat Expectations Presentation
Great Expectations Presentation
 
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflowMay 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
 
Automate your data flows with Apache NIFI
Automate your data flows with Apache NIFIAutomate your data flows with Apache NIFI
Automate your data flows with Apache NIFI
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEA
 
Localized Hadoop Development
Localized Hadoop DevelopmentLocalized Hadoop Development
Localized Hadoop Development
 
The new big data
The new big dataThe new big data
The new big data
 
Feature store Overview St. Louis Big Data IDEA Meetup aug 2020
Feature store Overview   St. Louis Big Data IDEA Meetup aug 2020Feature store Overview   St. Louis Big Data IDEA Meetup aug 2020
Feature store Overview St. Louis Big Data IDEA Meetup aug 2020
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at Scale
 
Operationalizing Data Science St. Louis Big Data IDEA
Operationalizing Data Science St. Louis Big Data IDEAOperationalizing Data Science St. Louis Big Data IDEA
Operationalizing Data Science St. Louis Big Data IDEA
 
Retooling on the Modern Data and Analytics Tech Stack
Retooling on the Modern Data and Analytics Tech StackRetooling on the Modern Data and Analytics Tech Stack
Retooling on the Modern Data and Analytics Tech Stack
 
Stl meetup cloudera platform - january 2020
Stl meetup   cloudera platform  - january 2020Stl meetup   cloudera platform  - january 2020
Stl meetup cloudera platform - january 2020
 
How stlrda does data
How stlrda does dataHow stlrda does data
How stlrda does data
 
Tailoring machine learning practices to support prescriptive analytics
Tailoring machine learning practices to support prescriptive analyticsTailoring machine learning practices to support prescriptive analytics
Tailoring machine learning practices to support prescriptive analytics
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-making
 
Big Data IDEA 101 2019
Big Data IDEA 101 2019Big Data IDEA 101 2019
Big Data IDEA 101 2019
 
Data Engineering and the Data Science Lifecycle
Data Engineering and the Data Science LifecycleData Engineering and the Data Science Lifecycle
Data Engineering and the Data Science Lifecycle
 

Recently uploaded

Boost Your Job Search by Volunteering 2024
Boost Your Job Search by Volunteering 2024Boost Your Job Search by Volunteering 2024
Boost Your Job Search by Volunteering 2024Bruce Bennett
 
ACR-2ND-QUARTER-PORTFOLIO-DAY-JRNHS (1).docx
ACR-2ND-QUARTER-PORTFOLIO-DAY-JRNHS (1).docxACR-2ND-QUARTER-PORTFOLIO-DAY-JRNHS (1).docx
ACR-2ND-QUARTER-PORTFOLIO-DAY-JRNHS (1).docxJoeMarieVelasquez1
 
LinkedIn Strategic Guidelines February 2024
LinkedIn Strategic Guidelines February 2024LinkedIn Strategic Guidelines February 2024
LinkedIn Strategic Guidelines February 2024Bruce Bennett
 
fs-1-report-chapter-5 Assessment and reporting.pptx
fs-1-report-chapter-5 Assessment and reporting.pptxfs-1-report-chapter-5 Assessment and reporting.pptx
fs-1-report-chapter-5 Assessment and reporting.pptxAPALESJUNNAROSES
 
Cover Letter Examples For Biotechnology Job
Cover Letter Examples For Biotechnology JobCover Letter Examples For Biotechnology Job
Cover Letter Examples For Biotechnology JobLatoya White
 
NinaIveyIshokir.Resume.CV.February2024.docx
NinaIveyIshokir.Resume.CV.February2024.docxNinaIveyIshokir.Resume.CV.February2024.docx
NinaIveyIshokir.Resume.CV.February2024.docxNinaIshokir
 
Cyber security course in kerala | C|PENT | Blitz Academy
Cyber security course in kerala | C|PENT | Blitz AcademyCyber security course in kerala | C|PENT | Blitz Academy
Cyber security course in kerala | C|PENT | Blitz Academyananthakrishnansblit
 
Grade 12 WORK IMMERSION Work Ethics.pptx
Grade 12 WORK IMMERSION Work Ethics.pptxGrade 12 WORK IMMERSION Work Ethics.pptx
Grade 12 WORK IMMERSION Work Ethics.pptxHernilynManatad
 
Work Immersion SAFETY IN THE WORKPLACE PPT.pptx
Work Immersion SAFETY IN THE WORKPLACE PPT.pptxWork Immersion SAFETY IN THE WORKPLACE PPT.pptx
Work Immersion SAFETY IN THE WORKPLACE PPT.pptxHernilynManatad
 
(17-02-24) UI and UX Designers in the UK!
(17-02-24) UI and UX Designers in the UK!(17-02-24) UI and UX Designers in the UK!
(17-02-24) UI and UX Designers in the UK!The Knowledge Academy
 
Dr Jay Prakash Singh, Associate Professor Department of Education Netaji Subh...
Dr Jay Prakash Singh, Associate Professor Department of Education Netaji Subh...Dr Jay Prakash Singh, Associate Professor Department of Education Netaji Subh...
Dr Jay Prakash Singh, Associate Professor Department of Education Netaji Subh...JAYPRAKASHSINGH83
 
SELF INTRODUCTION about S.MOHAMED FAIZUL
SELF INTRODUCTION about S.MOHAMED FAIZULSELF INTRODUCTION about S.MOHAMED FAIZUL
SELF INTRODUCTION about S.MOHAMED FAIZULMohamedFaizul2
 
John Hart Havertown, PA: A Legacy of Academic Excellence, Leadership Prowess,...
John Hart Havertown, PA: A Legacy of Academic Excellence, Leadership Prowess,...John Hart Havertown, PA: A Legacy of Academic Excellence, Leadership Prowess,...
John Hart Havertown, PA: A Legacy of Academic Excellence, Leadership Prowess,...johnharthavertown
 
SELF INTRODUCTION - SANGEETHA.S AD21047
SELF INTRODUCTION - SANGEETHA.S AD21047SELF INTRODUCTION - SANGEETHA.S AD21047
SELF INTRODUCTION - SANGEETHA.S AD21047sangeethasiva2804
 
Mindshare Trends 2024_0.pdf
Mindshare Trends 2024_0.pdfMindshare Trends 2024_0.pdf
Mindshare Trends 2024_0.pdfalnishyia1
 
Module 1_Principles of Emergency Care.pptx
Module 1_Principles of Emergency Care.pptxModule 1_Principles of Emergency Care.pptx
Module 1_Principles of Emergency Care.pptxRachealSantos1
 
Grade 12 WORK IMMERSION Work Ethics.pptx
Grade 12 WORK IMMERSION Work Ethics.pptxGrade 12 WORK IMMERSION Work Ethics.pptx
Grade 12 WORK IMMERSION Work Ethics.pptxHernilynManatad
 
122. Reviewer Certificate in BP International
122. Reviewer Certificate in BP International122. Reviewer Certificate in BP International
122. Reviewer Certificate in BP InternationalManu Mitra
 
Application of Remote Sensing and GIS Technology in Agriculture by SOUMIQUE A...
Application of Remote Sensing and GIS Technology in Agriculture by SOUMIQUE A...Application of Remote Sensing and GIS Technology in Agriculture by SOUMIQUE A...
Application of Remote Sensing and GIS Technology in Agriculture by SOUMIQUE A...SOUMIQUE AHAMED
 

Recently uploaded (19)

Boost Your Job Search by Volunteering 2024
Boost Your Job Search by Volunteering 2024Boost Your Job Search by Volunteering 2024
Boost Your Job Search by Volunteering 2024
 
ACR-2ND-QUARTER-PORTFOLIO-DAY-JRNHS (1).docx
ACR-2ND-QUARTER-PORTFOLIO-DAY-JRNHS (1).docxACR-2ND-QUARTER-PORTFOLIO-DAY-JRNHS (1).docx

Big Data Certification

  • 1. Confidential and Proprietary to Daugherty Business Solutions Big Data Certifications Jan 2018
  • 2. Introduction
      • Adam Doyle
          – Co-Organizer, St. Louis HUG
          – Big Data Community Lead, Daugherty Business Solutions
          – Formerly Lead Big Data developer at Mercy
          – Speaker at local and national Big Data conferences
      • Adam Riggs
          – Sr. Recruiter at Daugherty Business
          – 6 years of technical recruiting experience specializing in application development and Big Data.
          – I’ve hired candidates for most of the local Fortune 500 companies including Wells Fargo, Express Scripts, MasterCard, Charter, and Anheuser-Busch.
  • 3. Agenda
      • Why get certified?
      • What certifications are available?
      • What is the certification test like?
      • Now what?
  • 4. Why get certified?
      • For consulting services organizations, there are reasons to get certified
          – Partnering with vendors
          – Discounts
          – Publicity
      • For companies, there are also reasons to get certified
          – Publicity
          – Recognition
  • 5. Why get certified? Potential increase of 7-9% with Hadoop certification in. The big data market is still immature. Few companies have defined their big data strategy, and hiring has been slow due to the lack of qualified candidates. As this matures, the value of a certification will increase, in my opinion. [Charts: Big Data Supply and Demand; Certification Demand]
  • 6. Evaluating Certifications
      • What do I want out of the certification?
      • How does my current employer value the credential, and do they understand the business case?
      • Does this credential complement my experience or add depth to my skills?
      • Do the opportunities I’m seeking require this credential?
      • What is my plan to continue education after this certification?
  • 7. Certification Coverage
      Certifications (columns): CCP Data Engineer, HDPCD, HDPCD-Java, MCHD, MCHBD, CCA Data Analyst, HCA, MCDA, AWSSA, CCDK, HDPCD-Spark, MCSD, CCA Spark and Hadoop Developer, DCD, CCA Administrator, HDPCA, MCCA, CCAK
      Topic (one x per certification covering it):
      Sqoop x x x
      HDFS x x x x x
      Hive x x x x x x
      Impala x x x
      Hive DDL x x x x
      Hive QL x x x x
      Spark x x x x
      Spark SQL x x x
      Spark MLlib x x
      Spark Streaming x x
      Python x x x
      Java x x x
      Scala x x x x
      Cloudera Manager x x
      Hadoop Admin Utils x x
      Flume x x
      Avro x
      Parquet x
      Oozie x
      Pig x x x x
      Tez x
      HDP Ambari x
      Knox x
      Ranger x
      MapReduce x x
      MapR Admin x
      MapR FS x
      MapR DB x
      HBase x
      Drill x
      AWS x
      Kafka x
      Kafka Admin x
  • 8. Hadoop Developer
  • 9. Administrator
  • 10. Data Analyst
  • 11. Spark Developer
  • 12. Other certifications
  • 13. Exam Objectives Example: Cloudera Certified Professional
      • Data Ingest (HDFS, Sqoop, Flume)
          – Import and export from RDBMS
          – Ingest streaming data
          – Use HDFS commands
      • Transform, Stage, Store (Hive, Pig)
          – Convert formats
          – Use compression
          – Transform values
          – Purge bad values, deduplication
          – Denormalize data
          – Evolve an Avro/Parquet schema
          – Partition data
          – Tune for query performance
      • Data Analysis (Hive)
          – Aggregate queries, statistics
          – Filter
          – Rank/sort data
          – Join data sets
          – Create Hive table from existing data on HDFS
      • Workflow (Oozie)
          – Linear workflow
          – Branching workflow
          – Schedule workflow
  • 14. Exam Objectives Example: Hortonworks HDPCD Certification
      • Data Ingestion (HDFS, Sqoop, Flume)
          – Import data from a table in a relational database into HDFS
          – Import the results of a query from a relational database into HDFS
          – Import a table from a relational database into a new or existing Hive table
          – Insert or update data from HDFS into a table in a relational database
          – Given a Flume configuration file, start a Flume agent
          – Given a configured sink and source, configure a Flume memory channel with a specified capacity
      • Data Transformation (Pig)
          – Write and execute a Pig script
          – Load data into a Pig relation without a schema
          – Load data into a Pig relation with a schema
          – Load data from a Hive table into a Pig relation
          – Use Pig to transform data into a specified format
          – Transform data to match a given Hive schema
          – Group the data of one or more Pig relations
          – Use Pig to remove records with null values from a relation
          – Store the data from a Pig relation into a folder in HDFS
          – Store the data from a Pig relation into a Hive table
          – Sort the output of a Pig relation
          – Remove the duplicate tuples of a Pig relation
          – Specify the number of reduce tasks for a Pig MapReduce job
          – Join two datasets using Pig
          – Perform a replicated join using Pig
          – Run a Pig job using Tez
          – Within a Pig script, register a JAR file of User Defined Functions
          – Within a Pig script, define an alias for a User Defined Function
          – Within a Pig script, invoke a User Defined Function
      • Data Analysis (Hive)
          – Write and execute a Hive query
          – Define a Hive-managed table
          – Define a Hive external table
          – Define a partitioned Hive table
          – Define a bucketed Hive table
          – Define a Hive table from a select query
          – Define a Hive table that uses the ORCFile format
          – Create a new ORCFile table from the data in an existing non-ORCFile Hive table
          – Specify the storage format of a Hive table
          – Specify the delimiter of a Hive table
          – Load data into a Hive table from a local directory
          – Load data into a Hive table from an HDFS directory
          – Load data into a Hive table as the result of a query
          – Load a compressed data file into a Hive table
          – Update a row in a Hive table
          – Delete a row from a Hive table
          – Insert a new row into a Hive table
          – Join two Hive tables
          – Run a Hive query using Tez
          – Run a Hive query using vectorization
          – Output the execution plan for a Hive query
          – Use a subquery within a Hive query
          – Output data from a Hive query that is totally ordered across multiple reducers
          – Set a Hadoop or Hive configuration property from within a Hive query
  • 15. Exam Objectives Example: HDPCD: Spark
      • Core Spark
          – Write a Spark Core application in Python or Scala
          – Initialize a Spark application
          – Run a Spark job on YARN
          – Create an RDD
          – Create an RDD from a file or directory in HDFS
          – Persist an RDD in memory or on disk
          – Perform Spark transformations on an RDD
          – Perform Spark actions on an RDD
          – Create and use broadcast variables and accumulators
          – Configure Spark properties
      • Spark SQL
          – Create Spark DataFrames from an existing RDD
          – Perform operations on a DataFrame
          – Write a Spark SQL application
          – Use Hive with ORC from Spark SQL
          – Write a Spark SQL application that reads and writes data from Hive tables
  • 16. Practice Exams
      • Some vendors have practice exams and study guides
          – Hortonworks: https://2xbbhjxc6wk3v21p62t8n4d4-wpengine.netdna-ssl.com/wp-content/uploads/2015/02/HDPCD-PracticeExamGuide1.pdf
          – MapR: http://learn.mapr.com/mapr-certified-data-analyst-mcda-study-guide?_ga=2.160653181.131562425.1514478687-888544134.1514478687
  • 17. Taking the test
      • Caveat
      • Register at examslocal.com
      • Cost
          – Cloudera $400
          – MapR $250
          – Hortonworks $250
      • Remotely proctored
      • ~4 hour time limit
  • 18. Preparing your test space
  • 19. Available documentation
      • Exam delivery and cluster information: CCP Data Engineer Exam (DE575) is a remote-proctored exam available anywhere, anytime. It is a hands-on, practical exam using Cloudera technologies. Each user is given their own CDH cluster (currently 5.10.1) pre-loaded with Spark, Impala, Crunch, Hive, Pig, Sqoop, Kafka, Flume, Kite, Hue, Oozie, DataFu, and many others (see a full list). In addition, the cluster also comes with Python 2.7 and 3.4, Perl 5.16, Elephant Bird, Cascading 2.6, Brickhouse, Hive Swarm, Scala 2.11, Scalding, IDEA, Sublime, Eclipse, and NetBeans.
      • Documentation available online during the exam: Cloudera Product Documentation, Apache Hadoop, Apache Hive, Apache Impala (Incubating), Apache Sqoop, Spark, Apache Crunch, Apache Pig, Kite SDK, Apache Avro, Apache Parquet, Cloudera HUE, Apache Oozie, Apache Flume, DataFu, JDK 7 API Docs, Python 2.7 Documentation, Python 3.4 Documentation, Scala Documentation
  • 20. Getting your results
      • In my experience, the results have come back the same day.
      • The results include a pass/fail for the test and a pass/fail marker for each problem. If you failed, there is a high-level description of why you failed.
  • 21. So you’ve got your certification, now what?
      • Share the accomplishment with your current employer and explain the value add.
          – It’s annual assessment time for many employers, and it’s an excellent time to discuss the investment in your career with your employer.
      • Update your LinkedIn or other sites (GitHub, Stack Overflow) to let your professional network know.
          – You can update your title or add the credential to your name and skills summary.
      • Keep investing in your career. Have a plan for your professional development beyond certifications (user groups, trade shows, conferences, open-source projects, hackathons).
      • If you are seeking the certification to increase salary, understand the market and how to present the value add to employers.
          – Leverage online resources like salary.com or payscale.com to get market averages.
          – Good recruiters should also be able to assess your market value after you achieve certifications, and most are willing to conduct resume reviews with you.
          – Discuss your reasons for achieving the certification and how it adds to your value.
      • Do not let the certifications expire.
  • 22. Join Our Team. Contact: Your.name@daugherty.com
  • 23. Confidential and Proprietary to Daugherty Business Solutions
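
The "Purge bad values, deduplication" objective listed on the Cloudera exam objectives slide can be illustrated with a minimal sketch. It uses plain Python rather than Hive or Pig (a substitution made only so the example is self-contained and runnable anywhere), and the field names `id` and `email` are hypothetical, not taken from any exam material.

```python
from typing import Dict, Iterable, List, Optional

Record = Dict[str, Optional[str]]


def purge_and_deduplicate(records: Iterable[Record], key_fields: List[str]) -> List[Record]:
    """Drop records with missing key values, then keep the first record per key."""
    seen = set()
    cleaned = []
    for record in records:
        key = tuple(record.get(field) for field in key_fields)
        # Purge: skip rows where any key field is missing or blank.
        if any(value is None or str(value).strip() == "" for value in key):
            continue
        # Deduplicate: skip rows whose key has already been seen.
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


# Hypothetical sample data for illustration only.
rows = [
    {"id": "1", "email": "a@example.com"},
    {"id": "1", "email": "a@example.com"},  # duplicate
    {"id": "", "email": "b@example.com"},   # blank id
    {"id": "2", "email": None},             # missing email
    {"id": "3", "email": "c@example.com"},
]
print(purge_and_deduplicate(rows, ["id", "email"]))
```

On the exam itself the same idea would be expressed with the cluster's tools, for example a filter for null values followed by a distinct operation.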