© Cloudera, Inc. All rights reserved.
Road to Cloudera certification
The demand for skills is high and Hadoop is the future. Customers
cannot afford to move slowly in staffing their Big Data projects.
Customers are building plans to ensure projects are staffed with
skilled employees, and supported by a qualified services provider.
Job Trends from Indeed.com
What are you most concerned about
when it comes to your readiness for big
data and Hadoop?
Cloudera MDP webinar poll results, July 2016
Why Cloudera training?
Aligned to best practices and the pace of change
1 Broadest range of courses
Learning paths for Developer, Admin, Analyst
2 Most experienced instructors
More than 50,000 trained since 2009
3 Leader in certification
Over 12,000 accredited Cloudera professionals
4 Trusted source for training
100,000+ people have attended online courses
5 State of the art curriculum
Courses updated as Hadoop evolves
6 Widest geographic coverage
Most classes offered: 50 cities worldwide plus online
7 Most relevant platform & community
CDH deployed more than all other distributions combined
8 Depth of training material
Hands-on labs and VMs support live instruction
9 Ongoing learning
Video tutorials and e-learning complement training
10 Commitment to big data education
University partnerships to teach Hadoop in colleges
What is available from Cloudera University?
• Private training: Course delivered at location of customer choice to internal audience
• Public training: Courses regularly scheduled around the globe. Schedule available on web
• Virtual training: Live training accessed via the internet; available for public and private courses
• OnDemand training: Pre-recorded lecture with identical content/exercises as live training options
• Certification: Rigorously developed and meaningful bodies of knowledge
Suggested Cloudera University curricula
Developers
• Python/Scala Training
• Developer for Spark and Hadoop
• CCA: Spark and Hadoop Developer
• Spark ML & Kafka modules
• Topic specific training (Search, HBase)
• Hands on practice
• CCP: Data Engineer
Administrators
• Cloudera Administration training
• CCA: Administrator
Data Analysts/Data Scientists
• Data Analyst: Using Hive, Pig & Impala
• CCA: Data Analyst
• Cloudera Data Science
Let’s get certified!
Certification Tiers
• CCA (Cloudera Certified Associate)
  • Data Analyst, Admin and Spark & Hadoop Developer
  • Basic exam – but it's a complex subject area
  • Maps to curriculum
• CCP (Cloudera Certified Professional)
  • Data Engineer
  • Combination of Developer, Analyst and Big Data services
  • Mastery level – beyond the introduction course
  • Real world experience
Exam format: CCA and CCP certification
• Not multiple choice
• Hands-on, practical exams similar to student exercises
• Home based, no testing centres
• Proctored through ExamsLocal.com
• Webcam and desktop recorded and monitored
• No papers / phone / drinks on desk / no talking
• AWS cloud-based cluster
• Guacamole remote desktop in web browser
• No Internet search during exam – only local documentation
Sample CCA question
• Instructions
• Connect to the MySQL database on the cluster using Sqoop and import all of the
data from the customer table into HDFS. The result must be in comma-delimited
text format and placed in the HDFS directory /user/cert/solution3.
• Data Description
• A MySQL instance is running on the gateway node. In that instance, you will find
a table that contains twenty-five million (25,000,000) rows of customer data.
MySQL database information:
Installation: On the cluster node gateway
Table name: customer
Username: cloudera
Password: cloudera
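One possible way to approach the sample question above is a single Sqoop import. The sketch below only assembles the command line in Python so that each piece is easy to inspect; it does not run Sqoop. The database name `customerdb` is a hypothetical placeholder (the question does not name the database), and the host `gateway` is taken from the data description. The options used (`--connect`, `--username`, `--password`, `--table`, `--target-dir`, `--as-textfile`, `--fields-terminated-by`) are standard Sqoop 1 import arguments.

```python
import shlex


def build_sqoop_import(database: str) -> list:
    """Assemble a Sqoop import command for the sample question.

    `database` is an assumption: the question does not name the MySQL database.
    """
    return [
        "sqoop", "import",
        "--connect", f"jdbc:mysql://gateway/{database}",  # gateway hostname, not localhost
        "--username", "cloudera",
        "--password", "cloudera",
        "--table", "customer",
        "--target-dir", "/user/cert/solution3",  # exact HDFS output directory
        "--as-textfile",                         # plain text output
        "--fields-terminated-by", ",",           # comma-delimited fields
    ]


if __name__ == "__main__":
    # "customerdb" is a hypothetical database name used only for illustration.
    print(shlex.join(build_sqoop_import("customerdb")))
```

Building the command as a list keeps every flag next to its value, which makes it easier to double check the output directory and delimiter against the question before running it.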
Sample CCP Data Engineer question #1
Instructions
• Dualcore Inc. is a leading electronics retailer. All of their customer data is in a
relational database. Your task is to ingest all this data into their Hadoop
cluster in the proper file format and compression for their needs.
• Dualcore has a number of requirements for this data. It must be stored in a
binary file format. They will keep this data for a minimum of ten years, so
select a format that supports access from multiple programming languages
and backward compatibility if the schema ever changes. They also require
that the data be stored in a compressed format. The data is queried
regularly, so choose a compression codec that is fastest for compression and
decompression and included with CDH.
Data Description ...
Sample CCP Data Engineer question #2
Instructions
LoudAcre Mobile is a mobile phone service provider that is moving a portion of their
customer analytics workload to Hadoop. Before they can use their customer data,
they want you to clean it and make it consistent.
Errors were found while looking at the customer records. Unfortunately, different input
methods wrote date fields in different formats. Your task is to standardize these
date fields into a consistent format.
Data Description ...
1943233 Chrisopher Rodrigez Jan 11, 1980
8989022 John Birchall 6/7/1967
2933321 Thomas Stewart 08/22/54
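The clean-up task above can be sketched in plain Python to show the logic; in the exam the same idea would be expressed with the tools on the cluster. Several things here are assumptions, not part of the question: the target format (ISO `YYYY-MM-DD`), month/day ordering for the slash-separated dates, the list of input formats (only the three visible in the sample rows), and the `max_year` cut-off used to decide the century of two-digit years.

```python
from datetime import datetime

# Formats seen in the three sample rows; real data may need more entries.
KNOWN_FORMATS = ("%b %d, %Y", "%m/%d/%Y", "%m/%d/%y")


def standardize_date(raw: str, max_year: int = 2017) -> str:
    """Return `raw` as YYYY-MM-DD, trying each known format in turn.

    `max_year` is an assumption: a two-digit year that parses to a later
    year is moved back one century (so "54" becomes 1954, not 2054).
    """
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue  # not this format, try the next one
        if fmt.endswith("%y") and parsed.year > max_year:
            parsed = parsed.replace(year=parsed.year - 100)
        return parsed.strftime("%Y-%m-%d")
    raise ValueError(f"unrecognised date format: {raw!r}")


if __name__ == "__main__":
    for sample in ("Jan 11, 1980", "6/7/1967", "08/22/54"):
        print(sample, "->", standardize_date(sample))
```

Raising on an unrecognised value, rather than passing it through, makes any fourth date format in the data visible immediately.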
How to Study for CCA and CCP certification
• Set aside 2 to 3 days of dedicated study time for certification
• These certification tests are not easy
• Review the study points on the certification webpage
• Study only from the documentation linked for the open-book exam
• No Google, Cloudera training material or favourite tutorial
• Practice with the CDH and Spark software versions found in the test
• Be familiar with Hive, the Impala shell, the basic Linux shell and the Hue UI
Practice all of the study points
• Stop when confident you know the topic by practising it
• Ensure you know the syntax and have experienced the gotchas
• Read all the documentation concerned with the study topic
• Know the documented examples as your copy/paste go-to
• Know where to look up parameters, config and API docs
• Be able to adapt to different scenarios or link topics together
• Questions have multiple parts and dependencies
Taking the exam
• CCA Data Analyst and Developer: 2 hours, 9 questions - 13 mins per question
• CCA Admin: 2 hours, 10 questions - 12 mins per question
• CCP Data Engineer: 4 hours, 7 questions - 34 mins per question
• Some questions are done in 5 mins; some take 20+ or 45+ mins
• Questions are weighted in value and can have multiple parts
• Risk of running out of time, which means:
  • Can’t complete the easy questions to pass
  • Can’t check your answers to fix any problems to pass
• Stop any question after 20 mins and come back at the end
• Skip any question that looks too hard after a quick skim read and come back
• Finished? Always double check your answers
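The per-question figures above are simple averages of exam length over question count. A minimal sketch of that arithmetic, using only the durations and question counts quoted on this slide:

```python
def minutes_per_question(hours: float, questions: int) -> float:
    """Average minutes available per question."""
    return hours * 60 / questions


# (hours, questions) as quoted on the slide
EXAMS = {
    "CCA Data Analyst / Developer": (2, 9),
    "CCA Admin": (2, 10),
    "CCP Data Engineer": (4, 7),
}

if __name__ == "__main__":
    for name, (hours, questions) in EXAMS.items():
        average = minutes_per_question(hours, questions)
        print(f"{name}: {average:.1f} min per question")
```

Because questions are weighted and vary widely in length, the average is a budget to track against rather than a limit for each question.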
Common certification exam problems
• Review the certification FAQ for common problems and questions marked as wrong
• https://www.cloudera.com/more/training/certification/faq.html
• Remote desktop or network too slow!
  • Take the exam at off-peak times. Use the command-line shell, not the Hue GUI.
• Unfamiliar with the question's topic: time is wasted reading docs during the exam. Study!
• Don’t use localhost; use the correct gateway/master/worker hostname instead
• Rushing and stress lead to mistakes:
  • Misinterpreted what the question asked.
  • Are directory/file/property/column names spelled correctly?
  • Is the output data format 100% correct? Check that column order, data types and null values are what was asked. Don’t assume.
  • Notice any errors in logs or console when running? Scroll back and check!
Tips for studying CCA Admin
• Know the Cloudera Manager UI and how to search properties
• Breadcrumbs, instances, safety valve advanced settings
• Don’t forget to apply settings or restart services, and don’t break the cluster!
• Practice topics not in the admin course but in the exam:
• Sentry setup, load balancer, log redaction and encrypted zones
• Practice all the hdfs dfs and dfsadmin commands
• Practice setting up services and service instances
• Practice troubleshooting and fixing common problem applications
• Know your way around the different log files
Tips for studying Data Analyst certification
• Study how to use regex to manipulate strings well
• SQL subqueries need a temp table name (alias); don’t forget it
• Understand the relationship between the Sqoop warehouse dir and target dir
• Practice Sqoop help to quickly view and use parameters
• Practice window analytic functions - not easy to do
• Practice type conversions for Hive and Impala
• Practice how to create partitioned/bucketed tables – lots of syntax
• Copy and paste directly from the question to quickly create the table
• Practice using the command line: beeline and the Impala shell
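The warehouse dir and target dir relationship mentioned above can be summarised in a few lines. This is a simplified model of Sqoop's documented behaviour for a single-table import, not Sqoop code: `--target-dir` is used exactly as given, `--warehouse-dir` is a parent directory to which the table name is appended, the two cannot be combined, and with neither option the data lands in a directory named after the table under the user's HDFS home directory. The function name and the example paths in the usage below are hypothetical.

```python
from typing import Optional


def sqoop_output_dir(table: str, user: str,
                     target_dir: Optional[str] = None,
                     warehouse_dir: Optional[str] = None) -> str:
    """Model where a single-table Sqoop import writes its files in HDFS."""
    if target_dir and warehouse_dir:
        raise ValueError("--target-dir and --warehouse-dir cannot be combined")
    if target_dir:
        return target_dir  # used exactly as given
    if warehouse_dir:
        return f"{warehouse_dir.rstrip('/')}/{table}"  # parent dir + table name
    return f"/user/{user}/{table}"  # default: under the HDFS home directory


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(sqoop_output_dir("customer", "cert", target_dir="/data/out"))
    print(sqoop_output_dir("customer", "cert", warehouse_dir="/data/warehouse"))
    print(sqoop_output_dir("customer", "cert"))
```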
Tips for studying CCA Spark and Hadoop
• No need to be an expert in Scala or Python coding.
• Only testing Spark knowledge.
• Practice Sqoop, the hdfs dfs command line and your SQL
• Certification has not yet been updated to Spark 2.0 (uses 1.6)
• New students may not be familiar with Spark 1.6. Minor differences.
• Read and practice using the Spark documentation
• Start the 1.6 Spark shell with pyspark and spark-shell, not spark2-shell or pyspark2
Tips for studying CCP Data Engineer
• Study non-core topics found outside the training course material
• Ignore what is not Cloudera supported
• Oozie features in one third of the test!
• See the gethue.com website for short Oozie UI tutorials
• How to get Oozie to run on your small default cluster:
  • Adjust container memory so you can run multiple containers
  • Increase NodeManager max container size to 7 GB
  • Limit container memory max size to 3 GB and 1 CPU
  • Result on a dual-core, 8 GB, 3x worker node cluster: 6 containers.
• Currently Spark 1.6 not Spark 2.0 (will be updated in the future)
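The 6-container result in the Oozie sizing tip above follows from integer division. The sketch below is a simplified model that assumes every container is requested at its maximum size and that both memory and CPU are counted; the actual YARN scheduler may account for resources differently. The two memory figures most likely correspond to the YARN properties `yarn.nodemanager.resource.memory-mb` and `yarn.scheduler.maximum-allocation-mb`, though the slide does not name them.

```python
def containers_per_cluster(node_memory_gb: int, container_max_gb: int,
                           node_vcores: int, container_vcores: int,
                           worker_nodes: int) -> int:
    """Containers that fit cluster-wide if each one uses its maximum size."""
    by_memory = node_memory_gb // container_max_gb  # limited by NodeManager memory
    by_cpu = node_vcores // container_vcores        # limited by cores per node
    return min(by_memory, by_cpu) * worker_nodes


if __name__ == "__main__":
    # 7 GB per NodeManager, 3 GB and 1 CPU per container,
    # dual-core worker nodes, 3 workers (figures from the slide).
    print(containers_per_cluster(7, 3, 2, 1, 3))
```

With these figures each worker fits two containers by memory (7 // 3) and two by CPU, giving six across three workers, which leaves room for an Oozie launcher and the job it starts to run at the same time.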
Qualify for free certification
• Take part in a Data Analyst, Developer or Administrator Public class to
receive a free certification exam in the given discipline
• Valid until the end of April
Thank you

More Related Content

What's hot

Apache Hadoop 3
Apache Hadoop 3Apache Hadoop 3
Apache Hadoop 3
Cloudera, Inc.
 
One Hadoop, Multiple Clouds - NYC Big Data Meetup
One Hadoop, Multiple Clouds - NYC Big Data MeetupOne Hadoop, Multiple Clouds - NYC Big Data Meetup
One Hadoop, Multiple Clouds - NYC Big Data Meetup
Andrei Savu
 
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Cloudera, Inc.
 
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
Cloudera, Inc.
 
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for productionFaster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Cloudera, Inc.
 
Introduction to Machine Learning on Apache Spark MLlib by Juliet Hougland, Se...
Introduction to Machine Learning on Apache Spark MLlib by Juliet Hougland, Se...Introduction to Machine Learning on Apache Spark MLlib by Juliet Hougland, Se...
Introduction to Machine Learning on Apache Spark MLlib by Juliet Hougland, Se...
Cloudera, Inc.
 
Data Science and CDSW
Data Science and CDSWData Science and CDSW
Data Science and CDSW
Jason Hubbard
 
Part 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduPart 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache Kudu
Cloudera, Inc.
 
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoProExtreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Cloudera, Inc.
 
Cloudera Showcase: SQL-on-Hadoop
Cloudera Showcase: SQL-on-HadoopCloudera Showcase: SQL-on-Hadoop
Cloudera Showcase: SQL-on-Hadoop
Cloudera, Inc.
 
Multi-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BTMulti-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BT
Cloudera, Inc.
 
Data Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache HadoopData Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache Hadoop
Cloudera, Inc.
 
How to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesHow to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issues
Cloudera, Inc.
 
Risk Management for Data: Secured and Governed
Risk Management for Data: Secured and GovernedRisk Management for Data: Secured and Governed
Risk Management for Data: Secured and Governed
Cloudera, Inc.
 
Hadoop on Cloud: Why and How?
Hadoop on Cloud: Why and How?Hadoop on Cloud: Why and How?
Hadoop on Cloud: Why and How?
Cloudera, Inc.
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
Cloudera, Inc.
 
Solr consistency and recovery internals
Solr consistency and recovery internalsSolr consistency and recovery internals
Solr consistency and recovery internals
Cloudera, Inc.
 
Security implementation on hadoop
Security implementation on hadoopSecurity implementation on hadoop
Security implementation on hadoop
Wei-Chiu Chuang
 
Five Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWSFive Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWS
Cloudera, Inc.
 
Part 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to EndPart 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to End
Cloudera, Inc.
 

What's hot (20)

Apache Hadoop 3
Apache Hadoop 3Apache Hadoop 3
Apache Hadoop 3
 
One Hadoop, Multiple Clouds - NYC Big Data Meetup
One Hadoop, Multiple Clouds - NYC Big Data MeetupOne Hadoop, Multiple Clouds - NYC Big Data Meetup
One Hadoop, Multiple Clouds - NYC Big Data Meetup
 
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
 
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
 
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for productionFaster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
 
Introduction to Machine Learning on Apache Spark MLlib by Juliet Hougland, Se...
Introduction to Machine Learning on Apache Spark MLlib by Juliet Hougland, Se...Introduction to Machine Learning on Apache Spark MLlib by Juliet Hougland, Se...
Introduction to Machine Learning on Apache Spark MLlib by Juliet Hougland, Se...
 
Data Science and CDSW
Data Science and CDSWData Science and CDSW
Data Science and CDSW
 
Part 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduPart 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache Kudu
 
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoProExtreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
 
Cloudera Showcase: SQL-on-Hadoop
Cloudera Showcase: SQL-on-HadoopCloudera Showcase: SQL-on-Hadoop
Cloudera Showcase: SQL-on-Hadoop
 
Multi-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BTMulti-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BT
 
Data Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache HadoopData Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache Hadoop
 
How to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesHow to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issues
 
Risk Management for Data: Secured and Governed
Risk Management for Data: Secured and GovernedRisk Management for Data: Secured and Governed
Risk Management for Data: Secured and Governed
 
Hadoop on Cloud: Why and How?
Hadoop on Cloud: Why and How?Hadoop on Cloud: Why and How?
Hadoop on Cloud: Why and How?
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Solr consistency and recovery internals
Solr consistency and recovery internalsSolr consistency and recovery internals
Solr consistency and recovery internals
 
Security implementation on hadoop
Security implementation on hadoopSecurity implementation on hadoop
Security implementation on hadoop
 
Five Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWSFive Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWS
 
Part 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to EndPart 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to End
 

Similar to Road to Cloudera certification

Cloudera training: secure your Cloudera cluster
Cloudera training: secure your Cloudera clusterCloudera training: secure your Cloudera cluster
Cloudera training: secure your Cloudera cluster
Cloudera, Inc.
 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 

Cloudera, Inc.
 
Data Engineering Course Syllabus - WeCloudData
Data Engineering Course Syllabus - WeCloudDataData Engineering Course Syllabus - WeCloudData
Data Engineering Course Syllabus - WeCloudData
WeCloudData
 
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ DevicesDelivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
Databricks
 
Best Practices For Workflow
Best Practices For WorkflowBest Practices For Workflow
Best Practices For Workflow
Timothy Spann
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

Cloudera, Inc.
 
DevOps and Decoys How to Build a Successful Microsoft DevOps Including the Data
DevOps and Decoys  How to Build a Successful Microsoft DevOps Including the DataDevOps and Decoys  How to Build a Successful Microsoft DevOps Including the Data
DevOps and Decoys How to Build a Successful Microsoft DevOps Including the Data
Kellyn Pot'Vin-Gorman
 
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science WorkbenchNOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA DATASCIENCE
 
Introduction to Cloudera Search Training
Introduction to Cloudera Search TrainingIntroduction to Cloudera Search Training
Introduction to Cloudera Search Training
Cloudera, Inc.
 
Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...
DataWorks Summit
 
Kafka for DBAs
Kafka for DBAsKafka for DBAs
Kafka for DBAs
Gwen (Chen) Shapira
 
Hadoop applicationarchitectures
Hadoop applicationarchitecturesHadoop applicationarchitectures
Hadoop applicationarchitectures
Doug Chang
 
Databricks Partner Enablement Guide.pdf
Databricks Partner Enablement Guide.pdfDatabricks Partner Enablement Guide.pdf
Databricks Partner Enablement Guide.pdf
ssuserb74636
 
Large-Scale Data Science on Hadoop (Intel Big Data Day)
Large-Scale Data Science on Hadoop (Intel Big Data Day)Large-Scale Data Science on Hadoop (Intel Big Data Day)
Large-Scale Data Science on Hadoop (Intel Big Data Day)
Uri Laserson
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
DataWorks Summit
 
PySpark Best Practices
PySpark Best PracticesPySpark Best Practices
PySpark Best Practices
Cloudera, Inc.
 
Hadoop and Mapreduce Certification
Hadoop and Mapreduce CertificationHadoop and Mapreduce Certification
Hadoop and Mapreduce Certification
Vskills
 
Cloudera data-analyst-training
Cloudera data-analyst-trainingCloudera data-analyst-training
Cloudera data-analyst-training
Starman Anoa
 
Aws certified: the journey with tips n tricks
Aws certified: the journey with tips n tricksAws certified: the journey with tips n tricks
Aws certified: the journey with tips n tricks
Antoni Tzavelas
 
HadoopIntroduction.pptx
HadoopIntroduction.pptxHadoopIntroduction.pptx
HadoopIntroduction.pptx
BalasundaramSr
 

Similar to Road to Cloudera certification (20)

Cloudera training: secure your Cloudera cluster
Cloudera training: secure your Cloudera clusterCloudera training: secure your Cloudera cluster
Cloudera training: secure your Cloudera cluster
 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 

 
Data Engineering Course Syllabus - WeCloudData
Data Engineering Course Syllabus - WeCloudDataData Engineering Course Syllabus - WeCloudData
Data Engineering Course Syllabus - WeCloudData
 
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ DevicesDelivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
 
Best Practices For Workflow
Best Practices For WorkflowBest Practices For Workflow
Best Practices For Workflow
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

 
DevOps and Decoys How to Build a Successful Microsoft DevOps Including the Data
DevOps and Decoys  How to Build a Successful Microsoft DevOps Including the DataDevOps and Decoys  How to Build a Successful Microsoft DevOps Including the Data
DevOps and Decoys How to Build a Successful Microsoft DevOps Including the Data
 
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science WorkbenchNOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
 
Introduction to Cloudera Search Training
Introduction to Cloudera Search TrainingIntroduction to Cloudera Search Training
Introduction to Cloudera Search Training
 
Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...
 
Kafka for DBAs
Kafka for DBAsKafka for DBAs
Kafka for DBAs
 
Hadoop applicationarchitectures
Hadoop applicationarchitecturesHadoop applicationarchitectures
Hadoop applicationarchitectures
 
Databricks Partner Enablement Guide.pdf
Databricks Partner Enablement Guide.pdfDatabricks Partner Enablement Guide.pdf
Databricks Partner Enablement Guide.pdf
 
Large-Scale Data Science on Hadoop (Intel Big Data Day)
Large-Scale Data Science on Hadoop (Intel Big Data Day)Large-Scale Data Science on Hadoop (Intel Big Data Day)
Large-Scale Data Science on Hadoop (Intel Big Data Day)
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
 
PySpark Best Practices
PySpark Best PracticesPySpark Best Practices
PySpark Best Practices
 
Hadoop and Mapreduce Certification
Hadoop and Mapreduce CertificationHadoop and Mapreduce Certification
Hadoop and Mapreduce Certification
 
Cloudera data-analyst-training
Cloudera data-analyst-trainingCloudera data-analyst-training
Cloudera data-analyst-training
 
Aws certified: the journey with tips n tricks
Aws certified: the journey with tips n tricksAws certified: the journey with tips n tricks
Aws certified: the journey with tips n tricks
 
HadoopIntroduction.pptx
HadoopIntroduction.pptxHadoopIntroduction.pptx
HadoopIntroduction.pptx
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
Cloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
Cloudera, Inc.
 

Road to Cloudera certification

  • 1. © Cloudera, Inc. All rights reserved. Road to Cloudera certification
  • 2. The demand for skills is high and Hadoop is the future.
      • Customers cannot afford to move slowly in staffing their Big Data projects.
      • Customers are building plans to ensure projects are staffed with skilled employees and supported by a qualified services provider.
      • Job Trends from Indeed.com
      • What are you most concerned about when it comes to your readiness for big data and Hadoop? (Cloudera MDP webinar poll results, July 2016)
  • 3. Why Cloudera training? Aligned to best practices and the pace of change
      1. Broadest range of courses: Learning paths for Developer, Admin, Analyst
      2. Most experienced instructors: More than 50,000 trained since 2009
      3. Leader in certification: Over 12,000 accredited Cloudera professionals
      4. Trusted source for training: 100,000+ people have attended online courses
      5. State of the art curriculum: Courses updated as Hadoop evolves
      6. Widest geographic coverage: Most classes offered, 50 cities worldwide plus online
      7. Most relevant platform & community: CDH deployed more than all other distributions combined
      8. Depth of training material: Hands-on labs and VMs support live instruction
      9. Ongoing learning: Video tutorials and e-learning complement training
      10. Commitment to big data education: University partnerships to teach Hadoop in colleges
  • 4. What is available from Cloudera University?
      • Private training: Course delivered at location of customer choice to internal audience
      • Public training: Courses regularly scheduled around the globe. Schedule available on web
      • Virtual training: Live training accessed via the internet; available for public and private courses
      • OnDemand training: Pre-recorded lecture with identical content/exercises as live training options
      • Certification: Rigorously developed and meaningful bodies of knowledge
      • Delivery formats: OnDemand, Virtual live classroom, Private onsite, Public live classroom
  • 5. Suggested Cloudera University curricula
      • Developers
          • Python/Scala Training
          • Developer for Spark and Hadoop
          • CCA: Spark and Hadoop Developer
          • Spark ML & Kafka modules
          • Topic specific training (Search, HBase)
          • Hands on practice
          • CCP: Data Engineer
      • Administrators
          • Cloudera Administration training
          • CCA: Administrator
      • Data Analysts/Data Scientists
          • Data Analyst: Using Hive, Pig & Impala
          • CCA: Data Analyst
          • Cloudera Data Science
  • 6. Let’s get certified!
  • 7. Certification Tiers
      • CCA (Cloudera Certified Associate)
          • Data Analyst, Admin and Spark & Hadoop Developer
          • Basic exam, but it's a complex subject area
          • Maps to curriculum
      • CCP (Cloudera Certified Professional)
          • Data Engineer
          • Combination of Developer, Analyst and Big Data services
          • Mastery level, beyond the introduction course
          • Real world experience
  • 8. Exam format: CCA and CCP certification
      • Not multiple choice
      • Hands on, practical exams similar to student exercises
      • Home based, no testing centres
      • Proctored through ExamsLocal.com
      • Webcam and desktop recorded and monitored
      • No papers / phone / drinks on desk / no talking
      • AWS Cloud-based cluster
      • Guacamole remote desktop in web browser
      • No Internet search during exam, only local documentation
  • 9. Sample CCA question
      • Instructions
          • Connect to the MySQL database on the cluster using Sqoop and import all of the data from the customer table into HDFS. The result must be in comma-delimited text format and put into the HDFS directory /user/cert/solution3
      • Data Description
          • A MySQL instance is running on the gateway node. In that instance, you will find a table that contains twenty-five million (25,000,000) rows of customer data.
          • MySQL database information:
              Installation: On the cluster node gateway
              Table name: customer
              Username: cloudera
              Password: cloudera
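One way to approach the sample question above is sketched below in Python, purely as a way of assembling the Sqoop command line. The host, table, credentials and target directory come from the slide; the database name is not given there, so `exampledb` is a hypothetical placeholder, and the helper `build_sqoop_import` is an illustrative name, not part of any Cloudera tooling.

```python
import shlex


def build_sqoop_import(host, database, table, username, password, target_dir):
    """Return the argument list for a Sqoop 1 import into comma-delimited text files."""
    return [
        "sqoop", "import",
        "--connect", f"jdbc:mysql://{host}/{database}",
        "--username", username,
        "--password", password,
        "--table", table,
        "--target-dir", target_dir,
        "--as-textfile",
        "--fields-terminated-by", ",",
    ]


args = build_sqoop_import(
    host="gateway",          # from the slide: MySQL runs on the gateway node
    database="exampledb",    # hypothetical: the slide does not name the database
    table="customer",
    username="cloudera",
    password="cloudera",
    target_dir="/user/cert/solution3",
)

# Print a shell-quoted command that could be pasted into a terminal.
print(shlex.join(args))
```

Building the command as a list keeps each flag next to its value, which makes it easier to check the spelling of the table name and target directory before running it.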
  • 10. Sample CCP Data Engineer question #1
      • Instructions
          • Dualcore Inc. is a leading electronics retailer. All of their customer data is in a relational database. Your task is to ingest all this data into their Hadoop cluster in the proper file format and compression for their needs.
          • Dualcore has a number of requirements for this data. It must be stored in a binary file format. They will keep this data for a minimum of ten years, so select a format that supports access from multiple programming languages and backward compatibility if the schema ever changes. They also require that the data be stored in a compressed format. The data is queried regularly, so choose a compression codec that is fastest for compression and decompression and included with CDH.
      • Data Description ...
  • 11. Sample CCP Data Engineer question #2
      • Instructions
          • LoudAcre Mobile is a mobile phone service provider that is moving a portion of their customer analytics workload to Hadoop. Before they can use their customer data, they want you to clean it and make it consistent. Errors were found while looking at the customer records. Unfortunately, different input methods wrote date fields in different formats. Your task is to standardize these date fields into a consistent format.
      • Data Description ...
          1943233 Chrisopher Rodrigez Jan 11, 1980
          8989022 John Birchall 6/7/1967
          2933321 Thomas Stewart 08/22/54
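The date clean-up described above can be sketched in plain Python as follows. This is only an illustration of the logic, not the exam's expected solution: the target format (ISO `YYYY-MM-DD`), the month/day/year ordering of the slash-separated dates, and the `PIVOT_YEAR` used to resolve two-digit years are all assumptions, since the slide elides the full data description.

```python
from datetime import datetime

# Input formats seen in the three sample records, tried in this order.
FORMATS = ("%b %d, %Y", "%m/%d/%Y", "%m/%d/%y")

# Assumption: two-digit years that parse to a year after this pivot
# belong to the previous century (these look like birth dates).
PIVOT_YEAR = 2016


def normalize_date(raw):
    """Return the date in ISO format (YYYY-MM-DD), or raise ValueError."""
    text = raw.strip()
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # not this format, try the next one
        if fmt.endswith("%y") and parsed.year > PIVOT_YEAR:
            parsed = parsed.replace(year=parsed.year - 100)
        return parsed.date().isoformat()
    raise ValueError(f"unrecognized date format: {raw!r}")


for sample in ("Jan 11, 1980", "6/7/1967", "08/22/54"):
    print(sample, "->", normalize_date(sample))
```

The two-digit year case is the gotcha here: Python's `%y` maps 54 to 2054, so without the pivot correction the third record would land in the future.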
  • 12. How to Study for CCA and CCP certification
      • Set aside 2 to 3 days of dedicated study time for certification
      • These certification tests are not easy
      • Review the certification webpage study points
      • Only study using the open-book documentation linked from the certification page
      • No Google, Cloudera training material, or favourite tutorial
      • Practice with the CDH and Spark software versions found in the test
      • Be familiar with the Hive and Impala shells, the basic Linux shell and the Hue UI
  • 13. Practice all of the study points
      • Stop when you are confident you know the topic from having practised it
      • Ensure you know the syntax and have experienced the gotchas
      • Read all the documentation concerned with the study topic
      • Know the documented examples so you have a go-to source for copy/paste
      • Know where to look up parameters, config and API docs
      • Be able to adapt to different scenarios or link topics together
      • Questions have multiple parts and dependencies
  • 14. Taking the exam
      • CCA Data Analyst and Developer: 2 hours, 9 questions - 13 mins per question
      • CCA Admin: 2 hours, 10 questions - 12 mins per question
      • CCP Engineer: 4 hours, 7 questions - 34 mins per question
      • Some questions are done in 5 mins; some take 20+ or 45+ mins
      • Questions are weighted in value and can have multiple parts
      • Risk of running out of time, which means:
          • Can’t complete the easy questions to pass
          • Can’t check your answers to fix any problems to pass
      • Stop any question after 20 mins and come back at the end
      • Skip any question that looks too hard after a quick skim read and come back
      • Finished? Always double check your answers
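The per-question time budgets quoted above follow from simple division, sketched here in Python. The exam durations and question counts are taken from the slide; the dictionary and function names are illustrative only.

```python
# (total minutes, number of questions) per exam, as stated on the slide.
EXAMS = {
    "CCA Data Analyst / Developer": (120, 9),
    "CCA Admin": (120, 10),
    "CCP Data Engineer": (240, 7),
}


def minutes_per_question(total_minutes, questions):
    """Average whole minutes available per question."""
    return total_minutes // questions


budgets = {
    name: minutes_per_question(total, count)
    for name, (total, count) in EXAMS.items()
}

for name, minutes in budgets.items():
    print(f"{name}: about {minutes} mins per question")
```

These are averages only; since questions are weighted and vary widely in length, the budget is a guide for when to move on rather than a fixed limit.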
  • 15. Common certification exam problems
      • Review the certification FAQ for common problems and for questions marked as wrong
          • https://www.cloudera.com/more/training/certification/faq.html
      • Remote desktop or network too slow!
          • Take the exam at off-peak times. Use the command-line shell, not the Hue GUI.
      • Unfamiliar with the question's topic: time is wasted reading docs during the exam. Study!
      • Don’t use localhost; use the correct gateway/master/worker hostname instead
      • Rushing and stress cause mistakes:
          • Did you misinterpret what the question asked?
          • Are directory/file/property/column names spelled correctly?
          • Is the output data format 100% correct? Check that column order, data types and null values are what was asked. Don’t assume.
          • Did you notice any errors in the logs or console when running? Scroll back and check!
  • 16. Tips for studying CCA Admin
      • Know the Cloudera Manager UI and how to search properties
          • Breadcrumbs, instances, safety valve advanced settings
          • Don't forget to apply settings or restart the service, and don’t break the cluster!
      • Practice topics not in the admin course but in the exam:
          • Sentry setup, load balancer, log redaction and encrypted zones
      • Practice all the hdfs dfs and dfsadmin commands
      • Practice setting up services and service instances
      • Practice troubleshooting and fixing common problem applications
      • Know your way around the different log files
  • 17. Tips for studying Data Analyst certification
      • Study how to use regex to manipulate strings well
      • SQL subqueries need a temporary table name (alias); don’t forget it
      • Understand the relationship between the Sqoop warehouse dir and target dir
      • Practice using Sqoop help to quickly view and use parameters
      • Practice window analytic functions; they are not easy to do
      • Practice type conversions for Hive and Impala
      • Practice how to create partitioned/bucketed tables; there is lots of syntax
      • Copy and paste directly from the question to quickly create the table
      • Practice using the command line: beeline and impala shell
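Two of the tips above, naming the subquery and practising window functions, can be rehearsed together in one query. The sketch below uses Python's built-in sqlite3 module as a stand-in engine, which is an assumption made so the example runs anywhere; it needs an SQLite build with window-function support (3.25 or later), the `orders` table and its rows are made up, and Hive or Impala syntax may differ in details. SQLite is lenient about the subquery alias, so the alias `ranked` is written out here only to mirror the habit the slide recommends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 30), ("ann", 50), ("bob", 20), ("bob", 70), ("bob", 10)],
)

# The inner query ranks each customer's orders by amount; the outer query
# keeps the largest one. Note the alias "ranked" after the closing parenthesis.
query = """
SELECT customer, amount
FROM (
    SELECT customer,
           amount,
           RANK() OVER (PARTITION BY customer ORDER BY amount DESC) AS rnk
    FROM orders
) ranked
WHERE rnk = 1
ORDER BY customer
"""

top_orders = conn.execute(query).fetchall()
print(top_orders)
```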
  • 18. Tips for studying CCA Spark and Hadoop
      • No need to be an expert in Scala or Python coding
          • Only Spark knowledge is tested
      • Practice Sqoop, the hdfs dfs command line and your SQL
      • Certification has not yet been updated to Spark 2.0 (it uses 1.6)
          • New students may not be familiar with Spark 1.6. The differences are minor.
          • Read and practice using the Spark documentation
          • Start the 1.6 Spark shell with pyspark and spark-shell, not spark2-shell or pyspark2
  • 19. Tips for studying CCP Data Engineer
      • Study non-core topics found outside the training course material
      • Ignore what is not Cloudera supported
      • Oozie features in one third of the test!
      • See the gethue.com website for short Oozie UI tutorials
      • How to get Oozie to run on your small default cluster:
          • Adjust container memory so you can run multiple containers
          • Increase the NodeManager max container size to 7 GB
          • Limit the container memory max size to 3 GB and 1 CPU
          • Result on a cluster of 3 dual-core 8 GB worker nodes: 6 containers
      • Currently Spark 1.6, not Spark 2.0 (will be updated in the future)
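The "6 containers" figure above can be reproduced with a small calculation, sketched here in Python. It assumes every container is sized at the 3 GB / 1 CPU maximum and that the 7 GB figure is the memory each worker makes available to containers; the function and parameter names are illustrative and are not YARN property names.

```python
def containers_per_cluster(node_memory_gb, container_memory_gb,
                           node_cores, container_cores, worker_nodes):
    """Estimate how many maximum-size containers fit across the cluster."""
    by_memory = node_memory_gb // container_memory_gb  # limit from memory
    by_cpu = node_cores // container_cores             # limit from CPU cores
    return min(by_memory, by_cpu) * worker_nodes


total = containers_per_cluster(
    node_memory_gb=7,       # memory per worker available to containers
    container_memory_gb=3,  # maximum memory per container
    node_cores=2,           # dual-core workers
    container_cores=1,      # one CPU per container
    worker_nodes=3,
)
print(total)
```

Both limits allow two containers per node here, giving six across three workers, which leaves room for an Oozie launcher and the job it starts to run at the same time.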
  • 20. Qualify for free certification
      • Take part in a Data Analyst, Developer or Administrator public class to receive a free certification exam in the given discipline
      • Valid till the end of April
  • 21. Thank you