Hadoop By SyEd
In Depth Training
In Depth Hadoop Training
Module I. Introduction to Big Data and Hadoop
* What is Big Data?
* What are the challenges for processing big data?
* What technologies support big data?
* The 3 V’s of Big Data, and growing
* What is Hadoop?
* Why Hadoop and its Use cases
* History of Hadoop
* Different Ecosystem Components of Hadoop
* Advantages and Disadvantages of Hadoop
* Real Life Use Cases
Module II. HDFS (Hadoop Distributed File System)
* HDFS architecture
* Features of HDFS
* Where does it fit and where doesn't it fit?
* HDFS daemons and their functionality
* NameNode and its functionality
* DataNode and its functionality
* Secondary NameNode and its functionality
* Data Storage in HDFS
* Introduction to Blocks
* Data replication
* Accessing HDFS
* CLI (Command Line Interface) and admin commands
* Java Based Approach
* Hadoop Administration
* Hadoop Configuration Files
* Configuring Hadoop Domains
* Precedence of Hadoop Configuration
* Diving into Hadoop Configuration
* Scheduler
* Rack Awareness
* Cluster Administration Utilities
* Rebalancing HDFS Data
* Copying large amounts of data from HDFS
* FSImage and edit log files, in theory and in practice
* HDFS Federation
* Multiple NameNodes
* Shared Storage
* JournalNode
* Namespaces
* Block Pools
Module III. MAPREDUCE - YARN
* MapReduce architecture
* JobTracker, TaskTracker, NodeManager, ResourceManager, ApplicationMaster and their functionality
* Job execution flow
* Configuring the development environment using Eclipse
* MapReduce Programming Model
* How to write a basic MapReduce job
* Running MapReduce jobs in local mode and distributed mode
* Different data types in MapReduce
* How to use Input Formatters and Output Formatters in MapReduce jobs
* Input formatters and their associated Record Readers with examples
* Text Input Formatter
* Key Value Text Input Formatter
* Sequence File Input Formatter
* How to write custom Input Formatters and their Record Readers
* Output formatters and their associated Record Writers with examples
* Text Output Formatter
* Sequence File Output Formatter
* How to write custom Output Formatters and their Record Writers
* How to write Combiners and Partitioners, and how to use them
* Importance of Distributed Cache
* Importance of Counters and how to use them
Module IV. Advanced MapReduce Programming – Programs shared as part of POCs
* Joins - Map Side and Reduce Side
* Use of Secondary Sorting
* Importance of the Writable and WritableComparable APIs
* How to write MapReduce Keys and Values
* Use of Compression techniques
* Snappy, LZO and Zip
* How to debug MapReduce jobs in Local and Pseudo-Distributed Mode
* Introduction to MapReduce Streaming and Pipes with examples
* Job Submission
* Job Initialization
* Task Assignment
* Task Execution
* Progress and status bar
* Job Completion
* Failures
* Task Failure
* TaskTracker Failure
* JobTracker Failure
* NodeManager Failure
* ResourceManager Failure
* ApplicationMaster Failure
* Job Scheduling
* Shuffle & Sort in depth
* Diving into Shuffle and Sort
* Dive into Input Splits
* Dive into Buffer Concepts
* Dive into Configuration Tuning
* Dive into Task Execution
* The Task Assignment Environment
* Speculative Execution
* Output Committers
* Task JVM Reuse
* Multiple Inputs & Multiple Outputs
* Built-in Counters
* Dive into Counters – Job Counters & User Defined Counters
* SQL operations using Java MapReduce
* Introduction to YARN (Next Generation MapReduce)
Module V. Apache HIVE
* Hive Introduction
* Hive architecture
* Driver
* Compiler
* Semantic Analyzer
* Hive Integration with Hadoop
* Hive Query Language (HiveQL)
* SQL vs HiveQL
* Hive Installation and Configuration
* Hive, MapReduce and Local Mode
* Hive DDL and DML Operations
* Hive Services
* CLI
* Schema Design
* Views
* Indexes
* HiveServer
* Metastore
* Embedded metastore configuration
* External metastore configuration
* Transformations in Hive
* UDFs in Hive
* How to write simple Hive queries
* Usage
* Tuning
* Hive with HBase Integration
* DML & DDL Operations
* HiveServer2
* Beeline
Module VI. Apache PIG
* Introduction to Apache Pig
* MapReduce vs Apache Pig
* SQL vs Apache Pig
* Different data types in Pig
* Modes of Execution in Pig
* Local Mode
* MapReduce Mode
* Execution Mechanism
* Grunt Shell
* Script
* Embedded
* Transformations in Pig
* How to write a simple Pig script
* UDFs in Pig
* Pig with HBase Integration
Module VII. Apache SQOOP
* Introduction to Sqoop
* MySQL client and Server Installation
* How to connect to a Relational Database using Sqoop
* Sqoop commands, with examples of Import and Export
* Transferring an Entire Table
* Specifying a Target Directory
* Importing only a Subset of Data
* Protecting your Password
* Using a File Format other than CSV
* Compressing Imported Data
* Speeding up Transfers
* Overriding Type Mapping
* Controlling Parallelism
* Encoding Null Values
* Importing all your Tables
* Incremental Import
* Importing only New Data
* Incrementally Importing Mutable Data
* Preserving the Last Imported Value
* Storing Passwords in the Metastore
* Overriding Arguments to a Saved Job
* Sharing the Metastore between Sqoop Clients
* Importing Data from Two Tables
* Using Custom Boundary Queries
* Renaming Sqoop Job Instances
* Importing Queries with Duplicate Columns
* Transferring Data from Hadoop
* Inserting Data in Batches
* Updating or Inserting at the same time
* Exporting Corrupted Data
Module VIII. Apache FLUME
* Introduction to Flume
* Flume agent usage
Module IX. Apache HBase
* HBase introduction
* HBase basics
* Categories of NoSQL Databases
- Key-Value Database
- Document Database
- Column Family Database
- Graph Database
* Column families
* Scans
* HBase installation
* HBase Architecture
* Storage
* Write-Ahead Log
* MapReduce integration
* MapReduce over HBase
* HBase Usage
* Key design
* Bloom Filters
* Versioning
* Filters
* HBase Clients
* REST
* Thrift
* Hive with HBase Integration
* Web Based UI
* HBase Admin
* Schema definition
* Basic CRUD operations
* CRUD Operations using the Java API
* ZooKeeper
Module X. Apache OOZIE
* Introduction to Oozie
* Executing workflow jobs
Module XI. HUE
* Installation and Browsing HDFS Data
Module XII. Hadoop installation on Linux, and installation of all other ecosystem components on Linux.
Module XIII. Cluster setup (200-node cluster): knowledge sharing with setup document.
Module XIV. Cloudera & Hortonworks
Module XV. Real Time Use Cases & Project Discussion
Big Data - Hadoop Overview and In Depth Sessions on Hadoop Architecture: 4 Hours
Hadoop Installation: 1 Hour
Unix Commands, HDFS Commands: 1 Hour
MapReduce: 8 Hours
Hive: 3 Hours
Pig: 3 Hours
Sqoop: 2 Hours
HBase and ZooKeeper: 4 Hours
Flume: 1 Hour
Oozie: 1 Hour
Hue: 1 Hour
Project Knowledge Sharing: 1 Hour
POC Programs Sharing: Programs Sharing
Total Hours: 30 Hours
Duration: 4 Weeks

Hadoop course content Syed Academy
