Hadoop Course content
Hardware Requirements :-- Systems must have at least 2 GB of RAM.
Software Requirements :-- I will provide all software (including the operating system).
Contents:--
VirtualBox/VMware
a. Basics
b. Installations
c. Backups
d. Snapshots
Linux
a. Basics
b. Installations
c. Commands
Hadoop
a. Why Hadoop?
b. Scaling
c. Distributed Framework
d. Hadoop vs. RDBMS
e. Brief history of Hadoop
Setting up Hadoop
a. Pseudo-distributed mode
b. Cluster mode
c. IPv6
d. SSH, keygen
e. Installation of Java, Hadoop
f. Configuration of Hadoop
g. Hadoop Processes (NN, SNN, JT, DN, TT)
h. Temporary directory
i. UI
j. Common errors when running a Hadoop cluster, and their solutions
HDFS - Hadoop Distributed File System
a. HDFS Design and Architecture
b. HDFS Concepts
c. Interacting with HDFS using the command line
d. Interacting with HDFS using Java APIs
e. Dataflow
f. Blocks
g. Replica
Hadoop Processes
a. Name node
b. Secondary name node
c. Job tracker
d. Task tracker
e. Data node
MapReduce
a. Developing a MapReduce Application
b. Phases in the MapReduce Framework
c. MapReduce Input and Output Formats
d. Advanced Concepts
e. Sample Applications
f. Combiner
g. HAR (Hadoop Archives)
Joining datasets in MapReduce jobs
a. Map-side join
b. Reduce-side join
Cluster Configuration
MapReduce – customization
a. Custom Input format class
b. Hash Partitioner
c. Custom Partitioner
d. Sorting techniques
e. Custom Output format class
Hadoop Programming Languages :--
Pig
a. Introduction
b. Installation and Configuration
c. Interacting with HDFS using Pig
d. MapReduce programs through Pig
e. Pig Commands
f. Loading, Filtering, Grouping
g. Data types, Operators
h. Joins, Groups
i. Sample programs in Pig
Hive
a. Basics
b. Installation and Configuration
c. Commands
NoSQL Database Concepts
Databases :--
HBase
a. Basics
b. Installation and Configuration
c. Commands
MongoDB
a. Basics
b. Installation and Configuration
c. Commands
Specialities :--
ETL tool (PDI) (Data Warehousing and BI Tools)
a. Introduction
b. Creating an RDBMS database
c. Establishing a connection between PDI and an RDBMS database
d. Creating data in Hadoop
e. Establishing a connection between PDI and Hadoop data
f. Summarization