Apache Hadoop and Amazon EMR Course Details
APACHE HADOOP AND AMAZON EMR COURSE
Introduction to Hadoop
• Distributed computing, cloud computing
• Big data basics and the need for parallel processing
• How Hadoop works
• Introduction to HDFS and MapReduce
Hadoop Architecture Details
• NameNode
• DataNode
• Secondary NameNode
• JobTracker
• TaskTracker
HDFS (Hadoop Distributed File System)
• Hadoop Distributed File System, background, GFS
• Data Replication
• Data Storage
• Data Retrieval
• Additional HDFS commands
MapReduce Programming
• MapReduce, Background
• Writing MapReduce Programs
• Writable and WritableComparable
• Input Format, Output Format
• Input Split and Block size
• Combiner
• Partitioner
• Number of Mappers and Reducers
• Counters
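As a small illustration of the Partitioner topic: Hadoop's default hash partitioner masks the key's hash code to a non-negative value and takes it modulo the number of reduce tasks, so every record with the same key lands on the same reducer. The sketch below mirrors that idea in Python. It is not Hadoop code: `partition_for` and `partition_pairs` are hypothetical names, and CRC32 is used here only as a stand-in for the key's hash code.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def partition_for(key: str, num_reducers: int) -> int:
    """Pick a reducer index for a key (hypothetical helper, not a Hadoop API)."""
    if num_reducers < 1:
        raise ValueError("num_reducers must be at least 1")
    # Assumption: CRC32 stands in for the key's hash code in this sketch.
    stable_hash = zlib.crc32(key.encode("utf-8"))
    # Mask to a non-negative value, then take the remainder.
    return (stable_hash & 0x7FFFFFFF) % num_reducers


def partition_pairs(
    pairs: Iterable[Tuple[str, int]], num_reducers: int
) -> Dict[int, List[Tuple[str, int]]]:
    """Group (key, value) pairs by the reducer index they would be sent to."""
    buckets: Dict[int, List[Tuple[str, int]]] = defaultdict(list)
    for key, value in pairs:
        buckets[partition_for(key, num_reducers)].append((key, value))
    return dict(buckets)
```

Because the reducer index depends only on the key, changing the number of reducers changes where a key goes, but never splits one key across two reducers.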
MapReduce Algorithms and Exercises
• Word Count
• Distributed grep
• Sorting data
• Log file analysis
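To give a feel for the Word Count exercise, here is a minimal in-memory sketch of the three MapReduce phases (map, shuffle, reduce) in plain Python. It runs on a single machine and only models the data flow; the function names are illustrative assumptions and not part of any Hadoop API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit (word, 1) for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: collect all values that share a key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce phase: sum the counts for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    return dict(reduce_counts(w, c) for w, c in shuffle(pairs).items())
```

For example, `word_count(["a b a", "b c"])` groups the emitted pairs by word and sums them per key.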
Hadoop Streaming
• How Streaming works
• Writing MapReduce programs in other languages
• Amazon EMR-based exercise session
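Hadoop Streaming lets any program act as a mapper or reducer by reading lines from standard input and writing tab-separated key/value lines to standard output; the reducer receives its input sorted by key. The sketch below follows that convention for a simple log-level count. The log format (each line starting with a level such as INFO or ERROR) and the names `mapper`, `reducer`, and `run` are assumptions made for illustration.

```python
import sys
from itertools import groupby
from typing import Iterable, Iterator, TextIO


def mapper(lines: Iterable[str]) -> Iterator[str]:
    """Emit "<level>\t1" for each non-empty line (assumed log format)."""
    for line in lines:
        fields = line.split()
        if fields:
            yield f"{fields[0]}\t1"


def reducer(sorted_lines: Iterable[str]) -> Iterator[str]:
    """Sum the counts per key; input must already be sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{key}\t{total}"


def run(role: str, source: TextIO = sys.stdin, sink: TextIO = sys.stdout) -> None:
    """Stream records through the chosen step, one output line per record."""
    step = mapper if role == "map" else reducer
    for record in step(source):
        sink.write(record + "\n")
```

The job can be simulated locally with `list(reducer(sorted(mapper(lines))))`, where the `sorted` call plays the role of the framework's sort between the map and reduce steps.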
Introduction to Amazon Elastic MapReduce (AWS EMR)
• Hadoop using Amazon Web Services
• AWS EMR and EC2
• AWS EMR architecture
• Multiple cluster deployment using AWS S3
Hadoop Ecosystem and Other Related Projects
• Hive Introduction
• Hive Installation
• Hive Exercises and Samples
• Pig Installation
• Pig Scripts execution
• HBase Installation
• HBase Exercises
• Sqoop Installation
• Using Sqoop for RDBMS-to-HDFS data flow
Hadoop Deployment
• Basic Hadoop deployment techniques
• Directory Layout and component details
• Networking challenges in Hadoop Deployment
• Disaster Recovery (DR) in Hadoop
Hadoop Cluster Configuration and Monitoring
• Master/Slave configuration
• Important directories
• Small, medium and large cluster considerations
• Hadoop monitoring: Ganglia, Nagios
Hadoop Business Case
• Why Hadoop is NOT a silver bullet for all your problems
• When to use Hadoop: business cases
• When NOT to use Hadoop: business cases
Hadoop and Cloud Computing
• Using Cloud technologies for distributed processing
• Hadoop on Amazon Web Services
• Hadoop in Oracle Cloud / Rackspace
===========================================
HADOOP AND AWS EXERCISES:
• Hadoop virtual machine setup.
• Configuring Hadoop as a single-node cluster.
• Loading and unloading data in HDFS.
• MapReduce programs: WordCount, Grep, Sort, etc.
• Amazon EMR programs: Hadoop Streaming.
• Process and metrics analysis for Hadoop output.
• Apache Pig installation and script execution.
• Hadoop and Flume examples.
• HiveQL commands and scripts.
• HBase installation and samples.
• Many more examples, exercises, and assignments.
=====================================================
IF YOU ARE INTERESTED IN THESE COURSES, EMAIL
GEEKTRAININGS@GMAIL.COM OR GEEKTRAININGS@MAIL.COM,
OR CALL US AT +918884977541 OR +919739313183.
For more details, visit our blog: http://geektrainings.blogspot.in//
