Hadoop online trainings


Geek Trainings, founded by a team of trainers and HR specialists, is a pioneer in technology training in India. It has a proven track record of delivering corporate, classroom, and online training, led by skilled professional trainers who work across the ever-expanding field of Information Technology (IT).

Published in: Education, Technology, Business


Apache Hadoop and Amazon EMR Course Details

Introduction to Hadoop
  • Distributed computing, cloud computing
  • Big data basics and the need for parallel processing
  • How Hadoop works
  • Introduction to HDFS and MapReduce

Hadoop Architecture Details
  • Name Node
  • Data Node
  • Secondary Name Node
  • Job Tracker
  • Task Tracker

HDFS (Hadoop Distributed File System)
  • Hadoop Distributed File System, background, GFS
  • Data replication
  • Data storage
  • Data retrieval
  • Additional HDFS commands

MapReduce Programming
  • MapReduce, background
  • Writing MapReduce programs
  • Writable and WritableComparable
  • Input Format, Output Format
  • Input split and block size
  • Combiner
  • Partitioner
  • Number of mappers and reducers
  • Counters

MapReduce Algorithms and Exercises
  • Word Count
  • Distributed grep
  • Sorting data
  • Log file analysis

Hadoop Streaming
  • How Streaming works
  • Writing MapReduce programs in other languages
  • Amazon MR based exercise session

Introduction to Amazon MapReduce (AWS EMR)
  • Hadoop using Amazon Web Services
  • AWS MapReduce and EC2
  • AWS-MR architecture
  • Multiple cluster deployment using AWS S3

Hadoop Ecosystem and Other Related Projects
  • Hive introduction
  • Hive installation
  • Hive exercises and samples
  • Pig installation
  • Pig script execution
  • HBase installation
  • HBase exercises
  • Sqoop installation
  • Using Sqoop for RDBMS to HDFS data flow

Hadoop Deployment
  • Basic Hadoop deployment techniques
  • Directory layout and component details
  • Networking challenges in Hadoop deployment
  • Disaster recovery (DR) in Hadoop

Hadoop Cluster Configuration and Monitoring
  • Master/slave configuration
  • Important directories
  • Small, medium, and large cluster considerations
  • Hadoop monitoring with Ganglia and Nagios

Hadoop Business Case
  • Why Hadoop is not a silver bullet for all your problems
  • When to use Hadoop: business cases
  • When not to use Hadoop: business cases

Hadoop and Cloud Computing
  • Using cloud technologies for distributed processing
  • Hadoop on Amazon Web Services
  • Hadoop in Oracle Cloud / Rackspace

Hadoop and AWS Exercises
  • Hadoop virtual machine setup
  • Configuring Hadoop in a single cluster
  • Loading and unloading data in the distributed HDFS system
  • MapReduce programs: Word Count, grep, sort, etc.
  • Amazon MapReduce programs using Hadoop Streaming
  • Process and metrics analysis for Hadoop output
  • Apache Pig installation and script execution
  • Hadoop and Flume examples
  • HiveQL commands and scripts
  • HBase installation and samples
  • Many more examples, exercises, and assignments
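
The Word Count exercise named in the outline can be illustrated with a small local sketch. This is not course material: it is a minimal Python simulation of the map, shuffle/sort, and reduce phases that runs without a Hadoop cluster. The function names `mapper`, `reducer`, and `word_count` are hypothetical, and the in-memory `sorted` call only stands in for the shuffle step that Hadoop performs between the two phases.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce phase: sum the counts for each word.

    Expects pairs already sorted by word, as they would be after the shuffle.
    """
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run map, a local stand-in for shuffle/sort, and reduce in one process."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    sample = [
        "Hadoop stores data in HDFS",
        "Hadoop processes data with MapReduce",
    ]
    for word, count in sorted(word_count(sample).items()):
        print(f"{word}\t{count}")
```

In a Hadoop Streaming job the mapper and reducer would typically be separate scripts that read from standard input and write tab-separated key/value lines to standard output; the sketch keeps them in one file only so the data flow is easy to follow.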