Course Objective Summary
During this course, you will learn:
• Introduction to Big Data and Hadoop
• Hadoop ecosystem concepts
• Hadoop MapReduce concepts and features
• Developing MapReduce applications
• Pig concepts
• Hive concepts
• Oozie workflow concepts
• Flume concepts
• Hue concepts
• HBase concepts
• Real-life use cases
VirtualBox/VMware
• Basics
• Installations
• Backups
• Snapshots
Linux
• Basics
• Installations
• Commands
Hadoop
• Why Hadoop?
• Scaling
• Distributed Framework
• Hadoop vs. RDBMS
• Brief history of Hadoop
MapReduce – Customization
• Custom Input format class
• Hash Partitioner
• Custom Partitioner
• Sorting techniques
• Custom Output format class
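As a rough illustration of the partitioner topics above, here is a minimal sketch in plain Python (not the Hadoop Java API). It assumes string keys and borrows the Java `String.hashCode` recipe as a stand-in hash; the function names and the first-letter routing rule are hypothetical, chosen only to contrast a hash partitioner with a custom one.

```python
def java_style_hash(key: str) -> int:
    """Stand-in hash: the 31*h + char recipe, kept to 32 bits (an assumption)."""
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def hash_partition(key: str, num_reducers: int) -> int:
    """Hash partitioner: mask off the sign bit, then take the remainder."""
    return (java_style_hash(key) & 0x7FFFFFFF) % num_reducers


def first_letter_partition(key: str, num_reducers: int) -> int:
    """Hypothetical custom partitioner: keys starting with a-m go to
    partition 0, everything else to partition 1 (wrapped to the reducer count)."""
    first = key[:1].lower()
    bucket = 0 if "a" <= first <= "m" else 1
    return bucket % num_reducers


if __name__ == "__main__":
    for k in ["apple", "hadoop", "pig", "zookeeper"]:
        print(k, hash_partition(k, 4), first_letter_partition(k, 2))
```

The hash version spreads keys without regard to their meaning, while the custom version lets the job control which reducer sees which keys, at the risk of uneven load.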
Hadoop Programming Languages:
I) Hive
• Introduction
• Installation and Configuration
• Interacting with HDFS using Hive
• MapReduce Programs through Hive
• Hive Commands
• Loading, Filtering, Grouping…
• Data types, Operators…
• Joins, Groups…
• Sample programs in Hive
II) Pig
• Basics
• Installation and Configurations
• Commands…
OVERVIEW HADOOP DEVELOPER
Introduction
The Motivation for Hadoop
• Problems with traditional large-scale systems
• Requirements for a new approach
• Hadoop: Basic Concepts
• Map-Side Join
• Reduce-Side Join
Introduction
• An Overview of Hadoop
• The Hadoop Distributed File System
• Hands-On Exercise
• How MapReduce Works
• Hands-On Exercise
• Anatomy of a Hadoop Cluster
• Other Hadoop Ecosystem Components
Writing a MapReduce Program
• The MapReduce Flow
• Examining a Sample MapReduce Program
• Basic MapReduce API Concepts
• The Driver Code
• The Mapper
• The Reducer
• Hadoop’s Streaming API
• Using Eclipse for Rapid Development
• Hands-On Exercise
• The New MapReduce API
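To give a feel for the flow listed above (driver, mapper, shuffle and sort, reducer), here is a minimal word-count sketch that simulates the stages in plain Python. It is not Hadoop code and does not use the Hadoop API; all function names are illustrative assumptions.

```python
from collections import defaultdict


def mapper(line):
    """Map stage: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle and sort stage: group values by key, keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reducer(key, values):
    """Reduce stage: sum the counts collected for one word."""
    return key, sum(values)


def run_job(lines):
    """Driver: wire the stages together, as a job driver would."""
    mapped = (pair for line in lines for pair in mapper(line))
    return [reducer(key, values) for key, values in shuffle(mapped)]


if __name__ == "__main__":
    print(run_job(["big data", "Big Hadoop"]))
```

In a real cluster the map and reduce stages run in parallel on different nodes and the framework performs the shuffle; here everything runs in one process purely to show the data flow.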
Common MapReduce Algorithms
• Sorting and Searching
• Indexing
• Machine Learning With Mahout
• Term Frequency – Inverse Document Frequency
• Word Co-Occurrence
• Hands-On Exercise
Pig Concepts
• Data loading in Pig
• Data Extraction in Pig
• Data Transformation in Pig
• Hands-on exercise on Pig
Hive Concepts
• Hive Query Language
• Alter and Delete in Hive
• Partition in Hive
• Indexing
• Joins in Hive
• Unions in Hive
• Industry-specific configuration of Hive parameters
• Authentication & Authorization
• Statistics with Hive
• Archiving in Hive
• Hands-On Exercise
Working with Sqoop
• Introduction
• Import Data
• Export Data
• Sqoop Syntax
• Database connections
• Hands-On Exercise