HADOOP COURSE CONTENT 
1. THE MOTIVATION FOR HADOOP
• Problems with traditional large-scale systems
• Requirements for a new approach
• Introducing Hadoop
2. HADOOP BASIC CONCEPTS
• The Hadoop project and Hadoop components
• The Hadoop Distributed File System (HDFS)
• Hands-on exercise: using HDFS
• How MapReduce works
• Hands-on exercise: running a MapReduce job
• How a Hadoop cluster operates
• Other Hadoop ecosystem projects
3. WRITING A MAPREDUCE PROGRAM
• The MapReduce flow
• Basic MapReduce API concepts
• Writing MapReduce drivers, mappers and reducers in Java
• Writing mappers and reducers in other languages using the Streaming API
• Speeding up Hadoop development by using Eclipse
• Hands-on exercise: writing a MapReduce program
• Differences between the old and new MapReduce APIs
4. UNIT TESTING MAPREDUCE PROGRAMS
• Unit testing
• The JUnit and MRUnit testing frameworks
• Writing unit tests with MRUnit
• Hands-on exercise: writing unit tests with the MRUnit framework
5. DELVING DEEPER INTO THE HADOOP API
• Using the ToolRunner class
• Decreasing the amount of intermediate data with combiners
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
INDIA Trainingicon USA 
Phone: +91-966-690-0051 Email: info@trainingicon.com | www.trainingicon.com Phone: +1-408-791-8864
• Hands-on exercise: writing and implementing combiners
• Setting up and tearing down mappers and reducers by using the configure and close methods
• Writing custom partitioners for better load balancing
• Hands-on exercise: writing a partitioner
• Accessing HDFS programmatically
• Using the distributed cache
• Using the Hadoop API's library of mappers, reducers and partitioners
6. PRACTICAL DEVELOPMENT TIPS AND TECHNIQUES
• Strategies for debugging MapReduce code
• Testing MapReduce code locally by using LocalJobRunner
• Writing and viewing log files
• Retrieving job information with counters
• Determining the optimal number of reducers for a job
• Creating map-only MapReduce jobs
• Hands-on exercise: using counters and a map-only job
7. DATA INPUT AND OUTPUT
• Creating custom Writable and WritableComparable implementations
• Saving binary data using SequenceFile and Avro data files
• Implementing custom input formats and output formats
• Issues to consider when using file compression
• Hands-on exercise: using SequenceFiles and file compression
8. COMMON MAPREDUCE ALGORITHMS
• Sorting and searching large data sets
• Performing a secondary sort
• Indexing data
• Hands-on exercise: creating an inverted index
• Computing term frequency-inverse document frequency (TF-IDF)
• Calculating word co-occurrence
• Hands-on exercise: calculating word co-occurrence
• Hands-on exercise: implementing word co-occurrence with a custom WritableComparable
9. JOINING DATA SETS IN MAPREDUCE JOBS
• Writing a map-side join
• Writing a reduce-side join
10. INTEGRATING HADOOP INTO THE ENTERPRISE WORKFLOW
• Integrating Hadoop into an existing enterprise
• Loading data from an RDBMS into HDFS by using Sqoop
• Hands-on exercise: importing data with Sqoop
• Managing real-time data using Flume
• Accessing HDFS from legacy systems with FuseDFS and HttpFS
11. MACHINE LEARNING AND MAHOUT
• Introduction to machine learning
• Using Mahout
• Hands-on exercise: using a Mahout recommender
12. AN INTRODUCTION TO HIVE AND PIG
• The motivation for Hive and Pig
• Hive basics
• Hands-on exercise: manipulating data with Hive
• Pig basics
• Hands-on exercise: using Pig to retrieve movie names from our recommender
• Choosing between Hive and Pig
• Introduction to Oozie
• Creating Oozie workflows
• Hands-on exercise: running an Oozie workflow
CONCLUSION 
APPENDIX: GRAPH PROCESSING IN MAPREDUCE AND AN INTRODUCTION TO OOZIE
