This document describes a training kit on big data and Hadoop. It covers topics such as HDFS, MapReduce, Hive, Pig, and Oozie. The training is aimed at everyone from CEOs and managers to developers, and helps attendees get Hadoop certified. It discusses the prerequisites for learning Hadoop, how Hadoop addresses big data problems, and how companies are using Hadoop, and it provides details about the curriculum, trainer profiles, and job roles that work with Hadoop.
Hadoop Training Kit from LCC Infotech
Big Data
Learn the basics of the Hadoop Distributed File System (HDFS) and the MapReduce framework, how to write programs against their APIs, and design techniques for larger workflows. This training also covers advanced skills for debugging MapReduce programs and optimizing their performance, and it introduces participants to related projects in the Hadoop distribution such as Hive, Pig, and Oozie.
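To give a feel for the MapReduce API covered in the training, here is the classic word-count job in Java, essentially the canonical example from the Hadoop documentation, trimmed and commented. Treat it as a minimal sketch rather than course material: the class names are illustrative, the input and output HDFS paths come from the command line, and it assumes a Hadoop 2.x client on the classpath.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in its input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the 1s that the shuffle grouped under each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation cuts shuffle traffic
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Packaged into a jar, a job like this would typically be launched with `hadoop jar wordcount.jar WordCount /input /output`.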
The course/training is designed for everyone from CEOs, CTOs, and managers to software architects, individual developers, and testers who want to enhance their skills in the Big Data world. You will learn when the use of Hadoop is appropriate, what problems Hadoop addresses, how Hadoop fits into your existing environment, and what you need to know about deploying Hadoop.
After completing the training, attendees can leverage our Hadoop Certification Exam Simulator, for both the Developer and the Administrator tracks, to clear the Hadoop certification. Since launch, more than 150 attendees have already cleared the exam with the help of our simulator.
Hadoop is one of the top job trends right now. Top MNCs such as IBM, Microsoft, Oracle, and Accenture have all incorporated Hadoop, and companies like Amazon, eBay, Yahoo, Hortonworks, and Facebook are looking for Hadoop professionals. Many companies cannot find enough IT professionals with Hadoop skills, and that scarcity spells high pay.
Do you have a query about the prerequisites for learning Hadoop?
There is no strict prerequisite to start learning Hadoop. However, if you want to become an expert in Hadoop and build an excellent career, you should have at least a basic knowledge of Java and Linux.
No idea about Java or Linux? Don't worry, you can still start learning Hadoop; the best way is to spend some time on Java and Linux in parallel, and we can train and help you on the basics of both. Knowing Java is an added advantage, but it is not strictly a prerequisite for working with or learning Hadoop. Tools like Hive and Pig that are built on top of Hadoop offer their own high-level languages for working with data on your cluster.
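As a quick illustration, a word-frequency query that would otherwise need a hand-written MapReduce job can be expressed in one line of HiveQL. The sketch below submits such a query from plain Java over HiveServer2's JDBC interface; the host, port, credentials, and the `words` table are assumptions made up for this example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQlSketch {
  public static void main(String[] args) throws Exception {
    // HiveServer2 JDBC endpoint; host, port, and database are placeholders.
    String url = "jdbc:hive2://localhost:10000/default";
    Class.forName("org.apache.hive.jdbc.HiveDriver"); // needs hive-jdbc on the classpath

    try (Connection conn = DriverManager.getConnection(url, "hive", "");
         Statement stmt = conn.createStatement();
         // One line of HiveQL instead of a mapper, a reducer, and a driver.
         ResultSet rs = stmt.executeQuery(
             "SELECT word, COUNT(*) AS freq FROM words GROUP BY word")) {
      while (rs.next()) {
        System.out.println(rs.getString("word") + "\t" + rs.getLong("freq"));
      }
    }
  }
}
```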
Industries Where Hadoop Is Being Used:
Energy & Utilities
Financial Services
Government
Healthcare & Life Sciences
Media & Entertainment
Retail
E-Commerce
Consumer Product
Technology
Telecommunications
Start-ups (they want each and every resource to have Hadoop knowledge)
Faculty profile:-
They have more than 14 years of training and consulting experience, with the intention of dramatically increasing profit, productivity, and the performance of people by building high-scale computing solutions.
They have been developing Hadoop-based technology for the past 5 years.
Our trainer's recent innovation, built while working as the principal architect, was recognized by Fast Company magazine as the most innovative healthcare big data platform in the world and was featured in the magazine.
Our trainer holds US patents related to healthcare big data technology.
They have architected, developed, and brought to market one of the most innovative electronic medical record systems in the US and India.
Why Hadoop?
Big Data is defined as high-volume, high-velocity, and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
A large share of the data extracted from sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few, is unstructured. All of this unstructured data is Big Data.
Organizations are discovering that important decisions can be made by sorting through and analyzing Big Data.
As most of this data is unstructured/unorganized, it must be formatted/structured in a way that makes it suitable for data mining and subsequent analysis.
Apache Hadoop is an open-source core platform for structuring/organizing Big Data, and it solves the problem of making that data useful for analytics.
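To make that structuring step concrete, here is a minimal sketch of a map-only Hadoop task that parses raw, Apache-style web-access-log lines into tab-separated (ip, timestamp, url) records that tools like Hive or Pig could then query. The log layout, the regular expression, and the class name are illustrative assumptions, not part of the course material.

```java
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Turns unstructured log lines into structured, tab-separated rows.
public class LogStructuringMapper
    extends Mapper<LongWritable, Text, NullWritable, Text> {

  // Assumed layout: ip - - [timestamp] "METHOD url PROTOCOL" status size
  private static final Pattern LOG =
      Pattern.compile("^(\\S+) \\S+ \\S+ \\[([^\\]]+)\\] \"\\S+ (\\S+)");

  private final Text out = new Text();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    Matcher m = LOG.matcher(value.toString());
    if (m.find()) {
      out.set(m.group(1) + "\t" + m.group(2) + "\t" + m.group(3));
      context.write(NullWritable.get(), out); // keep only well-formed rows
    }
    // Malformed lines are dropped here; a production job would count them.
  }
}
```

In the driver, calling `job.setNumReduceTasks(0)` makes this a map-only job, so the structured records are written straight back to HDFS.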
Course Curriculum:
Courses:- Hadoop Training:-
We provide Apache Hadoop training, and our training programs are highly interactive, hands-on, and scheduled to meet the demands of working professionals.
Training Highlights
Prepares you to be a Hadoop expert
Every student builds and plays with their own Hadoop cluster
Our multi-node clusters remain available for each student to practice on after the training
Evening classes for working professionals
Hands-on training with real-world Hadoop use cases
Audience profile:-
Excellent written and oral communication skills in English
Innovative and creative thinking and strong initiative
Minimum three years of related field work experience
1. Big Data and conventional approaches
What is Big Data?
Conventional approaches
Problems with conventional approaches
2. Hadoop introduction
Open source, developed and backed by top communities
Comparison of different Hadoop flavors
Use-cases
HDFS
Map-Reduce
3. HDFS concepts (see the client-API sketch after this outline)
Architecture
Distributed storage
High availability
Fault tolerance and reliable data storage
Scalability
4. Map-Reduce concepts
Architecture
High performance parallel data processing
Network and disk transfer optimization (Data Locality)
Scalability
Fault tolerance
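As a concrete taste of the programming model, here is the classic word-count job expressed
with Hadoop Streaming, which lets you use shell commands as the mapper and reducer. This
is a sketch only: the jar path varies by distribution, and the HDFS paths are placeholders.

    # Mappers run in parallel close to the data (data locality) and emit
    # one word per line; the framework sorts by key between the phases,
    # so a simple 'uniq -c' in the reducer produces per-word counts.
    hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
      -input  /user/student/books \
      -output /user/student/wordcount-out \
      -mapper 'tr -s "[:space:]" "\n"' \
      -reducer 'uniq -c'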
We present real-world, scenario-based training developed by the software architects and
builders of highly scalable solutions based on Apache Hadoop, with unmatched depth and
expertise, so you can be assured you are learning from the experts. We offer the following
courses, designed for software developers, architects and cluster administrators.
Example
Hadoop Administration Consultant:-
Should have 4+ years of IT experience and 1+ years of Hadoop Administration
experience
Hadoop cluster administration, including adding and removing cluster nodes
Recovering the NameNode
Importing and exporting data from HDFS (see the command sketch after this list)
Management of Hadoop log files
Hands-on Hadoop cluster maintenance, monitoring and troubleshooting
Experience in day-to-day production support of Hadoop infrastructure, such as HDFS
maintenance, backups, and managing and reviewing Hadoop log files
Hands-on experience on building large scale systems utilizing Big Data Technologies
Installation and configuration of Hadoop/HBase cluster
Linux/UNIX commands, Shell scripting, vi editor.
Willingness to learn new tools and technology.
Design and develop solutions using Hadoop to tackle big data, information retrieval, and
analytics problems.
Experience with Core Java and SQL
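For illustration, a few representative commands behind the tasks listed above; the host
names, paths and file names are hypothetical.

    # Importing and exporting data between HDFS and the local file system.
    hdfs dfs -copyFromLocal /var/exports/orders.csv /data/orders/
    hdfs dfs -copyToLocal /data/reports/daily.tsv /tmp/

    # Importing a relational table into HDFS with Sqoop.
    sqoop import --connect jdbc:mysql://dbhost/sales --table orders \
      --username etl -P --target-dir /data/orders_from_mysql

    # Removing (decommissioning) a DataNode: add it to the excludes file
    # referenced by the NameNode, then tell the NameNode to re-read it.
    hdfs dfsadmin -refreshNodes

    # Checkpointing NameNode metadata as part of a backup routine.
    hdfs dfsadmin -safemode enter
    hdfs dfsadmin -saveNamespace
    hdfs dfsadmin -safemode leave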
Hadoop Administration Consultant:- Job Description
Data loading and optimization (various formats) to Hadoop.
Management of Hadoop log files.
Recovery of nodes.
Maintenance of Hadoop configuration files.
Hadoop cluster maintenance, performance monitoring and troubleshooting
Linux/UNIX development.
Usage of tools such as Ambari, HCatalog, HBase, Oozie, Hive etc.
Use Teradata adapters for Hadoop to manage the Hadoop environment alongside the
Teradata relational database.
Interface with Teradata Engineering staff as needed to resolve complex technical issues.
Transfer knowledge and expertise to other Teradata professional services associates.
Provide technical expertise to data warehouse clients that will contribute to innovative
business solutions.
Sr. Hadoop Administrator:- Requirements
Strong understanding of SQL in accessing and manipulating data
Strong understanding of system architecture.
Understanding of Unix operating system
Strong understanding of general programming concepts
Advanced analytical and problem solving skills
Experience with the following software required:
Unix, shell scripting, Netezza, Hadoop/MapReduce infrastructure and associated
Apache projects, Oracle, MySQL, Java, HBase, Hive, Pig, Mahout, etc.
Experience with the following software desired:
Exposure to Business Intelligence tools (Tableau/Business Objects); application
development experience preferred; NoSQL database experience a plus
Manager - Projects - Hadoop Admin
Job Description:-
Candidate must have at least one year of experience installing and setting up a Hadoop
cluster in a production environment.
Must have knowledge of the various Hadoop distributions and the pros and cons of each.
Must have knowledge of implementing security features for a Hadoop cluster and an
understanding of distributed computing concepts.
Should be capable of installing, configuring and administering any of the major Linux
distributions.
Should have basic knowledge of setting up cron jobs and monitoring Oozie jobs (see the
example below).
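For example, a basic cron entry and two Oozie CLI checks; the script path, Oozie URL and
job ID are placeholders.

    # crontab -e: run a log cleanup script every day at 01:30.
    30 1 * * * /opt/scripts/hdfs_log_cleanup.sh >> /var/log/cleanup.log 2>&1

    # List running Oozie jobs, then inspect one of them.
    oozie jobs -oozie http://oozie-host:11000/oozie -filter status=RUNNING
    oozie job -oozie http://oozie-host:11000/oozie -info 0000123-200101000000001-oozie-W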
Team Lead:- Hadoop Stack
Job Description:
2+ years of hands-on experience with the Hadoop stack (Map Reduce
Programming Paradigm, HBase, Pig, Hive, Sqoop)
Experience with key-value store technologies such as Cassandra and document-based
stores like MongoDB would be a plus
2+ years of hands-on experience with some level of administration,
configuration management, monitoring, debugging, benchmarking and
performance tuning of Hadoop/Cassandra
4+ years of hands-on experience with open source software platforms and
languages (e.g. Java, Linux, Apache, Perl/Python/PHP)
Previous experience with RDBMS, SQL, database performance tuning, high scale
application handling is highly desirable
Hands-on experience writing MapReduce jobs, and scheduling and monitoring them
(see the sketch after this list)
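A sketch of that kind of scheduling and monitoring on a YARN-based cluster; the
application ID shown is a placeholder.

    # List running applications; fetch aggregated logs for a finished one.
    yarn application -list -appStates RUNNING
    yarn logs -applicationId application_1577836800000_0042

    # The MapReduce-specific view of running jobs.
    mapred job -list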
Desired Profile:
Good development experience in Hadoop
Good communication skills
Experience with optimizing performance of front-end applications
SEO best-practice experience
Understanding of data warehousing and business intelligence technologies
Experience with architecture and design of analytics platforms
Experience in AWS is a plus
Excellent problem solving and analytical skills
Hadoop Developers and Architects - Job Description
Hands-on experience in application development using Java/J2EE, Perl, JSP, XML
Experience with job/workflow scheduling and monitoring tools like Oozie and ZooKeeper
Programming languages like Python, Ruby and RoR; BI tools like Informatica, Pentaho, Talend
Hadoop Developer and Architect:- Position Summary:-
Our division's objective is architecting and implementing Hadoop systems. We provide a
highly available Hadoop cluster environment. We are currently looking for a developer, a
software architect (SA), a database administrator (DBA) and a network architect.
Developer for Hadoop
This position is responsible for the design and implementation of the Hadoop platform
and service. Tasks include developing and customizing Hadoop and related services.
A successful candidate must have experience designing and implementing a Hadoop
platform and service.
Software Architect for Hadoop Cluster
The Hadoop software architect designs and implements Hadoop systems using open
source and/or commercial Hadoop management software, as well as designing highly
scalable software applications for clients. A successful candidate must have a deep
understanding of distributed file systems, MapReduce, NoSQL (Cassandra, MongoDB,
CouchDB), scheduling, and programming languages (Java).
Desired Skills & Experience:-
A successful candidate must meet one or more of the following requirements:
MapReduce Data Programming Framework
Distributed File System
Data caching and data processing optimization skills
Distributed processing (Hive, Pig, Sqoop, ZooKeeper)
NoSQL (Cassandra, MongoDB, CouchDB)
Linux system administration (RedHat, Ubuntu, CentOS)
Shell programming in bash, ksh, perl, expect, php
Server/storage hardware (Dell, IBM, HP, EMC, NetApp, Cisco); must have rack-and-stack
experience
Storage management experience with EMC, Hitachi and NetApp, especially network
storage clustering experience: DRBD, NFS, iSCSI, SAN