Hadoop

CONTENT
 Introduction
 What is Hadoop?
 Hadoop Applications
 Hadoop Architecture
 Importance
 Advantages
 Disadvantages
 When to use Hadoop?
 Reference

 Hadoop is an Apache open-source
framework, written in Java, that allows
distributed processing of large datasets
across clusters of computers using simple
programming models.
 An application built on the Hadoop
framework runs in an environment that
provides distributed storage and
computation across clusters of computers.
INTRODUCTION

 Hadoop began as a sub-project of Lucene (a
collection of industrial-strength search tools)
and is now a top-level project of the Apache
Software Foundation.
 Hadoop parallelizes data processing across
many nodes (computers) in a compute
cluster, speeding up large computations and
hiding I/O latency through increased
concurrency.
WHAT IS HADOOP?
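
The idea of splitting work across nodes and merging the partial results can be sketched in plain Python. This is only an illustration, not Hadoop's API: local threads stand in for cluster nodes, and the names count_words and parallel_word_count are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Count the words in one chunk of the input (the work done on one 'node')."""
    return Counter(chunk.lower().split())


def parallel_word_count(chunks, workers=4):
    """Process every chunk concurrently, then merge the partial counts."""
    # Assumption: worker threads stand in for the nodes of a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # add each partial count into the final result
    return total


# Made-up input, already split into three chunks.
chunks = [
    "hadoop stores data in blocks",
    "hadoop processes data in parallel",
    "blocks are spread across nodes",
]
counts = parallel_word_count(chunks)
print(counts.most_common(3))
```

Each chunk is processed independently, so adding more workers (or, in a real cluster, more nodes) lets more chunks be handled at the same time.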

 Making Hadoop Applications More Widely
Accessible
 A Graphical Abstraction Layer on Top of
Hadoop Applications
HADOOP APPLICATIONS

 Ability to store and process huge amounts of
any kind of data, quickly
 Computing power
 Fault tolerance
 Flexibility
 Low cost
 Scalability
WHY IS HADOOP IMPORTANT?

 Scalable
 Cost effective
 Flexible
 Fast
 Resilient to failure
ADVANTAGES OF HADOOP

 Security Concerns
 Not Fit for Small Data
 Potential Stability Issues
 General Limitations
DISADVANTAGES

 Hadoop Common (formerly Hadoop Core)
 Hadoop MapReduce
 Hadoop YARN (cluster resource management, introduced with MapReduce 2.0)
 Hadoop Distributed File System (HDFS)
“CORE” HADOOP

 Ambari, ZooKeeper (managing & monitoring)
 HBase, Cassandra (databases)
 Hive, Pig (data warehouse and query language)
 Mahout (machine learning)
 Chukwa, Avro, Oozie, Giraph, and many more
THE WIDER HADOOP ECOSYSTEM

 Whenever “standard tools” no longer work
because of sheer data size
(rule of thumb: if your data fits on a regular hard
drive, you’re better off sticking to
Python/SQL/Bash/etc.!)
 Aggregation across large data sets: use the power
of Reducers!
 Large-scale ETL operations (extract, transform,
load)
WHEN TO USE HADOOP?
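
How Reducers aggregate a large data set can be sketched as three small steps: map each record to a key/value pair, group the values by key (the shuffle), and reduce each group to one result. The sketch below runs in a single Python process and is not Hadoop code; the function names and the sales records are made up for illustration.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit one (key, value) pair per input record."""
    for region, amount in records:
        yield region, amount


def shuffle_phase(pairs):
    """Shuffle: collect all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical sales records as (region, amount) pairs.
records = [
    ("north", 120),
    ("south", 80),
    ("north", 30),
    ("east", 50),
    ("south", 20),
]
totals = reduce_phase(shuffle_phase(map_phase(records)))
print(totals)
```

In a real cluster the map and reduce steps run on many nodes at once and the framework performs the shuffle between them; only the per-record and per-key logic has to be written by the user.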

REFERENCE
 www.google.com
 www.wikipedia.com
 www.studymafia.org
 www.projectsreports.org
