Session Objectives
Introduction
Hadoop: Introduction
Hadoop: Architecture
Hadoop: MapReduce
DFS and HDFS
Summary
Introduction
What is Big Data?
Challenges of Big Data
Big Data Sources
Use Cases of Big Data
Big Data Characteristics:
• Volume: extremely large amounts of data
• Velocity: the rate at which data is generated
• Variety: structured, semi-structured, and unstructured data
Big Data Technologies
Hadoop : Introduction
Hadoop is a framework that enables distributed processing of large datasets across clusters of machines.
It is both a data management technology and a processing framework.
Why Hadoop?
• Scalable
• Cost-effective
• Fast
• Resilient to failure
Hadoop : Cluster
Sizing example: one master with 5 slaves, storing 50 TB of data over the next 5 months
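A back-of-envelope sizing for the scenario above, assuming the HDFS default replication factor of 3 (the replication value is a standard Hadoop default, not stated on the slide):

```python
# Illustrative cluster sizing: 50 TB of data, 5 slave nodes (DataNodes).
data_tb = 50        # expected data over the next 5 months
replication = 3     # HDFS default replication factor (assumption)
slaves = 5          # number of slave nodes in the cluster

raw_storage_tb = data_tb * replication   # total raw disk the cluster needs
per_slave_tb = raw_storage_tb / slaves   # raw disk needed on each slave

print(raw_storage_tb)  # 150
print(per_slave_tb)    # 30.0
```

In practice you would also budget headroom for intermediate job output and OS overhead, so real deployments provision more than this minimum.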
Hadoop : Cluster
Hadoop : Components
Hadoop1 : Daemons (NameNode, DataNode, Secondary NameNode, JobTracker, TaskTracker)
Hadoop2 : Daemons
1. NameNode
2. DataNode
3. Secondary NameNode
4. Resource Manager
5. Node Manager
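The five Hadoop 2 daemons above split into master-side and slave-side processes; a small sketch of that standard mapping, for orientation:

```python
# Node type that runs each Hadoop 2 daemon (standard Hadoop 2 layout).
daemons = {
    "NameNode": "master",            # HDFS metadata
    "Secondary NameNode": "master",  # periodic checkpoints of NameNode metadata
    "Resource Manager": "master",    # cluster-level YARN scheduling
    "DataNode": "slave",             # HDFS block storage
    "Node Manager": "slave",         # per-node YARN containers
}

masters = [d for d, role in daemons.items() if role == "master"]
slaves = [d for d, role in daemons.items() if role == "slave"]
print(masters)  # ['NameNode', 'Secondary NameNode', 'Resource Manager']
print(slaves)   # ['DataNode', 'Node Manager']
```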
Hadoop : Master Slave Architecture
Hadoop
  HDFS:    Master = NameNode         | Slave = DataNode
  MR/YARN: Master = Resource Manager | Slave = Node Manager
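The MR/YARN layer runs MapReduce jobs. A minimal pure-Python sketch of the map, shuffle, and reduce phases for the classic word count (no Hadoop required; function names are illustrative):

```python
from collections import defaultdict

def map_phase(line):
    # Emit (word, 1) pairs, as a Hadoop mapper would.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Group values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Sum the counts for each word, as a Hadoop reducer would.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big clusters", "big data"]
pairs = [p for line in lines for p in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
print(counts)  # {'big': 3, 'data': 2, 'clusters': 1}
```

In a real cluster the map tasks run in parallel near the data on the slave nodes, and the shuffle moves intermediate pairs across the network to the reducers.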
Hadoop : Master Slave Architecture
Hadoop : Modes of Operations
Hadoop1 vs Hadoop2
Hadoop1 Limitations
• Single point of failure
• Low resource utilization
• Less scalable compared to Hadoop2
Ecosystem
DFS and HDFS
DFS and HDFS
HDFS Read Operation
HDFS Write Operation
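A simplified sketch of the HDFS write path: the client asks the NameNode for target DataNodes, then each block is written to a replication pipeline of DataNodes. This is a pure-Python simulation; the class and method names are illustrative, not the real HDFS API:

```python
class DataNode:
    """Slave: stores block data."""
    def __init__(self, name):
        self.name = name
        self.blocks = set()

    def write(self, block_id):
        self.blocks.add(block_id)

class NameNode:
    """Master: holds metadata only and picks DataNodes for each block."""
    def __init__(self, datanodes, replication=3):
        self.datanodes = datanodes
        self.replication = replication
        self.block_map = {}  # block id -> DataNodes holding a replica

    def allocate(self, block_id):
        # Real HDFS picks nodes rack-aware; here we simply take the first N.
        targets = self.datanodes[: self.replication]
        self.block_map[block_id] = targets
        return targets

def write_file(namenode, block_ids):
    # For each block: ask the NameNode for a pipeline, then write down it.
    for block_id in block_ids:
        for dn in namenode.allocate(block_id):
            dn.write(block_id)

dns = [DataNode(f"dn{i}") for i in range(5)]
nn = NameNode(dns)
write_file(nn, ["blk_1", "blk_2"])
print(sorted(dns[0].blocks))  # ['blk_1', 'blk_2']
```

Note that no file bytes ever pass through the NameNode: it answers metadata queries, while block data flows client-to-DataNode and then DataNode-to-DataNode along the pipeline.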
File block and replication
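Block splitting and replication can be made concrete with a small calculation, assuming the Hadoop 2 default 128 MB block size and replication factor 3 (both are standard defaults, used here as assumptions):

```python
import math

block_size_mb = 128   # Hadoop 2 default block size
replication = 3       # default replication factor
file_size_mb = 1024   # an example 1 GB file

blocks = math.ceil(file_size_mb / block_size_mb)  # blocks the file splits into
replicas = blocks * replication                   # block copies cluster-wide
raw_mb = file_size_mb * replication               # raw disk consumed

print(blocks, replicas, raw_mb)  # 8 24 3072
```

So a 1 GB file becomes 8 blocks, stored as 24 block replicas spread across DataNodes, consuming about 3 GB of raw disk.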
Thank You

BIG DATA Session 6