Understanding Hadoop

Hadoop: Understanding HDFS, MR & YARN

Hadoop
 Hadoop is an open-source software framework for distributed storage and
distributed processing of large structured, semi-structured, and unstructured data
sets across clusters of commodity servers.

Hadoop timeline, 2002 to 2009
 2002: Nutch started by Doug Cutting & Mike Cafarella
 2003: Google GFS paper
 2004: Google MR paper
 2005: NDFS & MR added to Nutch
 2006: Doug Cutting joins Yahoo!; Hadoop spins out of Nutch as a subproject of Lucene
 2007: NY Times converts 4 TB of image archives over 100 EC2s
 2008: Hadoop becomes Apache's top-level project; Yahoo! runs the fastest sort of a TB (910 nodes, 3.5 mins); FB launches Hive; Cloudera founded
 2009: Doug Cutting joins Cloudera; fastest sort of a TB in 62 secs over 1460 nodes; petabyte sort in 16.25 hrs on 3658 nodes
 Yahoo!'s 4000-node cluster, followed by Facebook's 2300-node cluster, are the largest clusters

Hadoop core components
 HDFS: Hadoop Distributed File System
 MapReduce: programming model and distributed processing engine
 YARN (MRv2): Yet Another Resource Negotiator; resource management, the central operating platform

Design of HDFS
 Designed for
   Very large files
   Streaming data access
   Commodity hardware
 Not meant for
   Low-latency data access
   Lots of small files
   Multiple writers, arbitrary file modifications

Hadoop Storage: HDFS Architecture
 Components: DataNodes, block replication, NameNode [FsImage, edits log]
 Block replication (data replication) determines how redundantly data is stored in HDFS
   The replication factor determines the number of copies of each block
   With a factor of 2, the second copy is stored on a different rack
   With a factor of 3, one copy is stored on a different rack and one more on the same rack
 DataNodes store the actual data
   Data is stored as blocks
   The block size can be tuned
   The default is usually 64 MB or 128 MB
   The smaller the block size (and so the more blocks), the more the NameNode has to manage
 The NameNode manages block locations
   It stores the "metadata"
   The NameNode is a single point of failure
   RAM is important here, because the metadata is held in memory
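To make the block-size trade-off concrete, here is a rough sketch of how many blocks and replicas a single file produces at the two common block sizes. The function names are illustrative only and are not part of any Hadoop API; the replication factor of 3 is an assumed default.

```python
def block_count(file_size: int, block_size: int) -> int:
    """Blocks needed for one file; the last block may be only partly filled."""
    return -(-file_size // block_size)  # ceiling division without floats


def namenode_entries(file_size: int, block_size: int, replication: int = 3) -> dict:
    """Blocks the NameNode must track and replicas the DataNodes must store."""
    blocks = block_count(file_size, block_size)
    return {"blocks": blocks, "replicas": blocks * replication}


MB = 1024 * 1024
GB = 1024 * MB

# Halving the block size doubles what the NameNode has to keep in memory.
for block_mb in (64, 128):
    print(block_mb, "MB blocks:", namenode_entries(GB, block_mb * MB))
```

The same arithmetic explains why lots of small files hurt: every file costs at least one block entry, however little data it holds.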

HDFS Architecture
[Diagram: clients read from and write to DataNodes spread over Rack 1 and Rack 2. The DataNodes hold replicated blocks (b1, b2, b3, b5), replicate blocks among themselves, and exchange block ops, heartbeats and block info with the active NameNode.]
NameNode metadata: file/directory name, permissions, ownerships, assigned blocks
Example entry: /user/foo/data,3,rw-rw-r--,dev:hdfs,

Hadoop 2.x Cluster Architecture
 ResourceManager (RM)
   Master that arbitrates all the available cluster resources
 ApplicationMaster
   Negotiates resources with the ResourceManager and works with the
NodeManagers to start the containers
   Is the middleman between the NodeManagers (NM) and the RM
   Allows for greater scalability
 NodeManager (NM)
   Takes instructions from the ResourceManager and manages the resources
available on a single node
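The division of labour between the three roles can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the classes below are hypothetical stand-ins rather than the YARN API, memory is the only resource, and a request that cannot be satisfied is simply left unplaced instead of being queued.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Manages the resources of a single node."""
    name: str
    free_mb: int
    running: list = field(default_factory=list)

    def start_container(self, task: str, mb: int) -> None:
        self.free_mb -= mb
        self.running.append(task)


class ResourceManager:
    """Arbitrates cluster resources across all NodeManagers."""
    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, mb: int):
        # Return the first node with enough free memory, or None.
        for node in self.nodes:
            if node.free_mb >= mb:
                return node
        return None


class ApplicationMaster:
    """Negotiates with the RM, then asks NodeManagers to start containers."""
    def __init__(self, rm: ResourceManager):
        self.rm = rm

    def run_tasks(self, tasks, mb_per_task: int) -> dict:
        placement = {}
        for task in tasks:
            node = self.rm.allocate(mb_per_task)
            if node is not None:
                node.start_container(task, mb_per_task)
            placement[task] = node.name if node else None
        return placement


nodes = [NodeManager("nm1", 2048), NodeManager("nm2", 1024)]
am = ApplicationMaster(ResourceManager(nodes))
placement = am.run_tasks(["map-0", "map-1", "map-2", "reduce-0"], 1024)
print(placement)
```

Because each application brings its own ApplicationMaster, per-job bookkeeping is kept off the ResourceManager, which is where the scalability gain comes from.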

Federation
 Allows for multiple namespaces
 Separates namespace from block storage
   Namespace: manages directories, files and blocks. It supports file system
operations such as creation, modification, deletion and listing of files and
directories.
   Block storage: supports block-related operations such as creation,
deletion, modification and getting the location of blocks. It also takes care
of replica placement and replication, stores the blocks, and provides
read/write access to them.
 Improves scalability and isolation
 Without federation, the namespace does not scale as easily, since a single NameNode has to hold all of it

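HDFS Federation
[Diagram: Hadoop 1.0 compared with Hadoop 2.0]
 Hadoop 1.0: a single NameNode holds the one namespace (NS1) and block management, on top of Datanode 1 to Datanode n
 Hadoop 2.0: NameNodes NN 1, NN 2, ... NN n each own a namespace (NS1, NS2, ... NS n) and a block pool (Pool1, Pool2, ... Pooln); Datanode 1 to Datanode n provide the common block storage for all pools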

HDFS FED Example
[Diagram: Hadoop 2.0 cluster in which Datanode 1 to Datanode n are shared by all NameNodes]
 NN 1 (NS1): /user/data/etl
 NN 2 (NS2): /user/data/xml
 NN n (NS n): /home/streaming/data/weather
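With several namespaces, a client needs some way to decide which NameNode owns a given path. One possible approach is a client-side mount table resolved by longest matching prefix; the sketch below assumes that approach and uses the three paths from the example above. The table and function names are illustrative, not an actual Hadoop interface.

```python
# Hypothetical mount table built from the federation example.
MOUNT_TABLE = {
    "/user/data/etl": "NN1",
    "/user/data/xml": "NN2",
    "/home/streaming/data/weather": "NNn",
}


def resolve_namenode(path: str, mount_table=MOUNT_TABLE):
    """Return the NameNode owning `path`, or None if no mount point matches."""
    best = None
    for prefix in mount_table:
        # Match whole path components only, so /user/data/xmlfoo does not hit /user/data/xml.
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return mount_table[best] if best is not None else None


print(resolve_namenode("/user/data/etl/2016/part-00000"))
print(resolve_namenode("/tmp/scratch"))
```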

HA
 Prior to Hadoop 2.0
   One NameNode for metadata management
   Single point of failure
 HDFS High Availability
   Two NameNodes in the same cluster
   Active NameNode: responsible for all client operations
   Standby NameNode: maintains enough state to provide a fast failover
 Shared storage
   The Active NN writes the edit log
   The Standby NN reads the edit log and applies it to its own namespace
   During failover, the Standby NN reads all remaining edits and then transitions to the Active state
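The shared edit log mechanism described above can be modelled in a few lines. This is a simplified sketch, not HDFS code: the `NameNode` class, the `(op, path)` edit format and the in-memory list standing in for shared storage are all assumptions made for illustration, and fencing of the old Active is reduced to flipping a flag.

```python
class NameNode:
    def __init__(self, name: str):
        self.name = name
        self.state = "standby"
        self.namespace = set()
        self.applied = 0  # number of shared edits already applied locally

    def apply_edits(self, edit_log: list) -> None:
        """Replay any edits this node has not seen yet."""
        for op, path in edit_log[self.applied:]:
            if op == "create":
                self.namespace.add(path)
            elif op == "delete":
                self.namespace.discard(path)
        self.applied = len(edit_log)

    def write(self, edit_log: list, op: str, path: str) -> None:
        """Only the Active NameNode may append to the shared edit log."""
        if self.state != "active":
            raise RuntimeError(f"{self.name} is not active")
        edit_log.append((op, path))
        self.apply_edits(edit_log)

    def become_active(self, edit_log: list) -> None:
        """Catch up on all remaining edits, then take over."""
        self.apply_edits(edit_log)
        self.state = "active"


shared_edits = []
nn1, nn2 = NameNode("nn1"), NameNode("nn2")
nn1.become_active(shared_edits)
nn1.write(shared_edits, "create", "/user/foo/data")
nn1.write(shared_edits, "create", "/tmp/scratch")
nn2.apply_edits(shared_edits)           # the Standby tails the log
nn1.write(shared_edits, "delete", "/tmp/scratch")

nn1.state = "standby"                   # failover: old Active steps down
nn2.become_active(shared_edits)         # Standby replays the last edit first
print(nn2.state, sorted(nn2.namespace))
```

Because the Standby has already applied most of the log, only the tail has to be replayed at failover time, which is what keeps the switch fast.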

High Availability
http://www.slideshare.net/cloudera/ha-phase-2-with-atm-updates

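Anatomy of File Write
[Diagram: a client node (client JVM running the HDFS client), the NameNode, and a pipeline of DataNodes on Rack 1 and Rack 2 holding blocks b1, b2, b3]
1. create: the HDFS client calls create on DistributedFileSystem
2. create: DistributedFileSystem asks the NameNode to create the file
3. write: the client writes to the FSDataOutputStream
4. write packet: packets are sent along the pipeline of DataNodes
5. ack packet: acknowledgements travel back along the pipeline
6. close: the client closes the stream
7. complete: the NameNode is told that the file is complete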
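Steps 4 and 5 of the write path can be sketched as follows. This is a toy model under stated assumptions: the data queue and ack queue are modelling devices chosen here, the 4-byte packet size is arbitrary, and every DataNode is assumed to acknowledge every packet (no failure handling).

```python
from collections import deque


def write_through_pipeline(data: bytes, pipeline: list, packet_size: int = 4):
    """Push `data` through the DataNode pipeline packet by packet.

    Returns what each DataNode stored and how many packets are still unacknowledged.
    """
    data_queue = deque(data[i:i + packet_size] for i in range(0, len(data), packet_size))
    ack_queue = deque()
    stored = {dn: bytearray() for dn in pipeline}

    while data_queue:
        packet = data_queue.popleft()
        ack_queue.append(packet)           # kept until every DataNode has acked it
        acks = []
        for dn in pipeline:                # step 4: each node stores, then forwards
            stored[dn] += packet
            acks.append(dn)
        if acks == list(pipeline):         # step 5: acks return along the pipeline
            ack_queue.popleft()

    return {dn: bytes(buf) for dn, buf in stored.items()}, len(ack_queue)


replicas, pending = write_through_pipeline(b"hello hdfs", ["dn1", "dn2", "dn3"])
print(replicas, pending)
```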

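Anatomy of File Read
[Diagram: a client node (client JVM running the HDFS client), the NameNode, and DataNodes on Rack 1 and Rack 2 holding blocks b1, b2, b3]
1. open: the HDFS client opens the file on DistributedFileSystem
2. get block locations: DistributedFileSystem asks the NameNode where the blocks are
3. read: the client reads from the FSDataInputStream
4. read: the stream reads the first block from a DataNode
5. read: the stream reads the next block from a DataNode holding it
6. close: the client closes the stream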
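The read path can be sketched in the same spirit: ask the NameNode for block locations once, then fetch each block directly from a DataNode that holds it. The block layout mirrors the diagram (two DataNodes with b1 to b3, one with only b2 and b3); the node names, block contents and the `dead` parameter are made up for illustration, and picking the first live replica stands in for whatever replica selection a real client does.

```python
# Hypothetical cluster state matching the block layout in the diagram.
DATANODES = {
    "dn1": {"b1": b"cats,", "b2": b"dogs,", "b3": b"cows"},
    "dn2": {"b2": b"dogs,", "b3": b"cows"},
    "dn3": {"b1": b"cats,", "b2": b"dogs,", "b3": b"cows"},
}
NAMENODE = {"/user/foo/data": ["b1", "b2", "b3"]}  # file -> ordered block list


def get_block_locations(path: str):
    """Step 2: the NameNode returns, per block, the DataNodes holding a replica."""
    return [
        (block, [dn for dn, blocks in DATANODES.items() if block in blocks])
        for block in NAMENODE[path]
    ]


def read_file(path: str, dead=frozenset()) -> bytes:
    """Steps 3 to 5: read block after block straight from the DataNodes."""
    out = b""
    for block, holders in get_block_locations(path):
        live = [dn for dn in holders if dn not in dead]
        if not live:
            raise IOError(f"no live replica for {block}")
        out += DATANODES[live[0]][block]
    return out


print(read_file("/user/foo/data"))
print(read_file("/user/foo/data", dead={"dn1"}))  # another replica is used instead
```

Note that the file data never passes through the NameNode; it only hands out locations.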

Map Reduce 1
[Diagram: client node (client JVM running the MR job), JobTracker node, TaskTracker node (TaskTracker plus a child JVM running a map or reduce task), and HDFS]
1. Run job
2. Get new job ID
3. Copy job resources
4. Submit job
5. Init job
6. Retrieve input splits
7. Heartbeat
8. Retrieve job resources
9. Launch
10. Run

YARN (Map Reduce 2)
[Diagram: client node (client JVM running the MR job), ResourceManager, a NodeManager node hosting the MR App Master, a NodeManager node hosting the task JVM (YarnChild running a map or reduce task), and HDFS]
1. Run job
2. Get new application ID
3. Copy job resources
4. Submit application
5a. Start container
5b. Launch (MR App Master)
6. Init job
7. Retrieve input splits
8. Allocate resources
9a. Start container
9b. Launch (task JVM)
10. Retrieve job resources
11. Run

Coherency Model
 The first block becomes visible to readers once more than a block’s worth of data has been
written
 The block currently being written is the one that is not always visible to readers
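A simplified way to reason about this rule: only blocks that the writer has already moved past are guaranteed to be visible. The helper below is a rough model under that assumption; it ignores explicit flushes and is not derived from HDFS source code.

```python
def visible_bytes(written: int, block_size: int) -> int:
    """Bytes a new reader is guaranteed to see, counting completed blocks only."""
    if written <= block_size:
        return 0                      # still inside the first block
    completed_blocks = (written - 1) // block_size
    return completed_blocks * block_size


BLOCK = 128
for written in (0, 128, 129, 300):
    print(written, "written ->", visible_bytes(written, BLOCK), "visible")
```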

map reduce
Word count example, stage by stage:
 input: cats, dogs, cows, cats, dogs, dogs, cows, cats
 split: [cats, dogs, cows] [cows, cats] [cats, dogs, dogs]
 map: [(cats, 1) (dogs, 1) (cows, 1)] [(cats, 1) (cows, 1)] [(cats, 1) (dogs, 1) (dogs, 1)]
 shuffle: cats -> (1, 1, 1); cows -> (1, 1); dogs -> (1, 1, 1)
 reduce: (cats, 3) (cows, 2) (dogs, 3)
 output: cats,3 cows,2 dogs,3
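The same word count can be traced in plain Python, one function per stage. This is a single-process sketch of the data flow only; the function names are chosen here for illustration and nothing in it uses the Hadoop API.

```python
from collections import defaultdict


def map_phase(split: str):
    """Emit (word, 1) for every word in one input split."""
    return [(word.strip(), 1) for word in split.split(",") if word.strip()]


def shuffle(mapped):
    """Group values by key across all map outputs, with keys in sorted order."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups):
    """Sum the values that share a key."""
    return {key: sum(values) for key, values in groups.items()}


splits = ["cats, dogs, cows", "cows, cats", "cats, dogs, dogs"]
mapped = [map_phase(s) for s in splits]
grouped = shuffle(mapped)
counts = reduce_phase(grouped)
print(counts)
```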

Shuffle: map output is sorted by key
[Diagram: on each DataNode (1 to n), a map task reads its InputSplit from HDFS and writes its output into a 100 MB memory buffer; the buffer is spilled to disk as partitioned files (p1, p2, p3, p4), which are merged and then fetched by the reducer responsible for each partition; each reducer writes its output to HDFS.]
Map:
• Map takes <k,v>
• Applies map() to <k,v>
• Writes the output to the memory buffer
Spill:
• Spills data to disk
• Partitions data
• Sorts by key
• Combine()
Intermediate map output files:
• Sorted by key
• Merged to one file per partition
• Part-m-00000
Reduce:
• Sort/merge, then Reduce
• Output to HDFS
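The buffer, spill and merge steps can be imitated with sorted runs and a k-way merge. The sketch below makes several simplifying assumptions: the buffer limit counts records rather than bytes, the partition function is a deterministic stand-in (not Hadoop's), and no combiner is applied.

```python
import heapq


def toy_partition(key: str, num_partitions: int) -> int:
    """Deterministic stand-in partitioner, for illustration only."""
    return sum(key.encode("utf-8")) % num_partitions


def map_side_spills(records, num_partitions: int, buffer_limit: int):
    """Buffer map output; when the buffer fills, spill a run sorted by (partition, key)."""
    spills, buffer = [], []
    for key, value in records:
        buffer.append((toy_partition(key, num_partitions), key, value))
        if len(buffer) >= buffer_limit:
            spills.append(sorted(buffer))
            buffer = []
    if buffer:
        spills.append(sorted(buffer))
    return spills


def merge_spills(spills):
    """Merge the sorted spill runs into one run, still ordered by (partition, key)."""
    return list(heapq.merge(*spills))


records = [("dogs", 1), ("cats", 1), ("cows", 1), ("cats", 1), ("dogs", 1)]
spills = map_side_spills(records, num_partitions=2, buffer_limit=2)
merged = merge_spills(spills)
print(merged)
```

Since every spill is already sorted, the final merge never needs to re-sort; it only interleaves the runs.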

MR gotchas
 Map takes input splits as key-value pairs
 Output from the mapper is always sorted, but only by key
   context.write(outKey, outValue);
   The result is then sorted by outKey
 The default partitioner hashes the keys
 The reducer reduces a set of intermediate values which share a key to a smaller set
of values
 The reduce() function is called once for each key
 setNumReduceTasks(0) makes the job map-only; with no reduce phase, the map output is written straight to the output directory and is not sorted
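Hash partitioning comes down to "hash the key, clear the sign bit, take the remainder by the number of reducers". The sketch below assumes that scheme and uses an emulation of Java's `String.hashCode()` as the hash, which is a stand-in chosen for illustration; Hadoop key types such as `Text` compute their own hash, so real partition numbers may differ.

```python
def java_string_hash(s: str) -> int:
    """Emulate Java's String.hashCode() as a signed 32-bit int (BMP characters only)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def partition(key: str, num_reduce_tasks: int) -> int:
    """Pick the reducer for a key: non-negative hash modulo the reducer count."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reduce_tasks


for key in ("cats", "dogs", "cows"):
    print(key, "->", partition(key, 3))
```

The property that matters is that equal keys always land on the same reducer, which is what lets reduce() see all values for a key together.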

Hadoop Distributions
 Cloudera
 Hortonworks
 MapR
 Pivotal
 Microsoft Azure HDInsight
 IBM BigInsights
 https://www.cloudera.com/content/dam/www/static/documents/analyst-reports/forrester-wave-big-data-hadoop-distributions.pdf

What’s next
 Hadoop IO
 Serialization
 Avro, Sequence, Map Files
 File Formats
 Text, Binary, XML, DB
 Joins
 Map-side, Reduce-side
 Secondary Sorting
 Side Data Distribution
 Distributed Cache
 Using jobconfig

References
 Hadoop: The Definitive Guide
 https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
 http://stackoverflow.com/questions/24771006/is-the-output-of-map-phase-of-the-mapreduce-job-always-sorted
 http://www.slideshare.net/AdamKawa/apache-hadoop-yarn-namenode-ha-hdfs-federation
 http://www.datanami.com/2016/05/11/open-source-tour-de-force-apache-big-data-2016/
