Essentials of Hadoop for Big Data

Files are broken up into ‘blocks’

[Diagram: log files a.txt, b.txt, and e.txt are broken up into Blocks 1 through 5.]
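
The splitting step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not HDFS code: the `Block` record, the `split_into_blocks` helper, and the 200 MiB file size are hypothetical, and the 128 MiB block size is an assumption (it matches the default in recent Hadoop releases, but a cluster may be configured differently).

```python
from dataclasses import dataclass
from typing import List

# Assumed block size of 128 MiB; a real cluster may use a different setting.
BLOCK_SIZE = 128 * 1024 * 1024


@dataclass(frozen=True)
class Block:
    """Hypothetical record describing one block of a file."""
    file_name: str
    index: int    # position of the block within its file, starting at 0
    offset: int   # byte offset in the file where the block starts
    length: int   # number of bytes in the block (the last one may be shorter)


def split_into_blocks(file_name: str, file_size: int,
                      block_size: int = BLOCK_SIZE) -> List[Block]:
    """Cut a file of file_size bytes into consecutive fixed-size blocks."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks: List[Block] = []
    offset = 0
    while offset < file_size:
        # Every block is full-size except possibly the final one.
        length = min(block_size, file_size - offset)
        blocks.append(Block(file_name, len(blocks), offset, length))
        offset += length
    return blocks


if __name__ == "__main__":
    # A hypothetical 200 MiB log file yields one full block and one partial block.
    for block in split_into_blocks("a.txt", 200 * 1024 * 1024):
        print(block)
```

Under these assumptions, a 200 MiB file becomes two blocks of 128 MiB and 72 MiB. The final block only occupies as many bytes as remain in the file.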

Blocks are replicated (3x by default) throughout the cluster on Data Nodes

[Diagram: Blocks 1 through 5 shown alongside four Data Nodes, Node 1 through Node 4.]

Blocks are replicated (3x by default) throughout the cluster on Data Nodes (continued)

[Diagram: copies of Blocks 1 through 5 distributed across Node 1, Node 2, Node 3, and Node 4.]
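
The distribution shown on this slide can be imitated with a small placement sketch. It is a simplified round-robin scheme chosen for illustration, not HDFS's actual placement policy (which also takes rack topology into account). The `place_replicas` function is hypothetical, and the replication factor of 3 is assumed from the slide.

```python
from typing import Dict, List


def place_replicas(block_ids: List[int], nodes: List[str],
                   replication: int = 3) -> Dict[str, List[int]]:
    """Assign each block to `replication` distinct nodes, round-robin.

    Simplified illustration only: real HDFS placement is rack-aware.
    """
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    placement: Dict[str, List[int]] = {node: [] for node in nodes}
    for i, block_id in enumerate(block_ids):
        # Start at a different node for each block, then take the next
        # `replication` nodes in order so no node holds the same block twice.
        for r in range(replication):
            node = nodes[(i + r) % len(nodes)]
            placement[node].append(block_id)
    return placement


if __name__ == "__main__":
    nodes = ["Node 1", "Node 2", "Node 3", "Node 4"]
    for node, blocks in place_replicas([1, 2, 3, 4, 5], nodes).items():
        print(node, blocks)
    # Node 1 [1, 3, 4, 5]
    # Node 2 [1, 2, 4, 5]
    # Node 3 [1, 2, 3, 5]
    # Node 4 [2, 3, 4]
```

With five blocks, four nodes, and three copies per block, the sketch stores 15 block copies in total, and every block sits on three different nodes. The exact layout in the slide's diagram may differ from this round-robin result.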

Block replication on HDFS

  • 1.
    Files are broken up into ‘blocks’. Diagram: log files a.txt, b.txt, and e.txt are broken up into Blocks 1 through 5.
  • 2.
    Blocks are replicated (3x by default) throughout the cluster on Data Nodes. Diagram: Blocks 1 through 5 shown alongside Node 1 through Node 4.
  • 3.
    Blocks are replicated (3x by default) throughout the cluster on Data Nodes. Diagram: copies of Blocks 1 through 5 distributed across Node 1 through Node 4.