Workshop on Data Analytics
Using Big Data Tools 2016
Bharathiar University
K. Santhiya, Ph.D. Research Scholar; Dr. V. Bhuvaneswari, Asst. Professor, Dept. of Comp. Appl., Bharathiar University
WDABT 2016
Introduction to Hadoop
Presented by
K. Santhiya
Ph.D. Research Scholar
Department of Computer Applications
Bharathiar University
Under the guidance of
Dr. V. Bhuvaneswari
Assistant Professor
Department of Computer Applications
Bharathiar University
AGENDA
• WORLD OF DATA
  - Few Instances
• CONVENTIONAL APPROACHES
  - Limitations
• HADOOP FRAMEWORK
  - Terminology Review
• HADOOP COMPONENTS
  - HDFS & MAPREDUCE
• HDFS – IN DETAIL
• HADOOP ECOSYSTEM
DATA EXPLOSION
2.5 quintillion bytes of data are created each day.
WORLD WIDE DATA
[Figure: data generated since the beginning of time compared with data generated in the last two years]
THE WORLD OF DATA
[Infographic values: 2.9 million, 375 MB, 20 hours, 24 PB, 50 million, 700 billion, 1.3 exabytes, 72 items]
A big data file starts at a minimum size of 1 terabyte.
CONVENTIONAL APPROACHES
RDBMS
OS FILE SYSTEM
SQL QUERIES
CUSTOM FRAMEWORK
* C / C++
* PERL
* PYTHON
ISSUES IN LEGACY SYSTEMS
LIMITED STORAGE CAPACITY
LIMITED PROCESSING CAPACITY
NO SCALABILITY
SINGLE POINT OF FAILURE
SEQUENTIAL PROCESSING
RDBMSs CAN HANDLE ONLY STRUCTURED DATA
REQUIRES PREPROCESSING OF DATA
INFORMATION IS COLLECTED ACCORDING TO CURRENT BUSINESS NEEDS
How do we mine (and mind) all this data?
HOW DO WE RESOLVE ALL THESE ISSUES?
Mr. Hadoop says he has a solution to our big problem!
COMPANIES USING HADOOP
WHAT IS HADOOP?
Apache Hadoop is a framework that allows for the distributed processing of large datasets across clusters of commodity computers using a simple programming model.
Concept
Moving computation is more efficient than moving large data.
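The "simple programming model" in the definition is commonly MapReduce, which the agenda lists as a Hadoop component. As a rough illustration only, the single-process Python sketch below mimics the map, shuffle, and reduce phases of a word count. The function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and are not part of any Hadoop API; a real job would run these phases in parallel on the nodes that already hold the data.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data tools", "big data"]))
```

Each map call needs only its own line, which is why the work can be sent to wherever the data lives instead of moving the data.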
STORAGE
COMPUTATION
COMPLEXITY
TWO DAEMONS OF HADOOP
ARCHITECTURE
TERMINOLOGY REVIEW
Cluster
  Rack 1: Node 1, Node 2, ..., Node N
  Rack 2: Node 1, Node 2, ..., Node N
  ...
HADOOP CLUSTER ARCHITECTURE
HADOOP CORE SERVICES
i. NAME NODE
ii. DATA NODE
iii. RESOURCE MANAGER
iv. APPLICATION MASTER
v. NODE MANAGER
vi. SECONDARY NAME NODE
HDFS – REAL LIFE CONNECT
• A college library was gifted a massive collection of books by a patron. The
books were very popular titles. The librarian decided to arrange the books in
a small rack and to distribute multiple copies of each book across other racks,
so that students could find the books easily. Similarly, HDFS creates multiple
copies of a data block and keeps them on separate systems for easy access.
WHAT IS HDFS
• Hadoop Distributed File System
  - Highly fault-tolerant, distributed, reliable, scalable file system for data storage.
  - Stores multiple copies of data on different nodes.
  - A file is split into blocks that are stored on multiple machines.
  - A Hadoop cluster typically has a single name node and a number of data nodes.
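To make the name node and data node roles more concrete, here is a toy Python sketch of the kind of bookkeeping a name node performs: a map from block IDs to the data nodes that hold a replica. The round-robin assignment, the function names, and the block and node labels are all assumptions made for illustration; real HDFS placement is rack-aware and considerably more involved.

```python
from itertools import cycle


def build_block_map(block_ids, data_nodes, replication=3):
    """Assign each block to `replication` distinct data nodes, round robin.

    A toy stand-in for name node metadata, not real HDFS placement logic.
    """
    if replication > len(data_nodes):
        raise ValueError("replication factor exceeds number of data nodes")
    node_iter = cycle(data_nodes)
    return {
        block_id: [next(node_iter) for _ in range(replication)]
        for block_id in block_ids
    }


def still_readable(block_map, failed_nodes):
    """Return True if every block keeps at least one replica on a live node."""
    failed = set(failed_nodes)
    return all(
        any(node not in failed for node in nodes)
        for nodes in block_map.values()
    )


if __name__ == "__main__":
    blocks = build_block_map(["blk_1", "blk_2"], ["dn1", "dn2", "dn3", "dn4"])
    print(blocks)
    print(still_readable(blocks, ["dn1", "dn2"]))
```

Because every block lives on several nodes, losing one or two data nodes in this toy layout still leaves each block readable, which is the fault tolerance the slide describes.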
HDFS BLOCKS
• Files are broken into large blocks.
  - Typically 128 MB block size
  - Blocks are replicated for reliability
  - One replica on the local node, a second replica on a node in a remote rack
  - Third replica on a different node in the same remote rack; additional replicas are placed randomly
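The placement rule can be sketched in a few lines of Python. This sketch assumes the default policy described in the HDFS architecture documentation: first replica on the writer's node, second on a node in a different rack, third on another node in that same remote rack. The topology dictionary, the node and rack names, and the function name are hypothetical, and candidates are picked in sorted order only so the result is repeatable, whereas HDFS chooses among eligible nodes randomly.

```python
def place_replicas(topology, writer_node):
    """Pick three nodes for one block under a simplified rack-aware policy.

    topology maps a rack name to the list of node names in that rack.
    """
    local_rack = next(
        (rack for rack, nodes in topology.items() if writer_node in nodes),
        None,
    )
    if local_rack is None:
        raise ValueError(f"unknown node: {writer_node}")

    # A usable remote rack must offer two distinct nodes for replicas 2 and 3.
    remote_racks = [
        rack for rack in sorted(topology)
        if rack != local_rack and len(topology[rack]) >= 2
    ]
    if not remote_racks:
        raise ValueError("need a remote rack with at least two nodes")

    second, third = sorted(topology[remote_racks[0]])[:2]
    return [writer_node, second, third]


if __name__ == "__main__":
    racks = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"]}
    print(place_replicas(racks, "n1"))
```

Spreading replicas over two racks means that the failure of a whole rack still leaves at least one copy of the block reachable.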
HDFS BLOCKS CONTD.
ADVANTAGES OF HDFS BLOCKS
Fixed size
Chunk of file < block size: only the needed space is used.
E.g., with a 128 MB block size, a 420 MB file is split into three 128 MB blocks and one 36 MB block.
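The block arithmetic for the 420 MB example can be checked with a short Python sketch. It assumes the 128 MB block size mentioned on the previous slide; the function name split_into_blocks is hypothetical.

```python
def split_into_blocks(file_size_mb, block_size_mb=128):
    """Return the size in MB of each block a file would occupy.

    Every block is full except possibly the last, which holds only the
    remaining data rather than a whole block's worth of space.
    """
    if file_size_mb < 0 or block_size_mb <= 0:
        raise ValueError("sizes must be non-negative and block size positive")
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    print(split_into_blocks(420))  # [128, 128, 128, 36]
```

The last block is 36 MB rather than 128 MB, which is the "only needed space is used" advantage listed on the slide.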
HDFS OPERATION PRINCIPLE
NAME NODE
DATA NODE
SECONDARY NAME NODE
HDFS ARCHITECTURE
HDFS – BLOCK REPLICATION ARCHITECTURE
NAMENODE IN HA MODE
NAME NODE HA ARCHITECTURE
BUSINESS SCENARIO
Olivia Tyler is the EVP of IT Operations at Nutri Worldwide, Inc., and she has
decided to use HDFS for storing big data. She will use the HDFS shell to store
the data in a Hadoop file system and will execute various commands on it.
HADOOP SHELL COMMANDS
hadoop fs -mkdir /learning
hadoop fs -copyFromLocal test.txt /learning
hadoop fs -ls /learning
hadoop fs -cat /learning/test.txt
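When the same sequence has to be repeated for several files, it can be scripted. The Python sketch below builds the four commands from the slide as argument lists and runs them in order. The helper names hdfs_commands and run_all are hypothetical, and the sketch assumes a working `hadoop` executable on the PATH with a reachable cluster.

```python
import os
import shutil
import subprocess


def hdfs_commands(local_file, hdfs_dir):
    """Build argument lists for mkdir, copyFromLocal, ls, and cat."""
    target = f"{hdfs_dir.rstrip('/')}/{os.path.basename(local_file)}"
    return [
        ["hadoop", "fs", "-mkdir", hdfs_dir],
        ["hadoop", "fs", "-copyFromLocal", local_file, hdfs_dir],
        ["hadoop", "fs", "-ls", hdfs_dir],
        ["hadoop", "fs", "-cat", target],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    if shutil.which("hadoop") is None:
        raise RuntimeError("hadoop executable not found on PATH")
    for argv in commands:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    for argv in hdfs_commands("test.txt", "/learning"):
        print(" ".join(argv))
```

Printing the commands first, as the main block does, is a cheap way to review them before calling run_all against a real cluster.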
HADOOP ECOSYSTEM COMPONENTS
DATA TRANSFER COMPONENTS
DATA STORE COMPONENTS
• The following are the data store components of the Hadoop ecosystem.
HBASE: distributed, scalable big data store
CASSANDRA: scalable, consistent, distributed, structured key-value store
ACCUMULO: sorted, distributed key-value data storage and retrieval system
SERIALIZATION COMPONENTS
• The serialization components are Avro, Trevni, and Thrift.
• Avro is a data serialization system.
• Trevni is a column file format designed to permit compatible, independent
implementations that read and/or write files in this format.
• Thrift is a framework for scalable, cross-language services development.
JOB EXECUTION COMPONENTS
• The following are the job execution components:
WORK MANAGEMENT COMPONENTS
CONCLUSION
