Hadoop manages:
› processor time
› memory
› disk space
› network bandwidth
 Does not have a security model
 Can handle HW failure
www.kellytechno.com
 Issues:
› race conditions
› synchronization
› deadlock
 i.e., same issues as distributed OS &
distributed filesystem
 Grid computing: (What is this?)
 e.g. Condor
› MPI model is more complicated
› does not automatically distribute data
› requires separate managed SAN
 Hadoop:
› simplified programming model
› data distributed as it is loaded
 HDFS splits large data files across machines
 HDFS replicates data
› failure causes additional replication
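A minimal sketch of what "failure causes additional replication" could look like. The function name, the block IDs, and the DataNode names are hypothetical; real HDFS makes this decision inside the NameNode with more sophisticated policies.

```python
def re_replicate(block_locations, live_nodes, target=3):
    """Drop failed nodes from each block's replica list, then top up to target."""
    for block, nodes in block_locations.items():
        survivors = [n for n in nodes if n in live_nodes]
        # Candidate nodes are live nodes that do not already hold this block.
        candidates = [n for n in sorted(live_nodes) if n not in survivors]
        needed = max(0, target - len(survivors))
        block_locations[block] = survivors + candidates[:needed]
    return block_locations

# Hypothetical cluster state: dn2 has just failed.
locations = {"blk_1": ["dn1", "dn2", "dn3"], "blk_2": ["dn2", "dn3", "dn4"]}
live = {"dn1", "dn3", "dn4", "dn5"}
healed = re_replicate(locations, live)
```

After the call, every block is back to three replicas and none of them lives on the failed node.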
 Core idea: records are processed in
isolation
 Benefit: reduced communication
 Jargon:
› Mapper – task that processes records
› Reducer – task that aggregates results from
mappers
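The jargon above can be illustrated with a word count written in plain Python. This is only a single-process sketch of the idea, not Hadoop's API: the function names and the in-memory grouping step are assumptions made for illustration, and in Hadoop the framework performs the grouping across machines.

```python
from collections import defaultdict

def mapper(record):
    """Process one record in isolation: emit (word, 1) for each word."""
    for word in record.split():
        yield word.lower(), 1

def reducer(key, values):
    """Aggregate all values that the mappers emitted for one key."""
    return key, sum(values)

def run_job(records):
    # Group mapper output by key; in Hadoop the framework does this step.
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    return dict(reducer(k, v) for k, v in grouped.items())

counts = run_job(["the quick fox", "the lazy dog", "The end"])
```

Because each mapper call sees only its own record, the records can be processed on different machines without the mappers talking to each other.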
How is the previous picture different from
normal grid/cluster computing?
Grid/cluster:
Programmer manages communication via MPI
vs
Hadoop:
communication is implicit
Hadoop manages data transfer and cluster topology
issues
 Hadoop overhead
› MPI does better for small numbers of nodes
 Hadoop – flat scalability pays off with
large data
› Little extra work to go from few to many nodes
 MPI – requires explicit refactoring from small
to larger number of nodes
 NFS: the Network File System
› Saw this in OS class
› Supports file system exporting
› Supports mounting of remote file system
[Figure: mounts and cascading mounts]
 Establishes logical connection between server
and client.
 Mount operation: name of remote directory &
name of server
› Mount request is mapped to corresponding RPC
and forwarded to mount server running on server
machine.
› Export list – specifies local file systems that server
exports for mounting, along with names of
machines that are permitted to mount them.
 Server returns a file handle, a key for further
accesses.
 File handle – a file-system identifier, and an
inode number to identify the mounted directory
 The mount operation changes only the user’s
view and does not affect the server side.
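The mount steps above can be sketched as a toy model. Everything here is hypothetical (the export list contents, the lookup table, the host names); a real NFS mount goes through RPC to the mount server rather than a local function call.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FileHandle:
    fs_id: int   # file-system identifier
    inode: int   # inode number of the mounted directory

# Hypothetical export list: exported directory -> hosts allowed to mount it.
EXPORTS = {"/export/home": {"client-a", "client-b"}}
# Hypothetical table standing in for the server's real file systems.
INODES = {"/export/home": (7, 2)}

def mount(directory, client):
    """Return a file handle if the export list permits this client."""
    allowed = EXPORTS.get(directory)
    if allowed is None or client not in allowed:
        raise PermissionError(f"{client} may not mount {directory}")
    fs_id, inode = INODES[directory]
    return FileHandle(fs_id, inode)

handle = mount("/export/home", "client-a")
```

The client keeps the returned handle and presents it on later accesses; nothing on the server side changes as a result of the mount.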
NFS Advantages
› Transparency – clients unaware of local vs remote
› Standard operations - open(), close(), fread(), etc.
NFS disadvantages
› Files in an NFS volume reside on a single machine
› No reliability guarantees if that machine goes down
› All clients must go to this machine to retrieve their
data
 HDFS Advantages:
› designed to store terabytes or petabytes
› data spread across a large number of machines
› supports much larger file sizes than NFS
› stores data reliably (replication)
 HDFS Advantages:
› provides fast, scalable access
› serves more clients by adding more machines
› integrates with MapReduce, so computation can run
local to the data
 HDFS Disadvantages
› Not as general-purpose as NFS
› Design restricts use to a particular class of
applications
› HDFS is optimized for streaming read performance;
not good at random access
 HDFS Disadvantages
› Write once read many model
› Updating a file after it has been closed is not
supported (can’t append data)
› System does not provide a mechanism for local
caching of data
 HDFS – block-structured file system
 File broken into blocks distributed among
DataNodes
 DataNodes – machines used to store data blocks
 Target machines chosen randomly on a block-by-block
basis
 Supports file sizes far larger than a single-machine
DFS
 Each block replicated across a number of
machines (3, by default)
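A minimal sketch of the placement idea described above: split a file into fixed-size blocks and pick three distinct DataNodes at random for each block. The function and node names are hypothetical, and real HDFS placement also takes rack topology into account, which this sketch ignores.

```python
import random

BLOCK_SIZE = 64 * 1024 * 1024   # 64 MB, the block size these slides use
REPLICATION = 3                 # default replication factor

def place_blocks(file_size, datanodes, rng):
    """Return one list of target DataNodes per block of the file."""
    num_blocks = -(-file_size // BLOCK_SIZE)   # ceiling division
    # rng.sample picks distinct nodes, so no node holds a block twice.
    return [rng.sample(datanodes, REPLICATION) for _ in range(num_blocks)]

nodes = [f"dn{i}" for i in range(1, 9)]        # hypothetical 8-node cluster
layout = place_blocks(200 * 1024 * 1024, nodes, random.Random(0))
```

A 200 MB file needs four blocks here (three full blocks plus a partial one), and each block ends up on three different machines.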
 Expects large file size
› Small number of large files
› Hundreds of MB to GB each
 Expects sequential access
 Default block size in HDFS is 64 MB (later Hadoop releases default to 128 MB)
 Result:
› Reduces amount of metadata storage per file
› Supports fast streaming of data (large amounts
of contiguous data)
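The metadata saving can be made concrete with a little arithmetic. The 4 KB figure below is an assumption standing in for a typical local file system block size; it does not come from the slides.

```python
def block_count(file_size, block_size):
    """Number of blocks needed to hold file_size bytes (ceiling division)."""
    return -(-file_size // block_size)

GB = 1024 ** 3
hdfs_blocks = block_count(1 * GB, 64 * 1024 ** 2)   # 64 MB HDFS blocks
small_blocks = block_count(1 * GB, 4 * 1024)        # assumed 4 KB blocks
ratio = small_blocks // hdfs_blocks
```

For a 1 GB file this gives 16 blocks at 64 MB versus 262,144 blocks at 4 KB, so there are far fewer block entries to track per file.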
 HDFS expects to read a block start-to-finish
› Useful for MapReduce
› Not good for random access
› Not a good general purpose file system
 HDFS files are NOT part of the ordinary file system
 HDFS files are in separate name space
 Not possible to interact with files using ls, cp, mv,
etc.
 Don’t worry: HDFS provides similar utilities
 Metadata handled by NameNode
› Deals with synchronization by allowing only
one machine to handle it
› Stores metadata for the entire file system
› Not much data: file names, permissions, &
locations of each block of each file
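A rough picture of the kind of table the NameNode keeps, as a plain Python dictionary. The path, block IDs, and DataNode names are invented for illustration, and the real NameNode's data structures are more involved.

```python
# Hypothetical namespace: file name -> permissions and block locations.
namespace = {
    "/logs/2012-01.log": {
        "permissions": "rw-r--r--",
        "blocks": [
            {"id": "blk_1", "locations": ["dn1", "dn4", "dn7"]},
            {"id": "blk_2", "locations": ["dn2", "dn4", "dn5"]},
        ],
    }
}

def locate(path):
    """Answer a client's question: which DataNodes hold each block of path?"""
    return [(b["id"], b["locations"]) for b in namespace[path]["blocks"]]

plan = locate("/logs/2012-01.log")
```

The table holds only names, permissions, and locations; the file contents themselves stay on the DataNodes.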
 What happens if the NameNode fails?
› Bigger problem than failed DataNode
› Better be using RAID ;-)
› Cluster is kaput until NameNode restored
 Not exactly relevant but:
› DataNodes are more likely to fail.
› Why?
 First download and unzip a copy of Hadoop
(http://hadoop.apache.org/releases.html)
 Or better yet, follow this lecture first ;-)
 Important links:
› Hadoop website http://hadoop.apache.org/index.html
› Hadoop Users Guide http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
› 2012 Edition of Hadoop User’s Guide http://it-ebooks.info/book/635/
 HDFS configuration is in conf/hadoop-defaults.xml
› Don’t change this file.
› Instead modify conf/hadoop-site.xml
› Be sure to replicate this file across all nodes in your cluster
› Format of entries in this file:
<property>
<name>property-name</name>
<value>property-value</value>
</property>
Necessary settings:
1. fs.default.name – describes the NameNode
› Format: protocol specifier, hostname, port
› Example: hdfs://punchbowl.cse.sc.edu:9000
2. dfs.data.dir – path on the local file system in which the
DataNode instance should store its data
› Format: pathname
› Example: /home/sauron/hdfs/data
› Can differ from DataNode to DataNode
› Default is /tmp
› /tmp is not a good idea in a production system ;-)
3. dfs.name.dir – path on the local FS of the
NameNode where the NameNode metadata is
stored
› Format: pathname
› Example: /home/sauron/hdfs/name
› Only used by NameNode
› Default is /tmp
› /tmp is not a good idea in a production system ;-)
4. dfs.replication – default replication factor
› Default is 3
› Fewer than 3 will impact availability of data.
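To get a feel for why fewer replicas hurts availability, here is a back-of-the-envelope calculation. It rests on two assumptions that are not in the slides: each DataNode is unavailable independently of the others, and the per-node figure of 1% is purely illustrative.

```python
def block_unavailable(p_node_down, replicas):
    """Chance that every replica of a block is down at once,
    assuming node failures are independent."""
    return p_node_down ** replicas

P_DOWN = 0.01   # assumed per-node unavailability, for illustration only
table = {r: block_unavailable(P_DOWN, r) for r in (1, 2, 3)}
```

Under these assumptions each extra replica cuts the chance of losing access to a block by a factor of 100. Real failures are often correlated (a shared rack or switch, for example), so treat the numbers as a rough guide only.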
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://your.server.name.com:9000</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/username/hdfs/data</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/username/hdfs/name</value>
</property>
</configuration>
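One way to sanity-check a site file like the one above before copying it to every node is to parse it and confirm the necessary settings are present. This sketch uses Python's standard xml.etree.ElementTree module; the helper name and the inline copy of the file are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Inline copy of the example configuration, so the sketch is self-contained.
SITE_XML = """
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://your.server.name.com:9000</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/username/hdfs/data</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/username/hdfs/name</value>
  </property>
</configuration>
"""

REQUIRED = ("fs.default.name", "dfs.data.dir", "dfs.name.dir")

def load_properties(xml_text):
    """Turn <property> entries into a {name: value} dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {p.findtext("name"): p.findtext("value")
            for p in root.findall("property")}

props = load_properties(SITE_XML)
missing = [name for name in REQUIRED if name not in props]
```

An empty `missing` list means all three necessary settings were found.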
 The Master Node needs to know the names of the
DataNode machines
› Add hostnames to conf/slaves
› One fully-qualified hostname per line
› (NameNode runs on Master Node)
 Create necessary directories
› user@EachMachine$ mkdir -p $HOME/hdfs/data
› user@namenode$ mkdir -p $HOME/hdfs/name
› Note: owner needs read/write access to all directories
› Can run under your own name in a single machine cluster
› Do not run Hadoop as root. Duh!
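The directory and slaves-file steps above can also be scripted. The sketch below uses a scratch directory in place of a real machine, and the function name, the layout of the scratch directory, and the example.com hostnames are all hypothetical.

```python
import tempfile
from pathlib import Path

def prepare_node(home, conf_dir, is_namenode, datanode_hosts=()):
    """Create the HDFS directories and, on the master, the slaves file."""
    home, conf_dir = Path(home), Path(conf_dir)
    # Same effect as: mkdir -p $HOME/hdfs/data
    (home / "hdfs" / "data").mkdir(parents=True, exist_ok=True)
    if is_namenode:
        # Same effect as: mkdir -p $HOME/hdfs/name
        (home / "hdfs" / "name").mkdir(parents=True, exist_ok=True)
        # conf/slaves: one fully-qualified hostname per line.
        conf_dir.mkdir(parents=True, exist_ok=True)
        (conf_dir / "slaves").write_text(
            "".join(host + "\n" for host in datanode_hosts))

scratch = Path(tempfile.mkdtemp())   # stand-in for a real machine
prepare_node(scratch / "home", scratch / "hadoop" / "conf", True,
             ["dn1.example.com", "dn2.example.com"])
```

Run it as the ordinary user that will own the Hadoop processes, so the directories end up with the right owner.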
Proudly South Africa powerpoint Thorisha.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 

Hadoop training institute in hyderabad

  • 2. Hadoop manages:
      › processor time
      › memory
      › disk space
      › network bandwidth
    Does not have a security model
    Can handle HW failure
  • 3. Issues:
      › race conditions
      › synchronization
      › deadlock
    i.e., same issues as distributed OS & distributed filesystem
  • 4. Grid computing: (What is this?)
    e.g. Condor
      › MPI model is more complicated
      › does not automatically distribute data
      › requires separate managed SAN
  • 5. Hadoop:
      › simplified programming model
      › data distributed as it is loaded
    HDFS splits large data files across machines
    HDFS replicates data
      › failure causes additional replication
  • 7. Core idea: records are processed in isolation
    Benefit: reduced communication
    Jargon:
      › mapper – task that processes records
      › reducer – task that aggregates results from mappers
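The mapper/reducer jargon on slide 7 can be illustrated with a minimal in-memory sketch. This is not Hadoop code: the function names `mapper`, `reducer`, and `run_job` are hypothetical, and the word-count task is an assumed example chosen only because each record can be processed in isolation.

```python
from collections import defaultdict


def mapper(record):
    """Process one record in isolation: emit a (word, 1) pair per word."""
    return [(word.lower(), 1) for word in record.split()]


def reducer(key, values):
    """Aggregate all values that the mappers emitted for one key."""
    return key, sum(values)


def run_job(records):
    """Single-process stand-in for the framework: map, group by key, reduce."""
    grouped = defaultdict(list)
    for record in records:                  # each record is handled independently
        for key, value in mapper(record):
            grouped[key].append(value)      # grouping is the only "communication"
    return dict(reducer(key, values) for key, values in grouped.items())


counts = run_job(["the quick fox", "the lazy dog", "The end"])
```

Because `mapper` never looks at any record other than its own, the loop over records could be spread across machines without changing the result; only the grouping step needs data to move.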
  • 9. How is the previous picture different from normal grid/cluster computing?
    Grid/cluster: programmer manages communication via MPI
    vs.
    Hadoop: communication is implicit; Hadoop manages data transfer and cluster topology issues
  • 10. Hadoop overhead
      › MPI does better for small numbers of nodes
    Hadoop – flat scalability: pays off with large data
      › Little extra work to go from few to many nodes
    MPI – requires explicit refactoring from small to larger number of nodes
  • 11. NFS: the Network File System
      › Saw this in OS class
      › Supports file system exporting
      › Supports mounting of remote file system
  • 14. Establishes logical connection between server and client.
    Mount operation: name of remote directory & name of server
      › Mount request is mapped to the corresponding RPC and forwarded to the mount server running on the server machine.
      › Export list – specifies local file systems that the server exports for mounting, along with names of machines that are permitted to mount them.
  • 15. Server returns a file handle, a key for further accesses.
    File handle – a file-system identifier, and an inode number to identify the mounted directory
    The mount operation changes only the user’s view and does not affect the server side.
  • 16. NFS advantages
      › Transparency – clients unaware of local vs remote
      › Standard operations – open(), close(), fread(), etc.
    NFS disadvantages
      › Files in an NFS volume reside on a single machine
      › No reliability guarantees if that machine goes down
      › All clients must go to this machine to retrieve their data
  • 17. HDFS Advantages:
      › designed to store terabytes or petabytes
      › data spread across a large number of machines
      › supports much larger file sizes than NFS
      › stores data reliably (replication)
  • 18. HDFS Advantages:
      › provides fast, scalable access
      › serve more clients by adding more machines
      › integrates with MapReduce local computation
  • 19. HDFS Disadvantages
      › Not as general-purpose as NFS
      › Design restricts use to a particular class of applications
      › HDFS is optimized for streaming read performance; it is not good at random access
  • 20. HDFS Disadvantages
      › Write once, read many model
      › Updating a file after it has been closed is not supported (can’t append data)
      › System does not provide a mechanism for local caching of data
  • 21. HDFS – block-structured file system
    File broken into blocks distributed among DataNodes
    DataNodes – machines used to store data blocks
  • 22. Target machines chosen randomly on a block-by-block basis
    Supports file sizes far larger than a single-machine DFS
    Each block replicated across a number of machines (3, by default)
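Slides 21 and 22 can be sketched as a toy placement routine. It follows only the slides' simplified description (fixed-size blocks, random target machines, 3 replicas); the names `split_into_blocks` and `place_blocks` and the DataNode hostnames are hypothetical, and this is not how the HDFS source is organised.

```python
import random

BLOCK_SIZE = 64 * 1024 * 1024   # 64 MB, the default block size the slides quote
REPLICATION = 3                 # default replication factor from the slides


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs covering a file of file_size bytes."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def place_blocks(file_size, datanodes, replication=REPLICATION, rng=None):
    """Pick `replication` distinct DataNodes at random for every block."""
    rng = rng or random.Random()
    placement = {}
    for index, _block in enumerate(split_into_blocks(file_size)):
        placement[index] = rng.sample(datanodes, replication)
    return placement


# Hypothetical five-node cluster and a 200 MB file.
nodes = ["dn1", "dn2", "dn3", "dn4", "dn5"]
placement = place_blocks(200 * 1024 * 1024, nodes, rng=random.Random(0))
```

Since each block is placed independently, losing one machine removes at most one replica of any block, which is what lets a failure be repaired by additional replication.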
  • 24. Expects large file size
      › Small number of large files
      › Hundreds of MB to GB each
    Expects sequential access
    Default block size in HDFS is 64MB
    Result:
      › Reduces amount of metadata storage per file
      › Supports fast streaming of data (large amounts of contiguous data)
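The metadata claim on slide 24 is simple arithmetic: fewer, larger blocks mean fewer block entries per file. The sketch below compares the 64 MB block size from the slide with a 4 KB block size, which is an assumed figure for a conventional local file system and does not come from the slides.

```python
def block_count(file_size, block_size):
    """Number of blocks needed for a file (integer ceiling division)."""
    return -(-file_size // block_size)


ONE_GB = 1024 ** 3
hdfs_blocks = block_count(ONE_GB, 64 * 1024 ** 2)   # 64 MB blocks (slide 24)
small_blocks = block_count(ONE_GB, 4 * 1024)        # 4 KB blocks (assumed comparison)
ratio = small_blocks // hdfs_blocks                  # how many times more entries to track
```

For a 1 GB file this gives 16 block entries at 64 MB versus 262,144 at 4 KB, so the per-file metadata shrinks by a factor of 16,384.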
  • 25. HDFS expects to read a block start-to-finish
      › Useful for MapReduce
      › Not good for random access
      › Not a good general purpose file system
  • 26. HDFS files are NOT part of the ordinary file system
    HDFS files are in a separate name space
    Not possible to interact with files using ls, cp, mv, etc.
    Don’t worry: HDFS provides similar utilities
  • 27. Metadata handled by NameNode
      › Deal with synchronization by only allowing one machine to handle it
      › Store metadata for entire file system
      › Not much data: file names, permissions, & locations of each block of each file
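The metadata listed on slide 27 (file names, permissions, and the locations of each block of each file) fits in a small in-memory table. The sketch below is only an illustration of that shape; `FileEntry`, `NameNodeMetadata`, and the paths and hostnames are hypothetical and are not taken from Hadoop.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    permissions: str
    # One inner list per block, holding the DataNodes that store a replica.
    blocks: list = field(default_factory=list)


class NameNodeMetadata:
    """Toy metadata table owned by a single machine, as slide 27 describes."""

    def __init__(self):
        self._files = {}

    def add_file(self, path, permissions, block_locations):
        self._files[path] = FileEntry(permissions, list(block_locations))

    def locate(self, path, block_index):
        """Return the DataNodes holding the given block of a file."""
        return self._files[path].blocks[block_index]


meta = NameNodeMetadata()
meta.add_file("/logs/day1", "rw-r--r--", [["dn1", "dn2", "dn3"], ["dn2", "dn4", "dn5"]])
replicas = meta.locate("/logs/day1", 1)
```

Keeping the whole table on one machine avoids any need to synchronise updates between nodes, at the cost of making that machine a single point of failure.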
  • 29. What happens if the NameNode fails?
      › Bigger problem than failed DataNode
      › Better be using RAID ;-)
      › Cluster is kaput until NameNode restored
    Not exactly relevant but:
      › DataNodes are more likely to fail.
      › Why?
  • 30. First download and unzip a copy of Hadoop (http://hadoop.apache.org/releases.html)
    Or better yet, follow this lecture first ;-)
    Important links:
      › Hadoop website http://hadoop.apache.org/index.html
      › Hadoop Users Guide http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
      › 2012 Edition of Hadoop User’s Guide http://it-ebooks.info/book/635/
  • 31. HDFS configuration is in conf/hadoop-defaults.xml
      › Don’t change this file.
      › Instead modify conf/hadoop-site.xml
      › Be sure to replicate this file across all nodes in your cluster
      › Format of entries in this file:
          <property>
            <name>property-name</name>
            <value>property-value</value>
          </property>
  • 32. Necessary settings:
    1. fs.default.name – describes the NameNode
      › Format: protocol specifier, hostname, port
      › Example: hdfs://punchbowl.cse.sc.edu:9000
    2. dfs.data.dir – path on the local file system in which the DataNode instance should store its data
      › Format: pathname
      › Example: /home/sauron/hdfs/data
      › Can differ from DataNode to DataNode
      › Default is /tmp
      › /tmp is not a good idea in a production system ;-)
  • 33. 3. dfs.name.dir – path on the local FS of the NameNode where the NameNode metadata is stored
      › Format: pathname
      › Example: /home/sauron/hdfs/name
      › Only used by NameNode
      › Default is /tmp
      › /tmp is not a good idea in a production system ;-)
    4. dfs.replication – default replication factor
      › Default is 3
      › Fewer than 3 will impact availability of data.
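The four settings from slides 32 and 33 can be assembled into the `<property>` format from slide 31 with a short script. This is a sketch under assumptions: the values are the slides' own examples, the helper name `build_site_config` is hypothetical, and the enclosing `<configuration>` root element is how Hadoop configuration files are normally wrapped, although the slides do not show it.

```python
import xml.etree.ElementTree as ET

# Example values copied from slides 32 and 33; adjust for a real cluster.
SETTINGS = {
    "fs.default.name": "hdfs://punchbowl.cse.sc.edu:9000",
    "dfs.data.dir": "/home/sauron/hdfs/data",
    "dfs.name.dir": "/home/sauron/hdfs/name",
    "dfs.replication": "3",
}


def build_site_config(settings):
    """Return the text of a site configuration with one <property> per setting."""
    root = ET.Element("configuration")   # assumed root element, not shown on the slides
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = build_site_config(SETTINGS)
```

The resulting text would be saved as conf/hadoop-site.xml and copied to every node, as slide 31 advises.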
  • 35. The Master Node needs to know the names of the DataNode machines
      › Add hostnames to conf/slaves
      › One fully-qualified hostname per line
      › (NameNode runs on Master Node)
    Create necessary directories
      › user@EachMachine$ mkdir -p $HOME/hdfs/data
      › user@namenode$ mkdir -p $HOME/hdfs/name
      › Note: owner needs read/write access to all directories
      › Can run under your own name in a single machine cluster
      › Do not run Hadoop as root. Duh!
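The preparation steps on slide 35 can also be scripted. The sketch below mirrors the two `mkdir -p` commands and writes one hostname per line to a slaves file; the function name `prepare_node`, its parameters, and the hostnames are hypothetical, and the directories are passed in explicitly so the script can be tried in a scratch location first.

```python
from pathlib import Path


def prepare_node(home_dir, conf_dir=None, datanode_hosts=(), is_namenode=False):
    """Create the HDFS directories from slide 35; on the master, also write conf/slaves."""
    home = Path(home_dir)
    # Equivalent of: mkdir -p $HOME/hdfs/data (run on every machine)
    (home / "hdfs" / "data").mkdir(parents=True, exist_ok=True)
    if is_namenode:
        # Equivalent of: mkdir -p $HOME/hdfs/name (NameNode only)
        (home / "hdfs" / "name").mkdir(parents=True, exist_ok=True)
        if conf_dir is not None:
            conf = Path(conf_dir)
            conf.mkdir(parents=True, exist_ok=True)
            # One fully-qualified hostname per line.
            (conf / "slaves").write_text("\n".join(datanode_hosts) + "\n")
    return home / "hdfs"
```

A call such as `prepare_node(Path.home(), "conf", ["dn1.example.com", "dn2.example.com"], is_namenode=True)` would set up the master; the same function without `is_namenode` covers the other machines.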