SlideShare a Scribd company logo
www.company.com
PRESENTED BY :
SHWETA PATNAIK-120101CSR014
Apache Hadoop
Technology
www.company.com
Content :
• Introduction to Hadoop
• Hadoop architecture
• What is Apache Hadoop
• Data flow
• MapReduce
• HDFS
• YARN Framework
• Who uses Hadoop
• Hadoop in enterprises
• Advantage
• Conclusion
www.company.com
What is Hadoop :
• Hadoop is a free, Java-based programming framework
that supports the processing of large data sets in a
distributed computing environment. It is part of
the Apache project sponsored by the Apache Software
Foundation.
• At its core, Hadoop has two major layers namely:
– (a) Processing/Computation layer (MapReduce), and
– (b) Storage layer (Hadoop Distributed File System).
www.company.com
Hadoop Architecture :
www.company.com
What is Apache Hadoop :
• The Apache Hadoop software library is a framework that
allows for the distributed processing of large data sets
across clusters of computers using simple programming
models.
• It is designed to scale up from single servers to thousands
of machines, each offering local computation and storage..
www.company.com
Data flow :
Web Servers Scribe Servers
Network
Storage
Hadoop ClusterOracle RAC MySQL
www.company.com
MapReduce :
• Hadoop MapReduce is a software framework for easily
writing applications which process vast amounts of data
(multi-terabyte data-sets) in-parallel on large clusters
(thousands of nodes) of commodity hardware in a reliable,
fault-tolerant manner.
• A MapReduce job usually splits the input data-set into
independent chunks which are processed by the map
tasks in a completely parallel manner. The framework sorts
the outputs of the maps, which are then input to the reduce
tasks.
www.company.com
Cont..
• Job – A “full program” - an execution of a Mapper
and Reducer across a data set
• Task – An execution of a Mapper or a Reducer
on a slice of data
• a.k.a. Task-In-Progress (TIP)
• Task Attempt – A particular instance of an
attempt to execute a task on a machine
www.company.com
MapReduce High level :
JobTracker
MapReduce job
submitted by
client computer
Master node
TaskTracker
Slave node
Task instance
TaskTracker
Slave node
Task instance
TaskTracker
Slave node
Task instance
www.company.com
HDFS :
• A file system, that stores data in a very efficient
manner, which can be used easily. A distributed file
system that provides high throughput access to
application.
• Features :
– It is suitable for the distributed storage and processing.
– Hadoop provides a command interface to interact with HDFS.
– The built-in servers of namenode and datanode help users to
easily check the status of cluster.
– Streaming access to file system data.
– HDFS provides file permissions and authentication.
www.company.com
Architecture :
www.company.com
YARN Framework :
• Apache Hadoop YARN (Yet Another Resource Negotiator) is a
cluster management technology.
• YARN is the foundation of the new generation of Hadoop and is
enabling organizations everywhere to realize a modern data
architecture.
• It provides resource management and a central platform to
deliver consistent operations, security, and data governance tools
across Hadoop clusters.
• It provides, a consistent framework for writing data access
applications that run IN Hadoop, to the developers.
www.company.com
Cont. :
• Some features are :
– Multi Tangency
– Cluster Utilization
– Scalability
– Compatibility
www.company.com
Architecture :
www.company.com
Who Uses Hadoop :
• Amazon/A9
• Facebook
• Google
• IBM
• Joost
• Last.fm
• New York Times
• PowerSet
• Veoh
• Yahoo!
www.company.com
www.company.com
Hadoop in the Enterprise
• Accelerate nightly batch business processes
• Storage of extremely high volumes of data
• Creation of automatic, redundant backups
• Improving the scalability of applications
• Use of Java for data processing instead of SQL
• Producing JIT feeds for dashboards and BI
• Handling urgent, ad hoc request for data
• Turning unstructured data into relational data
• Taking on tasks that require massive parallelism
• Moving existing algorithms, code, frameworks, and
components to a highly distributed computing
environment
www.company.com
Advantage :
• Hadoop framework allows the user to quickly write and
test distributed systems. It is efficient, and it automatic
distributes the data and work across the machines and in
turn, utilizes the underlying parallelism of the CPU cores.
• Hadoop does not rely on hardware to provide fault-
tolerance and high availability (FTHA), rather Hadoop
library itself has been designed to detect and handle
failures at the application layer.
www.company.com
• Servers can be added or removed from the cluster
dynamically and Hadoop continues to operate without
interruption.
• Another big advantage of Hadoop is that apart from
being open source, it is compatible on all the
platforms since it is Java based.
www.company.com
Conclusion :
• Apache Hadoop is a fast-growing data framework
• Apache Hadoop offers a free, cohesive platform that
encapsulates:
• – Data integration
• – Data processing
• – Workflow scheduling
• – Monitoring
www.company.com
THANK
YOU

More Related Content

What's hot

Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
Flavio Vit
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation HadoopVarun Narang
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop Ecosystem
Sandip Darwade
 
Hadoop Technology
Hadoop TechnologyHadoop Technology
Hadoop Technology
Atul Kushwaha
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
sravya raju
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
joelcrabb
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
Dataflair Web Services Pvt Ltd
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
Stanley Wang
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
EMC
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
rebeccatho
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
Prashant Gupta
 
Hadoop And Their Ecosystem ppt
 Hadoop And Their Ecosystem ppt Hadoop And Their Ecosystem ppt
Hadoop And Their Ecosystem ppt
sunera pathan
 
Hadoop introduction , Why and What is Hadoop ?
Hadoop introduction , Why and What is  Hadoop ?Hadoop introduction , Why and What is  Hadoop ?
Hadoop introduction , Why and What is Hadoop ?
sudhakara st
 
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive ArchitectureHadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Skillspeed
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
Prashant Gupta
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
Dr. C.V. Suresh Babu
 
Hadoop Architecture
Hadoop ArchitectureHadoop Architecture
Hadoop Architecture
Dr. C.V. Suresh Babu
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
alexbaranau
 
Introduction to HiveQL
Introduction to HiveQLIntroduction to HiveQL
Introduction to HiveQL
kristinferrier
 
Unit-3_BDA.ppt
Unit-3_BDA.pptUnit-3_BDA.ppt
Unit-3_BDA.ppt
PoojaShah174393
 

What's hot (20)

Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation Hadoop
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop Ecosystem
 
Hadoop Technology
Hadoop TechnologyHadoop Technology
Hadoop Technology
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
 
Hadoop And Their Ecosystem ppt
 Hadoop And Their Ecosystem ppt Hadoop And Their Ecosystem ppt
Hadoop And Their Ecosystem ppt
 
Hadoop introduction , Why and What is Hadoop ?
Hadoop introduction , Why and What is  Hadoop ?Hadoop introduction , Why and What is  Hadoop ?
Hadoop introduction , Why and What is Hadoop ?
 
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive ArchitectureHadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Hadoop Architecture
Hadoop ArchitectureHadoop Architecture
Hadoop Architecture
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
Introduction to HiveQL
Introduction to HiveQLIntroduction to HiveQL
Introduction to HiveQL
 
Unit-3_BDA.ppt
Unit-3_BDA.pptUnit-3_BDA.ppt
Unit-3_BDA.ppt
 

Viewers also liked

Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
Shweta Patnaik
 
Tdg 03 sio1 b cipolato joris
Tdg 03   sio1 b cipolato jorisTdg 03   sio1 b cipolato joris
Tdg 03 sio1 b cipolato joris
Joris Cipolato
 
Clase exitosa
Clase exitosaClase exitosa
Clase exitosa
Gloria Acosta Gálvez
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
Shweta Patnaik
 
Midia Kit - Publicitários Criativos
Midia Kit - Publicitários CriativosMidia Kit - Publicitários Criativos
Midia Kit - Publicitários Criativos
Fillipe Luis
 
Powerpoint trade show
Powerpoint trade showPowerpoint trade show
Powerpoint trade show
frostymayfly
 
Nova Identidade Visual Oi
Nova Identidade Visual OiNova Identidade Visual Oi
Nova Identidade Visual Oi
Fillipe Luis
 
Computer security Description about SQL-Injection and SYN attacks
Computer security Description about SQL-Injection and SYN attacksComputer security Description about SQL-Injection and SYN attacks
Computer security Description about SQL-Injection and SYN attacks
Tesfahunegn Minwuyelet
 
FACEBOOK ADDICTION AMONG FEMALE UNIVERSITY STUDENTS
FACEBOOK ADDICTION AMONG FEMALE UNIVERSITY STUDENTSFACEBOOK ADDICTION AMONG FEMALE UNIVERSITY STUDENTS
FACEBOOK ADDICTION AMONG FEMALE UNIVERSITY STUDENTS
Tesfahunegn Minwuyelet
 
Fluid Mechanic and Hyraulics
Fluid Mechanic and Hyraulics Fluid Mechanic and Hyraulics
Fluid Mechanic and Hyraulics
Saroz Shrestha
 

Viewers also liked (11)

Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
 
Tdg 03 sio1 b cipolato joris
Tdg 03   sio1 b cipolato jorisTdg 03   sio1 b cipolato joris
Tdg 03 sio1 b cipolato joris
 
Company Profile
Company ProfileCompany Profile
Company Profile
 
Clase exitosa
Clase exitosaClase exitosa
Clase exitosa
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
 
Midia Kit - Publicitários Criativos
Midia Kit - Publicitários CriativosMidia Kit - Publicitários Criativos
Midia Kit - Publicitários Criativos
 
Powerpoint trade show
Powerpoint trade showPowerpoint trade show
Powerpoint trade show
 
Nova Identidade Visual Oi
Nova Identidade Visual OiNova Identidade Visual Oi
Nova Identidade Visual Oi
 
Computer security Description about SQL-Injection and SYN attacks
Computer security Description about SQL-Injection and SYN attacksComputer security Description about SQL-Injection and SYN attacks
Computer security Description about SQL-Injection and SYN attacks
 
FACEBOOK ADDICTION AMONG FEMALE UNIVERSITY STUDENTS
FACEBOOK ADDICTION AMONG FEMALE UNIVERSITY STUDENTSFACEBOOK ADDICTION AMONG FEMALE UNIVERSITY STUDENTS
FACEBOOK ADDICTION AMONG FEMALE UNIVERSITY STUDENTS
 
Fluid Mechanic and Hyraulics
Fluid Mechanic and Hyraulics Fluid Mechanic and Hyraulics
Fluid Mechanic and Hyraulics
 

Similar to Apache hadoop technology : Beginners

List of Engineering Colleges in Uttarakhand
List of Engineering Colleges in UttarakhandList of Engineering Colleges in Uttarakhand
List of Engineering Colleges in Uttarakhand
Roorkee College of Engineering, Roorkee
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
arslanhaneef
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
sonukumar379092
 
Getting started big data
Getting started big dataGetting started big data
Getting started big data
Kibrom Gebrehiwot
 
Introduction To Hadoop Ecosystem
Introduction To Hadoop EcosystemIntroduction To Hadoop Ecosystem
Introduction To Hadoop Ecosystem
InSemble
 
project--2 nd review_2
project--2 nd review_2project--2 nd review_2
project--2 nd review_2aswini pilli
 
project--2 nd review_2
project--2 nd review_2project--2 nd review_2
project--2 nd review_2Aswini Ashu
 
Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
chariorienit
 
Big data - Online Training
Big data - Online TrainingBig data - Online Training
Big data - Online Training
Learntek1
 
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
cdmaxime
 
Hadoop
HadoopHadoop
Hadoop
chandinisanz
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform Concept
Satish Mohan
 
02 Hadoop.pptx HADOOP VENNELA DONTHIREDDY
02 Hadoop.pptx HADOOP VENNELA DONTHIREDDY02 Hadoop.pptx HADOOP VENNELA DONTHIREDDY
02 Hadoop.pptx HADOOP VENNELA DONTHIREDDY
Venneladonthireddy1
 
Hadoop
HadoopHadoop
Hadoop
Oded Rotter
 
Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016
StampedeCon
 
Hadoop Primer
Hadoop PrimerHadoop Primer
Hadoop Primer
Steve Staso
 
Azure Cafe Marketplace with Hortonworks March 31 2016
Azure Cafe Marketplace with Hortonworks March 31 2016Azure Cafe Marketplace with Hortonworks March 31 2016
Azure Cafe Marketplace with Hortonworks March 31 2016
Joan Novino
 

Similar to Apache hadoop technology : Beginners (20)

List of Engineering Colleges in Uttarakhand
List of Engineering Colleges in UttarakhandList of Engineering Colleges in Uttarakhand
List of Engineering Colleges in Uttarakhand
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
 
Getting started big data
Getting started big dataGetting started big data
Getting started big data
 
Introduction To Hadoop Ecosystem
Introduction To Hadoop EcosystemIntroduction To Hadoop Ecosystem
Introduction To Hadoop Ecosystem
 
project--2 nd review_2
project--2 nd review_2project--2 nd review_2
project--2 nd review_2
 
project--2 nd review_2
project--2 nd review_2project--2 nd review_2
project--2 nd review_2
 
Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
 
Big data - Online Training
Big data - Online TrainingBig data - Online Training
Big data - Online Training
 
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
 
Hadoop and Distributed Computing
Hadoop and Distributed ComputingHadoop and Distributed Computing
Hadoop and Distributed Computing
 
Hadoop
HadoopHadoop
Hadoop
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform Concept
 
02 Hadoop.pptx HADOOP VENNELA DONTHIREDDY
02 Hadoop.pptx HADOOP VENNELA DONTHIREDDY02 Hadoop.pptx HADOOP VENNELA DONTHIREDDY
02 Hadoop.pptx HADOOP VENNELA DONTHIREDDY
 
Hortonworks.bdb
Hortonworks.bdbHortonworks.bdb
Hortonworks.bdb
 
Hadoop
HadoopHadoop
Hadoop
 
Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016
 
Hadoop Primer
Hadoop PrimerHadoop Primer
Hadoop Primer
 
Anju
AnjuAnju
Anju
 
Azure Cafe Marketplace with Hortonworks March 31 2016
Azure Cafe Marketplace with Hortonworks March 31 2016Azure Cafe Marketplace with Hortonworks March 31 2016
Azure Cafe Marketplace with Hortonworks March 31 2016
 

Recently uploaded

How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
g2nightmarescribd
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 

Recently uploaded (20)

How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 

Apache hadoop technology : Beginners

  • 1. www.company.com PRESENTED BY : SHWETA PATNAIK-120101CSR014 Apache Hadoop Technology
  • 2. www.company.com Content : • Introduction to Hadoop • Hadoop architecture • What is Apache Hadoop • Data flow • MapReduce • HDFS • YARN Framework • Who uses Hadoop • Hadoop in enterprises • Advantage • Conclusion
  • 3. www.company.com What is Hadoop : • Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation. • At its core, Hadoop has two major layers namely: – (a) Processing/Computation layer (MapReduce), and – (b) Storage layer (Hadoop Distributed File System).
  • 5. www.company.com What is Apache Hadoop : • The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. • It is designed to scale up from single servers to thousands of machines, each offering local computation and storage..
  • 6. www.company.com Data flow : Web Servers Scribe Servers Network Storage Hadoop ClusterOracle RAC MySQL
  • 7. www.company.com MapReduce : • Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner. • A MapReduce job usually splits the input data-set into independent chunks which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks.
  • 8. www.company.com Cont.. • Job – A “full program” - an execution of a Mapper and Reducer across a data set • Task – An execution of a Mapper or a Reducer on a slice of data • a.k.a. Task-In-Progress (TIP) • Task Attempt – A particular instance of an attempt to execute a task on a machine
  • 9. www.company.com MapReduce High level : JobTracker MapReduce job submitted by client computer Master node TaskTracker Slave node Task instance TaskTracker Slave node Task instance TaskTracker Slave node Task instance
  • 10. www.company.com HDFS : • A file system, that stores data in a very efficient manner, which can be used easily. A distributed file system that provides high throughput access to application. • Features : – It is suitable for the distributed storage and processing. – Hadoop provides a command interface to interact with HDFS. – The built-in servers of namenode and datanode help users to easily check the status of cluster. – Streaming access to file system data. – HDFS provides file permissions and authentication.
  • 12. www.company.com YARN Framework : • Apache Hadoop YARN (Yet Another Resource Negotiator) is a cluster management technology. • YARN is the foundation of the new generation of Hadoop and is enabling organizations everywhere to realize a modern data architecture. • It provides resource management and a central platform to deliver consistent operations, security, and data governance tools across Hadoop clusters. • It provides, a consistent framework for writing data access applications that run IN Hadoop, to the developers.
  • 13. www.company.com Cont. : • Some features are : – Multi Tangency – Cluster Utilization – Scalability – Compatibility
  • 15. www.company.com Who Uses Hadoop : • Amazon/A9 • Facebook • Google • IBM • Joost • Last.fm • New York Times • PowerSet • Veoh • Yahoo!
  • 17. www.company.com Hadoop in the Enterprise • Accelerate nightly batch business processes • Storage of extremely high volumes of data • Creation of automatic, redundant backups • Improving the scalability of applications • Use of Java for data processing instead of SQL • Producing JIT feeds for dashboards and BI • Handling urgent, ad hoc request for data • Turning unstructured data into relational data • Taking on tasks that require massive parallelism • Moving existing algorithms, code, frameworks, and components to a highly distributed computing environment
  • 18. www.company.com Advantage : • Hadoop framework allows the user to quickly write and test distributed systems. It is efficient, and it automatic distributes the data and work across the machines and in turn, utilizes the underlying parallelism of the CPU cores. • Hadoop does not rely on hardware to provide fault- tolerance and high availability (FTHA), rather Hadoop library itself has been designed to detect and handle failures at the application layer.
  • 19. www.company.com • Servers can be added or removed from the cluster dynamically and Hadoop continues to operate without interruption. • Another big advantage of Hadoop is that apart from being open source, it is compatible on all the platforms since it is Java based.
  • 20. www.company.com Conclusion : • Apache Hadoop is a fast-growing data framework • Apache Hadoop offers a free, cohesive platform that encapsulates: • – Data integration • – Data processing • – Workflow scheduling • – Monitoring