SlideShare a Scribd company logo
1 of 16
Presented by : Nabin Nayak
Enrollment No : 01512002017
Contents
oWhat is Hadoop Technology ?
oDeveloper of Hadoop
oHadoop Features
oTwo main features of Hadoop
oGoals/Requirement
oHadoop Framework and Tools
oPros of Hadoop
oCons of Hadoop
What is Hadoop Technology ?
 The most well known technology
used for Big Data is Hadoop.
 Open source software framework
designed for storage and
processing of large scale data on
clusters of commodity hardware.
 The Apache Hadoop software
library is a framework that
allows for the distributed
processing of large data sets
across clusters of computers
using simple programming
models.
 It is a flexible and highly-
available architecture for
large scale computation and
data processing on a network
of commodity hardware.
 It is made by apache software
foundation in 2011.
 Written in JAVA.
Developer of Hadoop
Michael j. cafarella Doug cutting
 Doug Cutting
and Michael J.
Cafarella
developed Hadoop
to support
distribution for
the Nutch search
engine project.
 The project was
funded by Yahoo
Features of Hadoop
 Hadoop provides access to the file systems
 The Hadoop Common package contains the
 necessary JAR files and scripts
 The package also provides source code, documentation and a
contribution section that includes projects from the Hadoop
Community.
Problems Before Hadoop
1. Processing that large data is very difficult in relational
database.
2. It would take too much time to process data and cost.
We can solve this problem by Distributed
Computing.
• But the problems in distributed computing is –
Hardware failure
Chances of hardware failure is always there.
Combine the data after analysis
Data from all disks have to be combined from all the disks
which is a mess.
To Solve all the Problems Hadoop Came.
It has two main parts –
• Hadoop Distributed File System (HDFS),
• MapReduce
Two main features of Hadoop
1.Hadoop Distributed File
System
• It ties so many small and reasonable
priced machines together into a single
cost effective computer cluster.
• Data and application processing are
protected against hardware failure.
• If a node goes down, jobs are
automatically redirected to other
nodes to make sure the distributed
computing does not fail.
• it automatically stores multiple copies
of all data.
2. MapReduce
• MapReduce is a programming model for
processing and generating large data sets with
a parallel, distributed algorithm on a cluster.
• It is an associative implementation for
processing and generating large data sets.
• MAP function that process a key pair to
generates a set of intermediate key pairs.
• REDUCE function that merges all intermediate
values associated with the same intermediate
key.
Goals / requirement
 Abstract and facilitate the storage and processing of large and/or rapidly growing data
sets
• Structured and non-structured data
• Simple programming models
 High scalability and availability
 Use commodity (cheap!) hardware with little redundancy
 Fault-tolerance
 Move computation rather than data
Hadoop Framework and Tools
Pros of Hadoop
1. Computing power
2. Flexibility
3. Fault Tolerance
4. Low Cost
5. Scalability
Cons of Hadoop
 Integration with existing systems
Hadoop is not optimized for ease for use. Installing and
integrating with existing databases might prove to be
difficult, especially since there is no software support
provided.
 Administration and ease of use
Hadoop requires knowledge of MapReduce, while most data
practitioners use SQL. This means significant training may
be required to administer Hadoop clusters.
 Security
Hadoop lacks the level of security functionality needed for
safe enterprise deployment, especially if it concerns
sensitive data.
Benefits of Hadoop
• Cost Saving and efficient and reliable data processing
• Provides an economically scalable solution
• Storing and processing of large amount of data
• Data grid operating system
• It is deployed on industry standard servers rather than expensive
specialized data storage systems
Famous user of Hadoop
Hadoop

More Related Content

What's hot

Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Cloudera, Inc.
 
Impala use case @ Zoosk
Impala use case @ ZooskImpala use case @ Zoosk
Impala use case @ ZooskCloudera, Inc.
 
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & RedshiftIntroduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & RedshiftDataKitchen
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksDataWorks Summit
 
Daniel Abadi HadoopWorld 2010
Daniel Abadi HadoopWorld 2010Daniel Abadi HadoopWorld 2010
Daniel Abadi HadoopWorld 2010Daniel Abadi
 
Big Data Open Source Technologies
Big Data Open Source TechnologiesBig Data Open Source Technologies
Big Data Open Source Technologiesneeraj rathore
 
Building Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeBuilding Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeDatabricks
 
Spark in the Enterprise - 2 Years Later by Alan Saldich
Spark in the Enterprise - 2 Years Later by Alan SaldichSpark in the Enterprise - 2 Years Later by Alan Saldich
Spark in the Enterprise - 2 Years Later by Alan SaldichSpark Summit
 
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCombat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCloudera, Inc.
 
Building a Data Pipeline With Tools From the Hadoop Ecosystem - StampedeCon 2016
Building a Data Pipeline With Tools From the Hadoop Ecosystem - StampedeCon 2016Building a Data Pipeline With Tools From the Hadoop Ecosystem - StampedeCon 2016
Building a Data Pipeline With Tools From the Hadoop Ecosystem - StampedeCon 2016StampedeCon
 
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkSlim Baltagi
 
Harnessing the Power of Apache Hadoop
Harnessing the Power of Apache Hadoop Harnessing the Power of Apache Hadoop
Harnessing the Power of Apache Hadoop Cloudera, Inc.
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache SparkDona Mary Philip
 
Big data processing with apache spark
Big data processing with apache sparkBig data processing with apache spark
Big data processing with apache sparksarith divakar
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchCloudera, Inc.
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsDataWorks Summit
 

What's hot (20)

Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

 
Big data course
Big data  courseBig data  course
Big data course
 
Impala use case @ Zoosk
Impala use case @ ZooskImpala use case @ Zoosk
Impala use case @ Zoosk
 
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & RedshiftIntroduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & Redshift
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
 
Daniel Abadi HadoopWorld 2010
Daniel Abadi HadoopWorld 2010Daniel Abadi HadoopWorld 2010
Daniel Abadi HadoopWorld 2010
 
Big Data Open Source Technologies
Big Data Open Source TechnologiesBig Data Open Source Technologies
Big Data Open Source Technologies
 
Building Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeBuilding Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta Lake
 
Spark in the Enterprise - 2 Years Later by Alan Saldich
Spark in the Enterprise - 2 Years Later by Alan SaldichSpark in the Enterprise - 2 Years Later by Alan Saldich
Spark in the Enterprise - 2 Years Later by Alan Saldich
 
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCombat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
 
Resume
ResumeResume
Resume
 
Building a Data Pipeline With Tools From the Hadoop Ecosystem - StampedeCon 2016
Building a Data Pipeline With Tools From the Hadoop Ecosystem - StampedeCon 2016Building a Data Pipeline With Tools From the Hadoop Ecosystem - StampedeCon 2016
Building a Data Pipeline With Tools From the Hadoop Ecosystem - StampedeCon 2016
 
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to Spark
 
Harnessing the Power of Apache Hadoop
Harnessing the Power of Apache Hadoop Harnessing the Power of Apache Hadoop
Harnessing the Power of Apache Hadoop
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache Spark
 
Hadoop Everywhere
Hadoop EverywhereHadoop Everywhere
Hadoop Everywhere
 
Big data processing with apache spark
Big data processing with apache sparkBig data processing with apache spark
Big data processing with apache spark
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science Workbench
 
Spark mhug2
Spark mhug2Spark mhug2
Spark mhug2
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the experts
 

Similar to Hadoop

M. Florence Dayana - Hadoop Foundation for Analytics.pptx
M. Florence Dayana - Hadoop Foundation for Analytics.pptxM. Florence Dayana - Hadoop Foundation for Analytics.pptx
M. Florence Dayana - Hadoop Foundation for Analytics.pptxDr.Florence Dayana
 
Introduction to Apache hadoop
Introduction to Apache hadoopIntroduction to Apache hadoop
Introduction to Apache hadoopOmar Jaber
 
Big Data Hadoop Technology
Big Data Hadoop TechnologyBig Data Hadoop Technology
Big Data Hadoop TechnologyRahul Sharma
 
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015Abdul Nasir
 
Overview of big data & hadoop v1
Overview of big data & hadoop   v1Overview of big data & hadoop   v1
Overview of big data & hadoop v1Thanh Nguyen
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHitendra Kumar
 
Big Data: An Overview
Big Data: An OverviewBig Data: An Overview
Big Data: An OverviewC. Scyphers
 
Hadoop introduction , Why and What is Hadoop ?
Hadoop introduction , Why and What is  Hadoop ?Hadoop introduction , Why and What is  Hadoop ?
Hadoop introduction , Why and What is Hadoop ?sudhakara st
 

Similar to Hadoop (20)

PPT on Hadoop
PPT on HadoopPPT on Hadoop
PPT on Hadoop
 
M. Florence Dayana - Hadoop Foundation for Analytics.pptx
M. Florence Dayana - Hadoop Foundation for Analytics.pptxM. Florence Dayana - Hadoop Foundation for Analytics.pptx
M. Florence Dayana - Hadoop Foundation for Analytics.pptx
 
Introduction to Apache hadoop
Introduction to Apache hadoopIntroduction to Apache hadoop
Introduction to Apache hadoop
 
Hadoop info
Hadoop infoHadoop info
Hadoop info
 
Anju
AnjuAnju
Anju
 
Big Data Hadoop Technology
Big Data Hadoop TechnologyBig Data Hadoop Technology
Big Data Hadoop Technology
 
Hadoop seminar
Hadoop seminarHadoop seminar
Hadoop seminar
 
Unit IV.pdf
Unit IV.pdfUnit IV.pdf
Unit IV.pdf
 
Hadoop
HadoopHadoop
Hadoop
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Hadoop in action
Hadoop in actionHadoop in action
Hadoop in action
 
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
 
Overview of big data & hadoop v1
Overview of big data & hadoop   v1Overview of big data & hadoop   v1
Overview of big data & hadoop v1
 
Big data Analytics Hadoop
Big data Analytics HadoopBig data Analytics Hadoop
Big data Analytics Hadoop
 
Hadoop jon
Hadoop jonHadoop jon
Hadoop jon
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
 
Talend for big_data_intorduction
Talend for big_data_intorductionTalend for big_data_intorduction
Talend for big_data_intorduction
 
Big Data: An Overview
Big Data: An OverviewBig Data: An Overview
Big Data: An Overview
 
Hadoop introduction , Why and What is Hadoop ?
Hadoop introduction , Why and What is  Hadoop ?Hadoop introduction , Why and What is  Hadoop ?
Hadoop introduction , Why and What is Hadoop ?
 
Big data
Big dataBig data
Big data
 

Recently uploaded

Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Romantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxRomantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxsqpmdrvczh
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxChelloAnnAsuncion2
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 

Recently uploaded (20)

Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Romantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxRomantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptx
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 

Hadoop

  • 1. Presented by : Nabin Nayak Enrollment No : 01512002017
  • 2. Contents oWhat is Hadoop Technology ? oDeveloper of Hadoop oHadoop Features oTwo main features of Hadoop oGoals/Requirement oHadoop Framework and Tools oPros of Hadoop oCons of Hadoop
  • 3. What is Hadoop Technology ?  The most well known technology used for Big Data is Hadoop.  Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware.  The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.  It is a flexible and highly- available architecture for large scale computation and data processing on a network of commodity hardware.  It is made by apache software foundation in 2011.  Written in JAVA.
  • 4. Developer of Hadoop Michael j. cafarella Doug cutting  Doug Cutting and Michael J. Cafarella developed Hadoop to support distribution for the Nutch search engine project.  The project was funded by Yahoo
  • 5. Features of Hadoop  Hadoop provides access to the file systems  The Hadoop Common package contains the  necessary JAR files and scripts  The package also provides source code, documentation and a contribution section that includes projects from the Hadoop Community.
  • 6. Problems Before Hadoop 1. Processing that large data is very difficult in relational database. 2. It would take too much time to process data and cost.
  • 7. We can solve this problem by Distributed Computing. • But the problems in distributed computing is – Hardware failure Chances of hardware failure is always there. Combine the data after analysis Data from all disks have to be combined from all the disks which is a mess.
  • 8. To Solve all the Problems Hadoop Came. It has two main parts – • Hadoop Distributed File System (HDFS), • MapReduce
  • 9. Two main features of Hadoop 1.Hadoop Distributed File System • It ties so many small and reasonable priced machines together into a single cost effective computer cluster. • Data and application processing are protected against hardware failure. • If a node goes down, jobs are automatically redirected to other nodes to make sure the distributed computing does not fail. • it automatically stores multiple copies of all data. 2. MapReduce • MapReduce is a programming model for processing and generating large data sets with a parallel, distributed algorithm on a cluster. • It is an associative implementation for processing and generating large data sets. • MAP function that process a key pair to generates a set of intermediate key pairs. • REDUCE function that merges all intermediate values associated with the same intermediate key.
  • 10. Goals / requirement  Abstract and facilitate the storage and processing of large and/or rapidly growing data sets • Structured and non-structured data • Simple programming models  High scalability and availability  Use commodity (cheap!) hardware with little redundancy  Fault-tolerance  Move computation rather than data
  • 12. Pros of Hadoop 1. Computing power 2. Flexibility 3. Fault Tolerance 4. Low Cost 5. Scalability
  • 13. Cons of Hadoop  Integration with existing systems Hadoop is not optimized for ease for use. Installing and integrating with existing databases might prove to be difficult, especially since there is no software support provided.  Administration and ease of use Hadoop requires knowledge of MapReduce, while most data practitioners use SQL. This means significant training may be required to administer Hadoop clusters.  Security Hadoop lacks the level of security functionality needed for safe enterprise deployment, especially if it concerns sensitive data.
  • 14. Benefits of Hadoop • Cost Saving and efficient and reliable data processing • Provides an economically scalable solution • Storing and processing of large amount of data • Data grid operating system • It is deployed on industry standard servers rather than expensive specialized data storage systems
  • 15. Famous user of Hadoop

Editor's Notes

  1. Notes to presenter: Description of what you learned in your own words on one side. Include information about the topic Details about the topic will also be helpful here. Tell the story of your learning experience. Just like a story there should always be a beginning, middle and an end. On the other side, you can add a graphic that provides evidence of what you learned. Feel free to use more than one slide to reflect upon your process. It also helps to add some video of your process.