What’s in it for you?
We will compare Hadoop and Spark based on the following categories:
1. Performance
2. Cost
3. Fault tolerance
4. Data Processing
5. Ease of Use
6. Scalability
7. Security
8. Machine Learning
9. Language Support
Comparison based on the criteria below:
Performance
Cost
Fault tolerance
Data Processing
Ease of Use
Scalability
Security
Machine Learning
Language Support
Scheduler
Performance
Hadoop (no real-time analytics): Hadoop MapReduce is generally slow because it performs its operations on disk, so it cannot deliver near real-time analytics from the data.
Spark (faster in-memory processing): Spark is reported to run up to 100 times faster than Hadoop MapReduce in memory and up to 10 times faster on disk. If Spark runs on YARN alongside other resource-demanding services, its performance can degrade significantly.
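The difference between disk-based and in-memory processing can be illustrated with a small sketch. This is plain Python, not Hadoop or Spark code: the function names and the JSON temp file are assumptions made purely for illustration. The first function round-trips every intermediate result through disk, as a chain of MapReduce jobs would; the second keeps the working set in memory between steps.

```python
import json
import os
import tempfile


def run_with_disk_round_trips(values, iterations):
    """MapReduce-style: persist intermediate results to disk after every step."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "intermediate.json")  # hypothetical file name
        with open(path, "w") as f:
            json.dump(values, f)
        for _ in range(iterations):
            with open(path) as f:           # read the previous step's output
                data = json.load(f)
            data = [x + 1 for x in data]    # the actual work of this step
            with open(path, "w") as f:      # write results back for the next step
                json.dump(data, f)
        with open(path) as f:
            return json.load(f)


def run_in_memory(values, iterations):
    """Spark-style: keep the working set in memory between steps."""
    data = list(values)
    for _ in range(iterations):
        data = [x + 1 for x in data]
    return data


disk_result = run_with_disk_round_trips([1, 2, 3], iterations=5)
memory_result = run_in_memory([1, 2, 3], iterations=5)
print(disk_result == memory_result)
```

Both functions compute the same result; the disk version simply pays for a read and a write on every iteration, which is the overhead the slide refers to.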
Cost
Hadoop: Hadoop is open-source software and is the less expensive option to run. It relies mainly on disk storage, which is a relatively inexpensive commodity.
Spark: Spark is also open-source, but it requires a lot of RAM to run in memory. This increases the cluster size and its cost.
Fault Tolerance
Hadoop: Hadoop is highly fault-tolerant because it was designed to replicate data across many nodes. Each file is split into blocks, and each block is replicated on several machines (three copies by default in HDFS).
Spark: Spark uses Resilient Distributed Datasets (RDDs), which are fault-tolerant collections of elements that can be operated on in parallel. A lost partition can be recomputed from the chain of transformations that produced it.
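A minimal sketch of the block-and-replica idea follows. It is a plain Python simulation, not the HDFS API: the node names, the 8-byte block size, and the round-robin placement are assumptions chosen to keep the example small (real HDFS placement is rack-aware and uses much larger blocks).

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut a file's bytes into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def read_file(blocks, placement, live_nodes):
    """Reassemble the file, failing only if every replica of a block is gone."""
    parts = []
    for block_id, block in enumerate(blocks):
        if not any(node in live_nodes for node in placement[block_id]):
            raise IOError(f"block {block_id} has no surviving replica")
        parts.append(block)
    return b"".join(parts)


nodes = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical cluster
data = b"hadoop splits files into blocks"
blocks = split_into_blocks(data, block_size=8)
placement = place_replicas(len(blocks), nodes)

# Simulate two machines failing at once; every block still has a live replica.
live_nodes = set(nodes) - {"node1", "node2"}
recovered = read_file(blocks, placement, live_nodes)
print(recovered == data)
```

With three replicas per block, any two simultaneous node failures in this toy cluster leave at least one copy of every block readable.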
Data Processing
Hadoop (batches of input data in, output data out): Hadoop processes data in batches. MapReduce operates in sequential steps: it reads data from the cluster, performs its operations on the data, and writes the results back to the cluster.
Spark (batch, real-time, and graph): Spark performs batch, real-time, and graph processing of data. It reads data from the cluster, performs its operations on the data, and then writes the results back to the cluster.
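The sequential steps of a MapReduce job can be sketched with the classic word count. This is a single-process Python illustration of the map, shuffle, and reduce phases, not the Hadoop API; the function names and the two input lines are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input batch."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


input_batch = [
    "Hadoop processes data in batches",
    "Spark processes data in memory",
]
word_counts = reduce_phase(shuffle_phase(map_phase(input_batch)))
print(word_counts)
```

In a real cluster each phase runs in parallel across many nodes and the output of the reduce step is written back to distributed storage; here all three phases run one after another in one process.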
Ease of Use
Hadoop: Hadoop’s MapReduce has no interactive mode and is complex. Developers have to work with low-level APIs to process the data, which requires a lot of code.
Spark: Spark supports user-friendly APIs for different languages. It has an interactive mode and provides intermediate feedback for queries and actions.
Language Support
Hadoop: The Hadoop framework is developed in Java. MapReduce applications can also be written in other languages such as Python, R, and C++.
Spark: Apache Spark is developed in Scala and supports other programming languages such as Python, R, and Java.
Scalability
Hadoop: Hadoop is highly scalable because any number of nodes can be added to the cluster. Yahoo reportedly used a 42,000-node Hadoop cluster.
Spark: The largest known Spark cluster has 8,000 nodes. As big data grows, cluster sizes are expected to increase to maintain throughput expectations.
Security
Hadoop: Hadoop supports Kerberos and LDAP for authentication. It also supports access control lists (ACLs) and a traditional file permissions model.
Spark: Spark’s own security is a bit sparse: on its own it only supports authentication via a shared secret (password). If you run Spark on HDFS, it can use HDFS ACLs and file-level permissions. Additionally, Spark can run on YARN, which lets it use Kerberos authentication.
Machine Learning
Hadoop: Hadoop uses Mahout for processing data and building models. Samsara, a Scala-backed DSL, can also be used for in-memory algebraic operations and lets users write their own algorithms.
Spark: Spark has a built-in machine learning library, MLlib, that can be used for tasks such as classification and regression. It can also build machine learning pipelines with hyperparameter tuning.
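To make "hyperparameter tuning" concrete, here is a conceptual sketch of a grid search with k-fold cross-validation. It is plain Python and does not use MLlib (where classes such as `CrossValidator` and `ParamGridBuilder` serve this purpose); the tiny dataset, the one-parameter ridge model, and all function names are assumptions for illustration only.

```python
def fit_ridge_slope(xs, ys, lam):
    """Closed-form slope for y ~ w * x with L2 penalty `lam` (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def cross_val_mse(xs, ys, lam, k=3):
    """Mean squared error on held-out points, averaged over k folds."""
    total, count = 0.0, 0
    points = list(zip(xs, ys))
    for fold in range(k):
        train = [p for i, p in enumerate(points) if i % k != fold]
        held_out = [p for i, p in enumerate(points) if i % k == fold]
        w = fit_ridge_slope([x for x, _ in train], [y for _, y in train], lam)
        total += sum((y - w * x) ** 2 for x, y in held_out)
        count += len(held_out)
    return total / count


def grid_search(xs, ys, grid, k=3):
    """Score every candidate value and return the one with the lowest error."""
    scores = {lam: cross_val_mse(xs, ys, lam, k) for lam in grid}
    best = min(scores, key=scores.get)
    return best, scores


# Made-up data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
best_lambda, scores = grid_search(xs, ys, grid=[0.0, 1.0, 10.0, 100.0])
print(best_lambda, scores[best_lambda])
```

The same pattern (try each candidate setting, score it on held-out data, keep the best) is what a tuning stage in a machine learning pipeline automates across a cluster.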
Scheduler
Hadoop: Hadoop MapReduce is dependent on an external scheduler.
Spark: Apache Spark has its own scheduler.
Hadoop vs Spark | Hadoop And Spark Difference | Hadoop And Spark Training | Simplilearn
